Indexes and MongoDB

Release 2.6.0
MongoDB Documentation Project
April 15, 2014
Contents
1 Index Introduction 3
    1.1 Index Types 5
        Default _id 5
        Single Field 5
        Compound Index 5
        Multikey Index 5
        Geospatial Index 7
        Text Indexes 7
        Hashed Indexes 7
    1.2 Index Properties 7
        Unique Indexes 7
        Sparse Indexes 7
    1.3 Index Intersection 7
2 Index Concepts 8
    2.1 Index Types 8
        Behavior of Indexes 8
        Index Type Documentation 9
    2.2 Index Properties 25
        TTL Indexes 25
        Unique Indexes 26
        Sparse Indexes 26
    2.3 Index Creation 28
        Background Construction 28
        Drop Duplicates 30
        Index Names 30
    2.4 Index Intersection 31
        Index Prefix Intersection 31
        Index Intersection and Compound Indexes 31
        Index Intersection and Sort 32
3 Indexing Tutorials 32
    3.1 Index Creation Tutorials 33
        Create an Index 33
        Create a Compound Index 34
        Create a Unique Index 35
        Create a Sparse Index 36
        Create a Hashed Index 36
        Build Indexes on Replica Sets 37
        Build Indexes in the Background 39
        Build Old Style Indexes 39
    3.2 Index Management Tutorials 40
        Remove Indexes 40
        Rebuild Indexes 41
        Manage In-Progress Index Creation 41
        Return a List of All Indexes 42
        Measure Index Use 42
    3.3 Geospatial Index Tutorials 43
        Create a 2dsphere Index 43
        Query a 2dsphere Index 44
        Create a 2d Index 46
        Query a 2d Index 47
        Create a Haystack Index 49
        Query a Haystack Index 49
        Calculate Distance Using Spherical Geometry 50
    3.4 Text Search Tutorials 52
        Create a text Index 52
        Specify a Language for Text Index 53
        Create text Index with Long Name 55
        Control Search Results with Weights 55
        Limit the Number of Entries Scanned 56
        Text Search in the Aggregation Pipeline 57
    3.5 Indexing Strategies 59
        Create Indexes to Support Your Queries 59
        Use Indexes to Sort Query Results 61
        Ensure Indexes Fit in RAM 63
        Create Queries that Ensure Selectivity 64
4 Indexing Reference 66
    4.1 Indexing Methods in the mongo Shell 66
    4.2 Indexing Database Commands 66
    4.3 Geospatial Query Selectors 66
    4.4 Indexing Query Modifiers 67
    4.5 Other Index References 67
        Text Search Languages 67
Index 68
Indexes provide high performance read operations for frequently used queries.
This section introduces indexes in MongoDB, describes the types and configuration options for indexes, and describes
special types of indexing MongoDB supports. The section also provides tutorials detailing procedures and operational
concerns, and providing information on how applications may use indexes.
Index Introduction (page 3) An introduction to indexes in MongoDB.
Index Concepts (page 8) The core documentation of indexes in MongoDB, including geospatial and text indexes.
    Index Types (page 8) MongoDB provides different types of indexes for different purposes and different types
    of content.
    Index Properties (page 25) The properties you can specify when building indexes.
    Index Creation (page 28) The options available when creating indexes.
    Index Intersection (page 31) The use of index intersection to fulfill a query.
Indexing Tutorials (page 32) Examples of operations involving indexes, including index creation and querying indexes.
Indexing Reference (page 66) Reference material for indexes in MongoDB.
1 Index Introduction
Indexes support the efficient execution of queries in MongoDB. Without indexes, MongoDB must scan every document
in a collection to select those documents that match the query statement. These collection scans are inefficient because
they require mongod to process a larger volume of data for each operation than an indexed query would.
Indexes are special data structures that store a small portion of the collection's data set in an easy to traverse form.
(MongoDB indexes use a B-tree data structure.) The index stores the value of a specific field or set of fields, ordered
by the value of the field.
Fundamentally, indexes in MongoDB are similar to indexes in other database systems. MongoDB defines indexes at
the collection level and supports indexes on any field or sub-field of the documents in a MongoDB collection.
If an appropriate index exists for a query, MongoDB can use the index to limit the number of documents it must
inspect. In some cases, MongoDB can use the data from the index to determine which documents match a query. The
following diagram illustrates a query that selects documents using an index.
Figure 1: Diagram of a query selecting documents using an index. MongoDB narrows the query by scanning the range
of documents with values of score less than 30.
See the documentation of the query optimizer for more information on the relationship between queries and
indexes.
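The idea behind Figure 1 can be sketched in plain JavaScript. This is a toy model only: a sorted array stands in for the B-tree that mongod maintains, and the sample documents and the helper names buildIndex and idsWithKeyLessThan are hypothetical, not part of any MongoDB API.

```javascript
// Toy model only: a sorted array stands in for the B-tree index.
// The documents and helper names below are made up for illustration.
const docs = [
  { _id: 1, score: 45 },
  { _id: 2, score: 18 },
  { _id: 3, score: 30 },
  { _id: 4, score: 75 },
  { _id: 5, score: 23 },
];

// Build index entries { key, id }, ordered by the value of the indexed field.
function buildIndex(collection, field) {
  return collection
    .map((doc) => ({ key: doc[field], id: doc._id }))
    .sort((a, b) => a.key - b.key);
}

// Walk the ordered entries and stop at the first key outside the range,
// so only the matching part of the index is examined.
function idsWithKeyLessThan(index, bound) {
  const ids = [];
  for (const entry of index) {
    if (entry.key >= bound) break;
    ids.push(entry.id);
  }
  return ids;
}

const scoreIndex = buildIndex(docs, "score");
const matchingIds = idsWithKeyLessThan(scoreIndex, 30); // ids of documents with score < 30
```

Because the entries are ordered, the scan stops as soon as it reaches a score of 30 or more, which is why an index lets the server inspect far fewer documents than a collection scan.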
Tip
Create indexes to support common and user-facing queries. Having these indexes will ensure that MongoDB only
scans the smallest possible number of documents.
Indexes can also optimize the performance of other operations in specific situations:
Sorted Results
MongoDB can use indexes to return documents sorted by the index key directly from the index without requiring an
additional sort phase.
Figure 2: Diagram of a query that uses an index to select and return sorted results. The index stores score values in
ascending order. MongoDB can traverse the index in either ascending or descending order to return sorted results.
Covered Results
When the query criteria and the projection of a query include only the indexed fields, MongoDB will return results
directly from the index without scanning any documents or bringing documents into memory. These covered queries
can be very efficient. Indexes can also cover aggregation pipeline operations.
Figure 3: Diagram of a query that uses only the index to match the query criteria and return the results. MongoDB
does not need to inspect data outside of the index to fulfill the query.
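A small model can make the difference between a covered and a non-covered query concrete. The sketch below assumes a hypothetical people collection with an index on score, and it counts how often a full document has to be fetched. It is not how mongod is implemented, and the projection is assumed to exclude _id.

```javascript
// Toy model: each index entry holds the indexed value and a document reference.
// The collection contents and function names are hypothetical.
const people = new Map([
  [1, { _id: 1, name: "Ann", score: 45 }],
  [2, { _id: 2, name: "Bo", score: 18 }],
  [3, { _id: 3, name: "Cy", score: 23 }],
]);

// Index on score: entries sorted by key, each pointing at a document.
const scoreEntries = [...people.values()]
  .map((doc) => ({ score: doc.score, id: doc._id }))
  .sort((a, b) => a.score - b.score);

let documentsFetched = 0;

// Answer a "score less than bound" query, returning only the projected fields.
function findScoreBelow(bound, fields) {
  // The query is covered when every projected field is stored in the index.
  const covered = fields.every((field) => field === "score");
  const results = [];
  for (const entry of scoreEntries) {
    if (entry.score >= bound) break;
    if (covered) {
      results.push({ score: entry.score }); // answered from the index alone
    } else {
      documentsFetched += 1; // must load the full document
      const doc = people.get(entry.id);
      results.push(Object.fromEntries(fields.map((field) => [field, doc[field]])));
    }
  }
  return results;
}

const coveredResults = findScoreBelow(30, ["score"]);
const fetchesAfterCovered = documentsFetched; // still 0
const uncoveredResults = findScoreBelow(30, ["name", "score"]);
const fetchesAfterUncovered = documentsFetched; // one fetch per matching document
```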
1.1 Index Types
MongoDB provides a number of different index types to support specific types of data and queries.
Default _id
All MongoDB collections have an index on the _id field that exists by default. If applications do not specify a value
for _id the driver or the mongod will create an _id field with an ObjectId value.
The _id index is unique, and prevents clients from inserting two documents with the same value for the _id field.
Single Field
In addition to the MongoDB-defined _id index, MongoDB supports user-defined indexes on a single field of a document
(page 9). Consider the following illustration of a single-field index:
Figure 4: Diagram of an index on the score field (ascending).
Compound Index
MongoDB also supports user-defined indexes on multiple fields. These compound indexes (page 11) behave like
single-field indexes; however, the query can select documents based on additional fields. The order of fields listed
in a compound index has significance. For instance, if a compound index consists of { userid: 1, score: -1 },
the index sorts first by userid and then, within each userid value, sorts by score. Consider the following
illustration of this compound index:
Figure 5: Diagram of a compound index on the userid field (ascending) and the score field (descending). The
index sorts first by the userid field and then by the score field.
Multikey Index
MongoDB uses multikey indexes (page 13) to index the content stored in arrays. If you index a field that holds an
array value, MongoDB creates separate index entries for every element of the array. These multikey indexes (page 13)
allow queries to select documents that contain arrays by matching on an element or elements of the arrays. MongoDB
automatically determines whether to create a multikey index if the indexed field contains an array value; you do not
need to explicitly specify the multikey type.
Consider the following illustration of a multikey index:
Figure 6: Diagram of a multikey index on the addr.zip field. The addr field contains an array of address documents.
The address documents contain the zip field.
Geospatial Index
To support efficient queries of geospatial coordinate data, MongoDB provides two special indexes: 2d indexes
(page 21) that use planar geometry when returning results and 2dsphere indexes (page 18) that use spherical geometry
to return results.
See 2d Index Internals (page 22) for a high level introduction to geospatial indexes.
Text Indexes
MongoDB provides a beta text index type that supports searching for string content in a collection. These text
indexes do not store language-specific stop words (e.g. "the", "a", "or") and stem the words in a collection to only
store root words.
See Text Indexes (page 23) for more information on text indexes and search.
Hashed Indexes
To support hash based sharding, MongoDB provides a hashed index (page 24) type, which indexes the hash of the
value of a field. These indexes have a more random distribution of values along their range, but only support equality
matches and cannot support range-based queries.
1.2 Index Properties
Unique Indexes
The unique (page 26) property for an index causes MongoDB to reject duplicate values for the indexed field. To create
a unique index (page 26) on a field that already has duplicate values, see Drop Duplicates (page 30) for index creation
options. Other than the unique constraint, unique indexes are functionally interchangeable with other MongoDB
indexes.
Sparse Indexes
The sparse (page 26) property of an index ensures that the index only contains entries for documents that have the
indexed field. The index skips documents that do not have the indexed field.
You can combine the sparse index option with the unique index option to reject documents that have duplicate values
for a field but ignore documents that do not have the indexed key.
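The combined effect of the two properties can be sketched as follows. The class name UniqueSparseIndex and the email field are hypothetical, and the model only mirrors the rules stated above: documents without the indexed field are skipped, and a repeated value is rejected.

```javascript
// Toy model of an index that is both unique and sparse.
// UniqueSparseIndex and the "email" field are illustrative names only.
class UniqueSparseIndex {
  constructor(field) {
    this.field = field;
    this.entries = new Map(); // indexed value -> document _id
  }

  // Returns true if an entry was added, false if the document was skipped.
  insert(doc) {
    if (!(this.field in doc)) return false; // sparse: no entry for documents without the field
    const key = doc[this.field];
    if (this.entries.has(key)) {
      // unique: a second document with the same value is rejected
      throw new Error(`duplicate value for ${this.field}: ${key}`);
    }
    this.entries.set(key, doc._id);
    return true;
  }
}

const emailIndex = new UniqueSparseIndex("email");
emailIndex.insert({ _id: 1, email: "a@example.com" }); // indexed
emailIndex.insert({ _id: 2 }); // skipped: no email field
emailIndex.insert({ _id: 3 }); // skipped as well, so no duplicate arises

let duplicateRejected = false;
try {
  emailIndex.insert({ _id: 4, email: "a@example.com" });
} catch (err) {
  duplicateRejected = true; // the repeated email value violates the unique constraint
}
```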
1.3 Index Intersection
New in version 2.6.
MongoDB can use the intersection of indexes (page 31) to fulfill queries. For queries that specify compound query
conditions, if one index can fulfill a part of a query condition, and another index can fulfill another part of the query
condition, then MongoDB can use the intersection of the two indexes to fulfill the query. Whether the use of a
compound index or the use of an index intersection is more efficient depends on the particular query and the system.
For details on index intersection, see Index Intersection (page 31).
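Conceptually, index intersection amounts to combining the document sets that two indexes each identify. The following sketch assumes a hypothetical orders collection with separate single-field indexes on status and qty; the helper names are illustrative and the model says nothing about how the query planner actually chooses or executes an intersection.

```javascript
// Toy model: each single-field index maps a value to the set of matching _ids.
// The orders collection and helper names are hypothetical.
function buildValueIndex(docs, field) {
  const index = new Map();
  for (const doc of docs) {
    if (!index.has(doc[field])) index.set(doc[field], new Set());
    index.get(doc[field]).add(doc._id);
  }
  return index;
}

// Keep only the ids that both indexes report.
function intersectIds(first, second) {
  return [...first].filter((id) => second.has(id));
}

const orders = [
  { _id: 1, status: "A", qty: 5 },
  { _id: 2, status: "A", qty: 20 },
  { _id: 3, status: "B", qty: 20 },
];

const statusIndex = buildValueIndex(orders, "status");
const qtyIndex = buildValueIndex(orders, "qty");

// Documents that satisfy both status "A" and qty 20.
const both = intersectIds(statusIndex.get("A"), qtyIndex.get(20));
```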
2 Index Concepts
These documents describe and provide examples of the types, configuration options, and behavior of indexes in MongoDB.
For an overview of indexing, see Index Introduction (page 3). For operational instructions, see Indexing
Tutorials (page 32). The Indexing Reference (page 66) documents the commands and operations specific to index
construction, maintenance, and querying in MongoDB, including index types and creation options.
Index Types (page 8) MongoDB provides different types of indexes for different purposes and different types of
content.
    Single Field Indexes (page 9) A single field index only includes data from a single field of the documents in a
    collection. MongoDB supports single field indexes on fields at the top level of a document and on fields
    in sub-documents.
    Compound Indexes (page 11) A compound index includes more than one field of the documents in a collection.
    Multikey Indexes (page 13) A multikey index references an array and records a match if a query includes any
    value in the array.
    Geospatial Indexes and Queries (page 16) Geospatial indexes support location-based searches on data that is
    stored as either GeoJSON objects or legacy coordinate pairs.
    Text Indexes (page 23) Text indexes support search of string content in documents.
    Hashed Index (page 24) Hashed indexes maintain entries with hashes of the values of the indexed field.
Index Properties (page 25) The properties you can specify when building indexes.
    TTL Indexes (page 25) The TTL index is used for TTL collections, which expire data after a period of time.
    Unique Indexes (page 26) A unique index causes MongoDB to reject all documents that contain a duplicate
    value for the indexed field.
    Sparse Indexes (page 26) A sparse index does not index documents that do not have the indexed field.
Index Creation (page 28) The options available when creating indexes.
Index Intersection (page 31) The use of index intersection to fulfill a query.
2.1 Index Types
MongoDB provides a number of different index types. You can create indexes on any field or embedded field within
a document or sub-document. You can create single field indexes (page 9) or compound indexes (page 11). MongoDB
also supports indexes of arrays, called multi-key indexes (page 13), as well as indexes on geospatial data
(page 16). For a list of the supported index types, see Index Type Documentation (page 9).
In general, you should create indexes that support your common and user-facing queries. Having these indexes will
ensure that MongoDB scans the smallest possible number of documents.
In the mongo shell, you can create an index by calling the ensureIndex() method. For more detailed instructions
about building indexes, see the Indexing Tutorials (page 32) page.
Behavior of Indexes
All indexes in MongoDB are B-tree indexes, which can efficiently support equality matches and range queries. The
index stores items internally in order sorted by the value of the index field. The ordering of index entries supports
efficient range-based operations and allows MongoDB to return sorted results using the order of documents in the
index.
Ordering of Indexes
MongoDB indexes may be ascending (i.e. 1) or descending (i.e. -1) in their ordering. Nevertheless, MongoDB may
also traverse the index in either direction. As a result, for single-field indexes, ascending and descending indexes are
interchangeable. This is not the case for compound indexes: in compound indexes, the direction of the sort order can
have a greater impact on the results.
See Sort Order (page 12) for more information on the impact of index order on results in compound indexes.
Index Intersection
MongoDB can use the intersection of indexes to fulfill queries with compound conditions. See Index Intersection
(page 31) for details.
Limits
Certain restrictions apply to indexes, such as the length of the index keys or the number of indexes per collection. See
Index Limitations for details.
Index Type Documentation
Single Field Indexes (page 9) A single field index only includes data from a single field of the documents in a
collection. MongoDB supports single field indexes on fields at the top level of a document and on fields in
sub-documents.
Compound Indexes (page 11) A compound index includes more than one field of the documents in a collection.
Multikey Indexes (page 13) A multikey index references an array and records a match if a query includes any value
in the array.
Geospatial Indexes and Queries (page 16) Geospatial indexes support location-based searches on data that is stored
as either GeoJSON objects or legacy coordinate pairs.
Text Indexes (page 23) Text indexes support search of string content in documents.
Hashed Index (page 24) Hashed indexes maintain entries with hashes of the values of the indexed field.
Single Field Indexes
MongoDB provides complete support for indexes on any field in a collection of documents. By default, all collections
have an index on the _id field (page 10), and applications and users may add additional indexes to support important
queries and operations.
MongoDB supports indexes that contain either a single field or multiple fields depending on the operations that this
index-type supports. This document describes indexes that contain a single field. Consider the following illustration
of a single field index.
Figure 7: Diagram of an index on the score field (ascending).
See also:
Compound Indexes (page 11) for information about indexes that include multiple fields, and Index Introduction
(page 3) for a higher level introduction to indexing in MongoDB.
Example
Given the following document in the friends collection:
{ "_id" : ObjectId(...),
  "name" : "Alice",
  "age" : 27
}
The following command creates an index on the name field:
db.friends.ensureIndex( { "name" : 1 } )
Cases
_id Field Index
MongoDB creates the _id index, which is an ascending unique index (page 26) on the _id field,
for all collections when the collection is created. You cannot remove the index on the _id field.
Think of the _id field as the primary key for a collection. Every document must have a unique _id field. You may
store any unique value in the _id field. The default value of _id is an ObjectId which is generated when the client
inserts the document. An ObjectId is a 12-byte unique identifier suitable for use as the value of an _id field.
Note: In sharded clusters, if you do not use the _id field as the shard key, then your application must ensure the
uniqueness of the values in the _id field to prevent errors. This is most often done by using a standard auto-generated
ObjectId.
Before version 2.2, capped collections did not have an _id field. In version 2.2 and newer, capped collections do have
an _id field, except those in the local database. See Capped Collections Recommendations and Restrictions for
more information.
Indexes on Embedded Fields
You can create indexes on fields embedded in sub-documents, just as you can index
top-level fields in documents. Indexes on embedded fields differ from indexes on sub-documents (page 11), which
include the full content of the sub-document in the index, up to the maximum index size. Instead, indexes on
embedded fields allow you to use dot notation to introspect into sub-documents.
Consider a collection named people that holds documents that resemble the following example document:
{ "_id": ObjectId(...),
  "name": "John Doe",
  "address": {
    "street": "Main",
    "zipcode": "53511",
    "state": "WI"
  }
}
You can create an index on the address.zipcode field, using the following specification:
db.people.ensureIndex( { "address.zipcode": 1 } )
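Dot notation names a path through the document, and the value found at the end of that path is what the index stores as its key. A minimal sketch of that lookup, using a hypothetical helper named resolvePath (not a MongoDB function) and the example document without its _id:

```javascript
// Resolve a dotted path such as "address.zipcode" against a document.
// resolvePath is an illustrative helper, not part of the mongo shell.
function resolvePath(doc, path) {
  return path.split(".").reduce(
    (value, part) =>
      value !== null && typeof value === "object" ? value[part] : undefined,
    doc
  );
}

const person = {
  name: "John Doe",
  address: { street: "Main", zipcode: "53511", state: "WI" },
};

// The value an index on "address.zipcode" would use as the key for this document.
const zipKey = resolvePath(person, "address.zipcode");

// A path that does not exist in the document resolves to undefined.
const missingKey = resolvePath(person, "address.country");
```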
Indexes on Subdocuments
You can also create indexes on subdocuments.
For example, the factories collection contains documents that contain a metro field, such as:
{
_id: ObjectId(...),
metro: {
city: "New York",
state: "NY"
},
name: "Giant Factory"
}
The metro field is a subdocument, containing the embedded fields city and state. The following command
creates an index on the metro field as a whole:
db.factories.ensureIndex( { metro: 1 } )
The following query can use the index on the metro field:
db.factories.find( { metro: { city: "New York", state: "NY" } } )
This query returns the above document. When performing equality matches on subdocuments, field order matters and
the subdocuments must match exactly. For example, the following query does not match the above document:
db.factories.find( { metro: { state: "NY", city: "New York" } } )
See query-subdocuments for more information regarding querying on subdocuments.
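The order-sensitive comparison described above can be modeled with a short function. This is a sketch for flat subdocuments only, and subdocumentEquals is a hypothetical helper name rather than anything MongoDB provides.

```javascript
// Order-sensitive equality for flat subdocuments: the same fields,
// in the same order, with the same values.
// subdocumentEquals is an illustrative helper, not a MongoDB function.
function subdocumentEquals(a, b) {
  const keysA = Object.keys(a);
  const keysB = Object.keys(b);
  if (keysA.length !== keysB.length) return false;
  return keysA.every((key, i) => key === keysB[i] && a[key] === b[key]);
}

const storedMetro = { city: "New York", state: "NY" };

// Same fields in the same order: matches.
const sameOrder = subdocumentEquals(storedMetro, { city: "New York", state: "NY" });

// Same fields in a different order: does not match.
const swappedOrder = subdocumentEquals(storedMetro, { state: "NY", city: "New York" });
```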
Compound Indexes
MongoDB supports compound indexes, where a single index structure holds references to multiple fields within a
collection's documents. (MongoDB imposes a limit of 31 fields for any compound index.) The following diagram
illustrates an example of a compound index on two fields:
Figure 8: Diagram of a compound index on the userid field (ascending) and the score field (descending). The
index sorts first by the userid field and then by the score field.
Compound indexes can support queries that match on multiple fields.
Example
Consider a collection named products that holds documents that resemble the following document:
{
"_id": ObjectId(...),
"item": "Banana",
"category": ["food", "produce", "grocery"],
"location": "4th Street Store",
"stock": 4,
"type": "cases",
"arrival": Date(...)
}
If applications query on the item field alone as well as on both the item field and the stock field, you can specify
a single compound index to support both of these queries:
db.products.ensureIndex( { "item": 1, "stock": 1 } )
Important: You may not create compound indexes that have hashed index fields. You will receive an error if you
attempt to create a compound index that includes a hashed index (page 24).
The order of the fields in a compound index is very important. In the previous example, the index will contain
references to documents sorted first by the values of the item field and, within each value of the item field, sorted
by values of the stock field. See Sort Order (page 12) for more information.
In addition to supporting queries that match on all the index fields, compound indexes can support queries that match
on the prefix of the index fields. For details, see Prefixes (page 13).
Sort Order
Indexes store references to fields in either ascending (1) or descending (-1) sort order. For single-field
indexes, the sort order of keys doesn't matter because MongoDB can traverse the index in either direction. However,
for compound indexes (page 11), sort order can matter in determining whether the index can support a sort operation.
Consider a collection events that contains documents with the fields username and date. Applications can issue
queries that return results sorted first by ascending username values and then by descending (i.e. most recent to
oldest) date values, such as:
db.events.find().sort( { username: 1, date: -1 } )
or queries that return results sorted first by descending username values and then by ascending date values, such
as:
db.events.find().sort( { username: -1, date: 1 } )
The following index can support both these sort operations:
db.events.ensureIndex( { "username" : 1, "date" : -1 } )
However, the above index cannot support sorting by ascending username values and then by ascending date
values, such as the following:
db.events.find().sort( { username: 1, date: 1 } )
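The rule behind these three cases is that a sort can use the index when its directions match the index exactly or are the exact inverse, since the index can be walked forward or backward. The following sketch encodes that rule. It is a simplification that ignores query predicates, and canSupportSort is a hypothetical helper; key specifications are written as arrays of [field, direction] pairs to keep the field order explicit.

```javascript
// indexSpec and sortSpec are arrays of [field, direction] pairs.
// canSupportSort is an illustrative helper that ignores query predicates.
function canSupportSort(indexSpec, sortSpec) {
  if (sortSpec.length === 0 || sortSpec.length > indexSpec.length) return false;
  // +1 when the sort follows the index direction, -1 when it is the exact inverse.
  const orientation = sortSpec[0][1] * indexSpec[0][1];
  return sortSpec.every(([field, direction], i) => {
    const [indexField, indexDirection] = indexSpec[i];
    return field === indexField && direction * indexDirection === orientation;
  });
}

// The index { username: 1, date: -1 } from the example above.
const eventsIndex = [["username", 1], ["date", -1]];

const forward = canSupportSort(eventsIndex, [["username", 1], ["date", -1]]); // index order
const backward = canSupportSort(eventsIndex, [["username", -1], ["date", 1]]); // reverse traversal
const mixed = canSupportSort(eventsIndex, [["username", 1], ["date", 1]]); // neither direction fits
```

Here forward and backward are true and mixed is false, matching the three sort operations discussed above.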
Prefixes
Compound indexes support queries on any prefix of the index fields. Index prefixes are the beginning
subset of indexed fields. For example, given the index { a: 1, b: 1, c: 1 }, both { a: 1 } and
{ a: 1, b: 1 } are prefixes of the index.
If you have a collection that has a compound index on { a: 1, b: 1 }, as well as an index that consists of the
prefix of that index, i.e. { a: 1 }, assuming neither index has a sparse or unique constraint, then you can
drop the { a: 1 } index. MongoDB will be able to use the compound index in all of the situations in which it would
have used the { a: 1 } index.
For example, given the following index:
{ "item": 1, "location": 1, "stock": 1 }
MongoDB can use this index to support queries that include:
• the item field,
• the item field and the location field,
• the item field and the location field and the stock field, or
• only the item and stock fields; however, this index would be less efficient than an index on only item and
  stock.
MongoDB cannot use this index to support queries that include:
• only the location field,
• only the stock field, or
• only the location and stock fields.
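The two lists above follow from one rule: a query can use the index only as far as it constrains the leading index fields without a gap. A short sketch of that rule, with hypothetical helper names (indexPrefixes, usablePrefixLength) and field lists given as ordered arrays:

```javascript
// List the proper prefixes of a compound index, given its fields in order.
// Both helpers are illustrative and not part of any MongoDB API.
function indexPrefixes(fields) {
  return fields.slice(0, -1).map((_, i) => fields.slice(0, i + 1));
}

// Count how many leading index fields the query constrains without a gap.
// 0 means the query does not include the first index field at all.
function usablePrefixLength(indexFields, queryFields) {
  let length = 0;
  while (length < indexFields.length && queryFields.includes(indexFields[length])) {
    length += 1;
  }
  return length;
}

const abcPrefixes = indexPrefixes(["a", "b", "c"]); // the prefixes on a, and on a and b

const productIndex = ["item", "location", "stock"];
const itemOnly = usablePrefixLength(productIndex, ["item"]);
const itemAndStock = usablePrefixLength(productIndex, ["item", "stock"]);
const locationAndStock = usablePrefixLength(productIndex, ["location", "stock"]);
```

In this model a query on item and stock gets a usable prefix length of 1: only the item part of the index narrows the search, which is why the text calls that case less efficient than a dedicated index on item and stock.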
Index Intersection
Starting in version 2.6, MongoDB can use index intersection (page 31) to fulfill queries. The
choice between creating compound indexes that support your queries or relying on index intersection depends on the
specifics of your system. See Index Intersection and Compound Indexes (page 31) for more details.
Multikey Indexes
To index a field that holds an array value, MongoDB adds index items for each item in the array. These multikey indexes
allow MongoDB to return documents from queries using the value of an array. MongoDB automatically determines
whether to create a multikey index if the indexed field contains an array value; you do not need to explicitly specify
the multikey type.
Consider the following illustration of a multikey index:
Figure 9: Diagram of a multikey index on the addr.zip field. The addr field contains an array of address documents.
The address documents contain the zip field.
Multikey indexes support all operations supported by other MongoDB indexes; however, applications may use multikey
indexes to select documents based on ranges of values for the value of an array. Multikey indexes support arrays
that hold both values (e.g. strings, numbers) and nested documents.
Limitations
Interactions between Compound and Multikey Indexes
While you can create multikey compound indexes
(page 11), at most one field in a compound index may hold an array. For example, given an index on { a: 1,
b: 1 }, the following documents are permissible:
{a: [1, 2], b: 1}
{a: 1, b: [1, 2]}
However, the following document is impermissible, and MongoDB cannot insert such a document into a collection
with the {a: 1, b: 1 } index:
{a: [1, 2], b: [1, 2]}
If you attempt to insert such a document, MongoDB will reject the insertion, and produce an error that says "cannot
index parallel arrays". MongoDB does not index parallel arrays because they require the index to include
each value in the Cartesian product of the compound keys, which could quickly result in incredibly large and difficult
to maintain indexes.
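The restriction and the reason for it can be sketched together. The helper names below are hypothetical; the check mirrors the rule that at most one indexed field of a document may hold an array, and the counter shows how many entries a Cartesian product would need.

```javascript
// Reject documents in which more than one indexed field holds an array.
// Both helpers are illustrative models, not MongoDB functions.
function assertIndexable(indexFields, doc) {
  const arrayFields = indexFields.filter((field) => Array.isArray(doc[field]));
  if (arrayFields.length > 1) {
    throw new Error("cannot index parallel arrays");
  }
}

// Number of index entries needed if every combination of array elements were indexed.
function cartesianEntryCount(indexFields, doc) {
  return indexFields.reduce(
    (count, field) => count * (Array.isArray(doc[field]) ? doc[field].length : 1),
    1
  );
}

const compoundFields = ["a", "b"];

assertIndexable(compoundFields, { a: [1, 2], b: 1 }); // permissible
assertIndexable(compoundFields, { a: 1, b: [1, 2] }); // permissible

let parallelRejected = false;
try {
  assertIndexable(compoundFields, { a: [1, 2], b: [1, 2] });
} catch (err) {
  parallelRejected = true; // both indexed fields hold arrays
}

// Two arrays of two elements would already require four entries for one document.
const entriesNeeded = cartesianEntryCount(compoundFields, { a: [1, 2], b: [1, 2] });
```

With two arrays of 100 elements each, the same product would be 10,000 entries for a single document, which illustrates why such indexes would quickly become large.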
Shard Keys
Important: The index of a shard key cannot be a multi-key index.
Hashed Indexes
Hashed indexes are not compatible with multi-key indexes.
To compute the hash for a hashed index, MongoDB collapses sub-documents and computes the hash for the entire
value. For fields that hold arrays or sub-documents, you cannot use the index to support queries that introspect the
sub-document.
Examples
Index Basic Arrays
Given the following document:
{
"_id" : ObjectId("..."),
"name" : "Warm Weather",
"author" : "Steve",
"tags" : [ "weather", "hot", "record", "april" ]
}
Then an index on the tags field, { tags: 1 }, would be a multikey index and would include these four separate
entries for that document:
"weather",
"hot",
"record", and
"april".
Queries could use the multikey index to return documents that match any of the above values.
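As a rough illustration of why one document yields several index entries, the following plain JavaScript sketch builds an in-memory lookup table from array elements to document ids. The function buildMultikeyIndex and the second sample document are hypothetical; the real index is a B-tree maintained by the server.

```javascript
// Hypothetical in-memory model of a multikey index on a single field.
function buildMultikeyIndex(docs, field) {
  const index = new Map();
  for (const doc of docs) {
    const value = doc[field];
    // An array value contributes one entry per distinct element;
    // a scalar value contributes a single entry.
    const keys = Array.isArray(value) ? value : [value];
    for (const key of new Set(keys)) {
      if (!index.has(key)) index.set(key, []);
      index.get(key).push(doc._id);
    }
  }
  return index;
}

const posts = [
  { _id: 1, name: "Warm Weather", author: "Steve",
    tags: ["weather", "hot", "record", "april"] },
  // Made-up second document, added only to show a shared key.
  { _id: 2, name: "Cold Snap", author: "Ann", tags: ["weather", "cold"] },
];

const tagIndex = buildMultikeyIndex(posts, "tags");
console.log(tagIndex.get("hot"));     // ids of documents tagged "hot"
console.log(tagIndex.get("weather")); // ids of documents tagged "weather"
```

A lookup on any single tag reaches the document directly, which is the behavior the multikey index provides for array fields.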
Index Arrays with Embedded Documents You can create multikey indexes on fields in objects embedded in arrays,
as in the following example:
Consider a feedback collection with documents in the following form:
{
"_id": ObjectId(...),
"title": "Grocery Quality",
"comments": [
{ author_id: ObjectId(...),
date: Date(...),
text: "Please expand the cheddar selection." },
{ author_id: ObjectId(...),
date: Date(...),
text: "Please expand the mustard selection." },
{ author_id: ObjectId(...),
date: Date(...),
text: "Please expand the olive selection." }
]
}
An index on the comments.text field would be a multikey index and would add items to the index for all embedded
documents in the array.
With the index { "comments.text": 1 } on the feedback collection, consider the following query:
db.feedback.find( { "comments.text": "Please expand the olive selection." } )
The query would select the documents in the collection that contain the following embedded document in the
comments array:
{ author_id: ObjectId(...),
date: Date(...),
text: "Please expand the olive selection." }
Geospatial Indexes and Queries
MongoDB offers a number of indexes and query mechanisms to handle geospatial information. This section introduces
MongoDB's geospatial features. For complete examples of geospatial queries in MongoDB, see Geospatial Index
Tutorials (page 43).
Surfaces Before storing your location data and writing queries, you must decide the type of surface to use to perform
calculations. The type you choose affects how you store data, what type of index to build, and the syntax of your
queries.
MongoDB offers two surface types:
Spherical To calculate geometry over an Earth-like sphere, store your location data on a spherical surface and use
a 2dsphere (page 18) index.
Store your location data as GeoJSON objects with this coordinate-axis order: longitude, latitude. The coordinate
reference system for GeoJSON uses the WGS84 datum.
Flat To calculate distances on a Euclidean plane, store your location data as legacy coordinate pairs and use a 2d
(page 21) index.
Location Data If you choose spherical surface calculations, you store location data as either:
GeoJSON Objects Queries on GeoJSON objects always calculate on a sphere. The default coordinate reference
system for GeoJSON uses the WGS84 datum.
New in version 2.4: Support for GeoJSON storage and queries is new in version 2.4. Prior to version 2.4, all geospatial
data used coordinate pairs.
MongoDB supports the following GeoJSON objects:
Point
LineString
Polygon
Legacy Coordinate Pairs MongoDB supports spherical surface calculations on legacy coordinate pairs using a
2dsphere index by converting the data to the GeoJSON Point type.
If you choose flat surface calculations and use a 2d index, you can store data only as legacy coordinate pairs.
Query Operations MongoDB's geospatial query operators let you query for:
Inclusion MongoDB can query for locations contained entirely within a specified polygon. Inclusion queries use
the $geoWithin operator.
Both 2d and 2dsphere indexes can support inclusion queries. MongoDB does not require an index for inclusion
queries after 2.2.3; however, these indexes will improve query performance.
Intersection MongoDB can query for locations that intersect with a specified geometry. These queries apply only
to data on a spherical surface. These queries use the $geoIntersects operator.
Only 2dsphere indexes support intersection.
Proximity MongoDB can query for the points nearest to another point. Proximity queries use the $near operator.
The $near operator requires a 2d or 2dsphere index.
Geospatial Indexes MongoDB provides the following geospatial index types to support the geospatial queries.
2dsphere 2dsphere (page 18) indexes support:
Calculations on a sphere
GeoJSON objects and include backwards compatibility for legacy coordinate pairs.
A compound index with scalar index fields (i.e. ascending or descending) as a prefix or suffix of the 2dsphere
index field
New in version 2.4: 2dsphere indexes are not available before version 2.4.
See also:
Query a 2dsphere Index (page 44)
2d 2d (page 21) indexes support:
Calculations using flat geometry
Legacy coordinate pairs (i.e., geospatial points on a flat coordinate system)
A compound index with only one additional field, as a suffix of the 2d index field
See also:
Query a 2d Index (page 47)
Geospatial Indexes and Sharding You cannot use a geospatial index as the shard key index.
You can create and maintain a geospatial index on a sharded collection if using fields other than the shard key.
Queries using $near are not supported for sharded collections. Use geoNear instead. You also can query for
geospatial data using $geoWithin.
Additional Resources The following pages provide complete documentation for geospatial indexes and queries:
2dsphere Indexes (page 18) A 2dsphere index supports queries that calculate geometries on an earth-like sphere.
The index supports data stored as both GeoJSON objects and as legacy coordinate pairs.
2d Indexes (page 21) The 2d index supports data stored as legacy coordinate pairs and is intended for use in
MongoDB 2.2 and earlier.
Haystack Indexes (page 21) A haystack index is a special index optimized to return results over small areas. For
queries that use spherical geometry, a 2dsphere index is a better option than a haystack index.
2d Index Internals (page 22) Provides a more in-depth explanation of the internals of geospatial indexes. This
material is not necessary for normal operations but may be useful for troubleshooting and for further understanding.
2dsphere Indexes New in version 2.4.
A 2dsphere index supports queries that calculate geometries on an earth-like sphere. The index supports data stored
as both GeoJSON objects and as legacy coordinate pairs. The index supports legacy coordinate pairs by converting
the data to the GeoJSON Point type.
The 2dsphere index supports all MongoDB geospatial queries: queries for inclusion, intersection and proximity.
A compound (page 11) 2dsphere index can reference multiple location and non-location fields within a collection's
documents. You can arrange the fields in any order.
The default datum for an earth-like sphere in MongoDB 2.4 is WGS84. Coordinate-axis order is longitude, latitude.
See http://docs.mongodb.org/manual/reference/operator/query-geospatial for the
query operators that support geospatial queries.
2dsphere Version 2 Changed in version 2.6.
MongoDB 2.6 introduces a version 2 of 2dsphere indexes. Version 2 is the default version of 2dsphere
indexes created in MongoDB 2.6. To create a 2dsphere index as a version 1, include the option {
"2dsphereIndexVersion": 1 } when creating the index.
Version 2 adds support for additional GeoJSON objects: MultiPoint (page 19), MultiLineString (page 20),
MultiPolygon (page 20), and GeometryCollection (page 20).
Version 2 2dsphere indexes are sparse by default and ignore the sparse: true (page 26) option. If a document lacks
a 2dsphere index field (or the field is null or an empty array), MongoDB does not add an entry for the document
to the 2dsphere index. For inserts, MongoDB inserts the document but does not add to the 2dsphere index.
Version 1 2dsphere indexes are not sparse by default and will reject documents with null location fields.
Considerations MongoDB allows only one geospatial index per collection. You can create either a 2dsphere or
a 2d (page 21) index per collection.
You cannot use a 2dsphere index as a shard key when sharding a collection. However, you can create and maintain
a geospatial index on a sharded collection by using a different field as the shard key.
GeoJSON Objects MongoDB supports the following GeoJSON objects:
Point (page 19)
LineString (page 19)
Polygon (page 19)
MultiPoint (page 19)
MultiLineString (page 20)
MultiPolygon (page 20)
GeometryCollection (page 20)
The MultiPoint (page 19), MultiLineString (page 20), MultiPolygon (page 20), and GeometryCollection (page 20)
objects require 2dsphere index version 2.
In order to index GeoJSON data, you must store the data in a location field that you name. The location field contains
a subdocument with a type field specifying the GeoJSON object type and a coordinates field specifying the
object's coordinates. Always store coordinates in longitude, latitude order.
Use the following syntax:
{ <location field> : { type : "<GeoJSON type>" ,
coordinates : <coordinates> } }
Point New in version 2.4.
The following example stores a GeoJSON Point:
{ loc : { type : "Point" ,
coordinates : [ 40, 5 ] } }
LineString New in version 2.4.
The following example stores a GeoJSON LineString:
{ loc : { type : "LineString" ,
coordinates : [ [ 40 , 5 ] , [ 41 , 6 ] ] } }
Polygon New in version 2.4.
Polygons consist of an array of GeoJSON LinearRing coordinate arrays. These LinearRings are closed
LineStrings. Closed LineStrings have at least four coordinate pairs and specify the same position as the
first and last coordinates.
The following example stores a GeoJSON Polygon with an exterior ring and no interior rings (or holes). Note the
first and last coordinate pair with the [ 0 , 0 ] coordinate:
{ loc :
{ type : "Polygon" ,
coordinates : [ [ [ 0 , 0 ] , [ 3 , 6 ] , [ 6 , 1 ] , [ 0 , 0 ] ] ] } }
For Polygons with multiple rings:
The first described ring must be the exterior ring.
The exterior ring cannot self-intersect.
Any interior ring must be entirely contained by the outer ring.
Interior rings cannot intersect or overlap each other. Interior rings can share an edge.
The following document represents a polygon with an interior ring as GeoJSON:
{ loc :
{ type : "Polygon" ,
coordinates : [ [ [ 0 , 0 ] , [ 3 , 6 ] , [ 6 , 1 ] , [ 0 , 0 ] ],
[ [ 2 , 2 ] , [ 3 , 3 ] , [ 4 , 2 ] , [ 2 , 2 ] ] ] } }
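The closed-ring rule lends itself to a quick client-side check before inserting a document. The following plain JavaScript sketch uses hypothetical helper names (isClosedLinearRing, hasClosedRings) and covers only the "at least four positions, same first and last position" rule; it does not test self-intersection or whether interior rings lie inside the exterior ring.

```javascript
// Hypothetical helper: checks the LinearRing rule described above.
function isClosedLinearRing(ring) {
  // A closed ring needs at least four positions.
  if (!Array.isArray(ring) || ring.length < 4) return false;
  const first = ring[0];
  const last = ring[ring.length - 1];
  // The first and last positions must be identical.
  return first[0] === last[0] && first[1] === last[1];
}

// Checks that every ring of a GeoJSON Polygon is a closed LinearRing.
function hasClosedRings(polygon) {
  return polygon.type === "Polygon" &&
    Array.isArray(polygon.coordinates) &&
    polygon.coordinates.length > 0 &&
    polygon.coordinates.every(isClosedLinearRing);
}

const withHole = {
  type: "Polygon",
  coordinates: [
    [[0, 0], [3, 6], [6, 1], [0, 0]], // exterior ring
    [[2, 2], [3, 3], [4, 2], [2, 2]], // interior ring
  ],
};
console.log(hasClosedRings(withHole)); // true
```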
Figure 10: Diagram of a Polygon with internal ring.
MultiPoint New in version 2.6: Requires 2dsphere index version 2.
The following example stores coordinates of GeoJSON type MultiPoint [3]:
{ loc: { "type": "MultiPoint",
         "coordinates": [ [ -73.9580, 40.8003 ],
                          [ -73.9498, 40.7968 ],
                          [ -73.9737, 40.7648 ],
                          [ -73.9814, 40.7681 ] ] } }
[3] http://geojson.org/geojson-spec.html#id5
MultiLineString New in version 2.6: Requires 2dsphere index version 2.
The following example stores coordinates of GeoJSON type MultiLineString [4]:
{ loc: { "type": "MultiLineString",
"coordinates": [ [ [ -73.96943, 40.78519 ], [ -73.96082, 40.78095 ] ],
[ [ -73.96415, 40.79229 ], [ -73.95544, 40.78854 ] ],
[ [ -73.97162, 40.78205 ], [ -73.96374, 40.77715 ] ],
[ [ -73.97880, 40.77247 ], [ -73.97036, 40.76811 ] ] ] } }
MultiPolygon New in version 2.6: Requires 2dsphere index version 2.
The following example stores coordinates of GeoJSON type MultiPolygon [5]:
{ loc: { "type": "MultiPolygon",
"coordinates": [ [ [ [ -73.958, 40.8003 ], [ -73.9498, 40.7968 ], [ -73.9737, 40.7648 ], [ -73.9814, 40.7681 ], [ -73.958, 40.8003 ] ] ],
[ [ [ -73.958, 40.8003 ], [ -73.9498, 40.7968 ], [ -73.9737, 40.7648 ], [ -73.958, 40.8003 ] ] ] ] } }
GeometryCollection New in version 2.6: Requires 2dsphere index version 2.
The following example stores coordinates of GeoJSON type GeometryCollection [6]:
{ loc: { "type": "GeometryCollection",
         "geometries": [ { "type": "MultiPoint",
                           "coordinates": [ [ -73.9580, 40.8003 ],
                                            [ -73.9498, 40.7968 ],
                                            [ -73.9737, 40.7648 ],
                                            [ -73.9814, 40.7681 ] ] },
                         { "type": "MultiLineString",
                           "coordinates": [ [ [ -73.96943, 40.78519 ], [ -73.96082, 40.78095 ] ],
                                            [ [ -73.96415, 40.79229 ], [ -73.95544, 40.78854 ] ],
                                            [ [ -73.97162, 40.78205 ], [ -73.96374, 40.77715 ] ],
                                            [ [ -73.97880, 40.77247 ], [ -73.97036, 40.76811 ] ] ] } ] } }
[4] http://geojson.org/geojson-spec.html#id6
[5] http://geojson.org/geojson-spec.html#id7
[6] http://geojson.org/geojson-spec.html#geometrycollection
2d Indexes Use a 2d index for data stored as points on a two-dimensional plane. The 2d index is intended for
legacy coordinate pairs used in MongoDB 2.2 and earlier.
Use a 2d index if:
your database has legacy location data from MongoDB 2.2 or earlier, and
you do not intend to store any location data as GeoJSON objects.
See http://docs.mongodb.org/manual/reference/operator/query-geospatial for the
query operators that support geospatial queries.
Considerations MongoDB allows only one geospatial index per collection. You can create either a 2d or a 2dsphere
(page 18) index per collection.
Do not use a 2d index if your location data includes GeoJSON objects. To index on both legacy coordinate pairs and
GeoJSON objects, use a 2dsphere (page 18) index.
You cannot use a 2d index as a shard key when sharding a collection. However, you can create and maintain a
geospatial index on a sharded collection by using a different field as the shard key.
Behavior The 2d index supports calculations on a flat, Euclidean plane. The 2d index also supports distance-only
calculations on a sphere, but for geometric calculations (e.g. $geoWithin) on a sphere, store data as GeoJSON
objects and use the 2dsphere index type.
A 2d index can reference two fields. The first must be the location field. A 2d compound index constructs queries
that select first on the location field and then filter those results by the additional criteria. A compound 2d index
can cover queries.
Points on a 2D Plane To store location data as legacy coordinate pairs, use an array or an embedded document.
When possible, use the array format:
loc : [ <longitude> , <latitude> ]
Consider the embedded document form:
loc : { lng : <longitude> , lat : <latitude> }
Arrays are preferred as certain languages do not guarantee associative map ordering.
For all points, if you use longitude and latitude, store coordinates in longitude, latitude order.
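The two legacy forms carry the same information, so an application can normalize them before storing or comparing locations. The following plain JavaScript sketch uses hypothetical helper names (toLngLat, toGeoJSONPoint) and assumes the embedded-document form uses the lng and lat field names from the example above; it also shows the GeoJSON Point that corresponds to a legacy pair.

```javascript
// Hypothetical helper: normalizes either legacy form to [longitude, latitude].
function toLngLat(loc) {
  if (Array.isArray(loc)) {
    // Array form: already in longitude, latitude order.
    return [loc[0], loc[1]];
  }
  // Embedded-document form; the lng and lat names are taken from the example above.
  return [loc.lng, loc.lat];
}

// Builds the GeoJSON Point that corresponds to a legacy coordinate pair.
function toGeoJSONPoint(loc) {
  return { type: "Point", coordinates: toLngLat(loc) };
}

console.log(toGeoJSONPoint([-74, 44.74]));
console.log(toGeoJSONPoint({ lng: 55.5, lat: 42.3 }));
```

Reading the embedded document by field name, as this sketch does, sidesteps the map-ordering concern mentioned above, but only if every writer uses the same field names.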
Haystack Indexes A haystack index is a special index that is optimized to return results over small areas. Haystack
indexes improve performance on queries that use flat geometry.
For queries that use spherical geometry, a 2dsphere index is a better option than a haystack index. 2dsphere indexes
(page 18) allow field reordering; haystack indexes require the first field to be the location field. Also, haystack indexes
are only usable via commands and so always return all results at once.
Haystack indexes create buckets of documents from the same geographic area in order to improve performance for
queries limited to that area. Each bucket in a haystack index contains all the documents within a specified proximity
to a given longitude and latitude.
To create a geoHaystack index, see Create a Haystack Index (page 49). For information and examples on querying a
haystack index, see Query a Haystack Index (page 49).
2d Index Internals This document provides a more in-depth explanation of the internals of MongoDB's 2d geospatial
indexes. This material is not necessary for normal operations or application development but may be useful for
troubleshooting and for further understanding.
Calculation of Geohash Values for 2d Indexes When you create a geospatial index on legacy coordinate pairs,
MongoDB computes geohash values for the coordinate pairs within the specified location range (page 46) and then
indexes the geohash values.
To calculate a geohash value, recursively divide a two-dimensional map into quadrants. Then assign each quadrant a
two-bit value. For example, a two-bit representation of four quadrants would be:
01 11
00 10
These two-bit values (00, 01, 10, and 11) represent each of the quadrants and all points within each quadrant. For
a geohash with two bits of resolution, all points in the bottom left quadrant would have a geohash of 00. The top
left quadrant would have the geohash of 01. The bottom right and top right would have a geohash of 10 and 11,
respectively.
To provide additional precision, continue dividing each quadrant into sub-quadrants. Each sub-quadrant would have
the geohash value of the containing quadrant concatenated with the value of the sub-quadrant. The geohash for the
upper-right quadrant is 11, and the geohash for the sub-quadrants would be (clockwise from the top left): 1101,
1111, 1110, and 1100, respectively.
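The subdivision described above can be sketched in plain JavaScript. The function geohashBits is hypothetical and makes two assumptions: the same range (here -180 inclusive to 180 exclusive) applies to both axes, and each level contributes two bits, the first for the horizontal half (1 = right) and the second for the vertical half (1 = top), which matches the quadrant layout shown earlier. It is not MongoDB's implementation.

```javascript
// Sketch of the recursive quadrant subdivision described above.
// Each level appends two bits: horizontal half first, then vertical half.
function geohashBits(x, y, levels, min = -180, max = 180) {
  let xLow = min, xHigh = max, yLow = min, yHigh = max;
  let bits = "";
  for (let i = 0; i < levels; i++) {
    const xMid = (xLow + xHigh) / 2;
    const yMid = (yLow + yHigh) / 2;
    // Right half gets 1, left half gets 0; then narrow the x range.
    if (x >= xMid) { bits += "1"; xLow = xMid; } else { bits += "0"; xHigh = xMid; }
    // Top half gets 1, bottom half gets 0; then narrow the y range.
    if (y >= yMid) { bits += "1"; yLow = yMid; } else { bits += "0"; yHigh = yMid; }
  }
  return bits;
}

console.log(geohashBits(-90, -90, 1)); // "00": bottom left quadrant
console.log(geohashBits(45, 135, 2));  // "1101": top left sub-quadrant of the upper-right quadrant
```

Points that share a longer bit prefix lie in the same smaller cell, which is why nearby points tend to have similar geohash values.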
Multi-location Documents for 2d Indexes New in version 2.0: Support for multiple locations in a document.
While 2d geospatial indexes do not support more than one set of coordinates in a document, you can use a multi-key
index (page 13) to index multiple coordinate pairs in a single document. In the simplest example you may have a field
(e.g. locs) that holds an array of coordinates, as in the following example:
{ _id : ObjectId(...),
locs : [ [ 55.5 , 42.3 ] ,
[ -74 , 44.74 ] ,
{ lng : 55.5 , lat : 42.3 } ]
}
The values of the array may be either arrays, as in [ 55.5, 42.3 ], or embedded documents, as in { lng :
55.5 , lat : 42.3 }.
You could then create a geospatial index on the locs field, as in the following:
db.places.ensureIndex( { "locs": "2d" } )
You may also model the location data as a field inside of a sub-document. In this case, the document would contain
a field (e.g. addresses) that holds an array of documents where each document has a field (e.g. loc) that holds
location coordinates. For example:
{ _id : ObjectId(...),
name : "...",
addresses : [ {
context : "home" ,
loc : [ 55.5, 42.3 ]
} ,
{
context : "home",
loc : [ -74 , 44.74 ]
}
]
}
You could then create the geospatial index on the addresses.loc field as in the following example:
db.records.ensureIndex( { "addresses.loc": "2d" } )
To include the location field with the distance field in multi-location document queries, specify includeLocs:
true in the geoNear command.
See also:
geospatial-query-compatibility-chart
Text Indexes
New in version 2.4.
MongoDB provides text indexes to support text search of string content in documents of a collection.
text indexes can include any field whose value is a string or an array of string elements. To perform queries that
access the text index, use the $text query operator.
Changed in version 2.6: MongoDB enables the text search feature by default. In MongoDB 2.4, you need to enable
the text search feature manually to create text indexes and perform text search (page 24).
Create Text Index To create a text index, use the db.collection.ensureIndex() method. To index a
field that contains a string or an array of string elements, include the field and specify the string literal "text" in the
index document, as in the following example:
db.reviews.ensureIndex( { comments: "text" } )
A collection can have at most one text index.
For examples of creating text indexes on multiple fields, see Create a text Index (page 52).
Supported Languages and Stop Words MongoDB supports text search for various languages. text indexes drop
language-specific stop words (e.g. in English, the, an, a, and, etc.) and use simple language-specific suffix
stemming. For a list of the supported languages, see Text Search Languages (page 67).
If the index language is English, text indexes are case-insensitive for non-diacritics; i.e. case insensitive for [A-Za-z].
To specify a language for the text index, see Specify a Language for Text Index (page 53).
Restrictions
Text Search and Hints You cannot use hint() if the query includes a $text query expression.
Compound Index A compound index (page 11) can include a text index key in combination with
ascending/descending index keys. However, these compound indexes have the following restrictions:
A compound text index cannot include any other special index types, such as multi-key (page 13) or geospatial
(page 17) index fields.
If the compound text index includes keys preceding the text index key, to perform a $text search, the query
predicate must include equality match conditions on the preceding keys.
See Limit the Number of Entries Scanned (page 56).
Storage Requirements and Performance Costs text indexes have the following storage requirements and per-
formance costs:
text indexes change the space allocation method for all future record allocations in a collection to
usePowerOf2Sizes.
text indexes can be large. They contain one index entry for each unique post-stemmed word in each indexed
field for each document inserted.
Building a text index is very similar to building a large multi-key index and will take longer than building a
simple ordered (scalar) index on the same data.
When building a large text index on an existing collection, ensure that you have a sufficiently high limit on
open file descriptors. See the recommended settings.
text indexes will impact insertion throughput because MongoDB must add an index entry for each unique
post-stemmed word in each indexed field of each new source document.
Additionally, text indexes do not store phrases or information about the proximity of words in the documents.
As a result, phrase queries will run much more effectively when the entire collection fits in RAM.
Text Search Text search supports the search of string content in documents of a collection. MongoDB provides the
$text operator to perform text search in queries and in aggregation pipelines (page 57).
The text search process:
tokenizes and stems the search term(s) during both the index creation and the text command execution.
assigns a score to each document that contains the search term in the indexed fields. The score determines the
relevance of a document to a given search query.
The $text operator can search for words and phrases. The query matches on the complete stemmed words. For
example, if a document field contains the word blueberry, a search on the term blue will not match the document.
However, a search on either blueberry or blueberries will match.
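The whole-stemmed-word behavior can be approximated with a toy model in plain JavaScript. The helper names (toyStem, indexTerms, matchesTerm) and the two suffix rules are assumptions of this sketch and stand in for the real language-specific stemmer; the stop word list reuses the English examples mentioned earlier.

```javascript
// Toy model of stemmed whole-word matching. The suffix rules are a deliberately
// small stand-in for a real stemmer and are an assumption of this sketch.
const STOP_WORDS = new Set(["the", "an", "a", "and"]);

function toyStem(word) {
  const w = word.toLowerCase();
  // "blueberries" -> "blueberry"
  if (w.length > 4 && w.endsWith("ies")) return w.slice(0, -3) + "y";
  // Strip a plain plural "s" ("pies" -> "pie"), but leave words such as "glass".
  if (w.length > 3 && w.endsWith("s") && !w.endsWith("ss")) return w.slice(0, -1);
  return w;
}

// Tokenizes text, drops stop words, and returns the set of stemmed terms.
function indexTerms(text) {
  return new Set(
    text.split(/[^A-Za-z]+/)
      .filter((t) => t.length > 0 && !STOP_WORDS.has(t.toLowerCase()))
      .map(toyStem)
  );
}

// A term matches only if its stem equals a complete stemmed word in the text.
function matchesTerm(text, term) {
  return indexTerms(text).has(toyStem(term));
}

console.log(matchesTerm("A fresh blueberry pie", "blue"));        // false
console.log(matchesTerm("A fresh blueberry pie", "blueberries")); // true
```

Because matching compares whole stems, a prefix such as blue never matches blueberry in this model, which mirrors the behavior described above.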
For information and examples on various text search patterns, see the $text query operator. For examples of text
search in aggregation pipeline, see Text Search in the Aggregation Pipeline (page 57).
Hashed Index
New in version 2.4.
Hashed indexes maintain entries with hashes of the values of the indexed field. The hashing function collapses
sub-documents and computes the hash for the entire value but does not support multi-key (i.e. arrays) indexes.
Hashed indexes support sharding a collection using a hashed shard key. Using a
hashed shard key to shard a collection ensures a more even distribution of data. See
http://docs.mongodb.org/manual/tutorial/shard-collection-with-a-hashed-shard-key
for more details.
MongoDB can use the hashed index to support equality queries, but hashed indexes do not support range queries.
You may not create compound indexes that have hashed index fields or specify a unique constraint
on a hashed index; however, you can create both a hashed index and an ascending/descending
(i.e. non-hashed) index on the same field: MongoDB will use the scalar index for range queries.
Warning: MongoDB hashed indexes truncate floating point numbers to 64-bit integers before hashing. For
example, a hashed index would store the same value for a field that held a value of 2.3, 2.2, and 2.9. To
prevent collisions, do not use a hashed index for floating point numbers that cannot be reliably converted to
64-bit integers (and then back to floating point). MongoDB hashed indexes do not support floating point values
larger than 2^53.
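The truncation described in the warning can be demonstrated in plain JavaScript. The function names below are hypothetical, the hash function itself is omitted, and only the truncation step and the 2^53 bound from the warning are modeled.

```javascript
// Upper bound from the warning above.
const MAX_SUPPORTED = 2 ** 53;

// Models only the truncation applied before hashing; the hash itself is omitted.
function hashedIndexInput(value) {
  return Math.trunc(value);
}

// A value is safe in this model if truncation does not change it
// and its magnitude stays within the supported bound.
function isSafeForHashedIndex(value) {
  return Math.abs(value) <= MAX_SUPPORTED && Math.trunc(value) === value;
}

// 2.3, 2.2, and 2.9 all truncate to 2, so they would hash to the same value.
console.log([2.3, 2.2, 2.9].map(hashedIndexInput));
console.log(isSafeForHashedIndex(2.5)); // false: truncation loses the fraction
console.log(isSafeForHashedIndex(42));  // true
```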
Create a hashed index using an operation that resembles the following:
db.active.ensureIndex( { a: "hashed" } )
This operation creates a hashed index for the active collection on the a field.
2.2 Index Properties
In addition to the numerous index types (page 8) MongoDB supports, indexes can also have various properties. The
following documents detail the index properties that you can select when building an index.
TTL Indexes (page 25) The TTL index is used for TTL collections, which expire data after a period of time.
Unique Indexes (page 26) A unique index causes MongoDB to reject all documents that contain a duplicate value
for the indexed field.
Sparse Indexes (page 26) A sparse index does not index documents that do not have the indexed field.
TTL Indexes
TTL indexes are special indexes that MongoDB can use to automatically remove documents from a collection after
a certain amount of time. This is ideal for some types of information like machine generated event data, logs, and
session information that only need to persist in a database for a limited amount of time.
Considerations
TTL indexes have the following limitations:
Compound indexes (page 11) are not supported.
The indexed field must be a date type.
If the field holds an array with multiple date values, the document will expire when
the lowest (i.e. earliest) value matches the expiration threshold.
The TTL index does not guarantee that expired data will be deleted immediately. There may be a delay between the
time a document expires and the time that MongoDB removes the document from the database.
The background task that removes expired documents runs every 60 seconds. As a result, documents may remain in a
collection after they expire but before the background task runs or completes.
The duration of the removal operation depends on the workload of your mongod instance. Therefore, expired data
may exist for some time beyond the 60 second period between runs of the background task.
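The expiry rule, including the array case, can be modeled in plain JavaScript. The function isExpired is hypothetical; its expireAfterSeconds parameter mirrors the TTL index option of the same name, and the sketch says nothing about when the background task actually deletes the document.

```javascript
// Hypothetical model of the TTL expiry rule: a document is expired once its
// indexed date (the earliest one, if the field holds an array) is at least
// expireAfterSeconds in the past.
function isExpired(fieldValue, expireAfterSeconds, now) {
  const values = Array.isArray(fieldValue) ? fieldValue : [fieldValue];
  // Only date values count; anything else is ignored.
  const times = values.filter((v) => v instanceof Date).map((v) => v.getTime());
  if (times.length === 0) return false; // no date value: the document does not expire
  const earliest = Math.min(...times);
  return earliest + expireAfterSeconds * 1000 <= now.getTime();
}

const now = new Date("2014-04-15T12:00:00Z");
const created = new Date("2014-04-15T11:00:00Z");
console.log(isExpired(created, 3600, now)); // true: exactly one hour old
console.log(isExpired(created, 7200, now)); // false: threshold not yet reached
```

An expired document in this model may still be present in the collection until the next run of the background task removes it.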
In all other respects, TTL indexes are normal indexes, and if appropriate, MongoDB can use these indexes to fulfill
arbitrary queries.
Additional Information
http://docs.mongodb.org/manual/tutorial/expire-data
Unique Indexes
A unique index causes MongoDB to reject all documents that contain a duplicate value for the indexed field. To create
a unique index on the user_id field of the members collection, use the following operation in the mongo shell:
db.members.ensureIndex( { "user_id": 1 }, { unique: true } )
By default, unique is false on MongoDB indexes.
If you use the unique constraint on a compound index (page 11), then MongoDB will enforce uniqueness on the
combination of values rather than the individual value for any or all values of the key.
If a document does not have a value for the indexed field in a unique index, the index will store a null value for this
document. Because of the unique constraint, MongoDB will only permit one document that lacks the indexed field. If
more than one document has no value for the indexed field or is missing the indexed field, the index build
will fail with a duplicate key error.
You can combine the unique constraint with the sparse index (page 26) to filter these null values from the unique index
and avoid the error.
You may not specify a unique constraint on a hashed index (page 24).
Sparse Indexes
Sparse indexes only contain entries for documents that have the indexed field, even if the index field contains a null
value. The index skips over any document that is missing the indexed field. The index is sparse because it does not
include all documents of a collection. By contrast, non-sparse indexes contain all documents in a collection, storing
null values for those documents that do not contain the indexed field.
The following example in the mongo shell creates a sparse index on the xmpp_id field of the addresses collection:
db.addresses.ensureIndex( { "xmpp_id": 1 }, { sparse: true } )
By default, sparse is false on MongoDB indexes.
Changed in version 2.6: If a sparse index results in an incomplete result set for queries and sort operations, MongoDB
will not use that index unless a hint() explicitly specifies the index. For example, the query { x: { $exists:
false } } will not use a sparse index on the x field unless explicitly hinted. See Sparse Index On A Collection
Cannot Return Complete Results (page 27) for an example that details the behavior.
For 2dsphere indexes (version 2) (page 18), MongoDB ignores the sparse flag.
Note: Do not confuse sparse indexes in MongoDB with block-level [7] indexes in other databases. Think of them as
dense indexes with a specific filter.
Tip
You can specify a sparse and unique index (page 26), that rejects documents that have duplicate values for a field, but
allows multiple documents that omit that key.
[7] http://en.wikipedia.org/wiki/Database_index#Sparse_index
Examples
Create a Sparse Index On A Collection Consider a collection scores that contains the following documents:
{ "_id" : ObjectId("523b6e32fb408eea0eec2647"), "userid" : "newbie" }
{ "_id" : ObjectId("523b6e61fb408eea0eec2648"), "userid" : "abby", "score" : 82 }
{ "_id" : ObjectId("523b6e6ffb408eea0eec2649"), "userid" : "nina", "score" : 90 }
The collection has a sparse index on the field score:
db.scores.ensureIndex( { score: 1 } , { sparse: true } )
Then, the following query on the scores collection uses the sparse index to return the documents that have the
score field less than ($lt) 90:
db.scores.find( { score: { $lt: 90 } } )
Because the document for the userid "newbie" does not contain the score field and thus does not meet the query
criteria, the query can use the sparse index to return the results:
{ "_id" : ObjectId("523b6e61fb408eea0eec2648"), "userid" : "abby", "score" : 82 }
Sparse Index On A Collection Cannot Return Complete Results Consider a collection scores that contains the
following documents:
{ "_id" : ObjectId("523b6e32fb408eea0eec2647"), "userid" : "newbie" }
{ "_id" : ObjectId("523b6e61fb408eea0eec2648"), "userid" : "abby", "score" : 82 }
{ "_id" : ObjectId("523b6e6ffb408eea0eec2649"), "userid" : "nina", "score" : 90 }
The collection has a sparse index on the field score:
db.scores.ensureIndex( { score: 1 } , { sparse: true } )
Because the document for the userid "newbie" does not contain the score field, the sparse index does not contain
an entry for that document.
Consider the following query to return all documents in the scores collection, sorted by the score field:
db.scores.find().sort( { score: -1 } )
Even though the sort is by the indexed field, MongoDB will not select the sparse index to fulfill the query in order to
return complete results:
{ "_id" : ObjectId("523b6e6ffb408eea0eec2649"), "userid" : "nina", "score" : 90 }
{ "_id" : ObjectId("523b6e61fb408eea0eec2648"), "userid" : "abby", "score" : 82 }
{ "_id" : ObjectId("523b6e32fb408eea0eec2647"), "userid" : "newbie" }
To use the sparse index, explicitly specify the index with hint():
db.scores.find().sort( { score: -1 } ).hint( { score: 1 } )
The use of the index results in the return of only those documents with the score field:
{ "_id" : ObjectId("523b6e6ffb408eea0eec2649"), "userid" : "nina", "score" : 90 }
{ "_id" : ObjectId("523b6e61fb408eea0eec2648"), "userid" : "abby", "score" : 82 }
See also:
explain() and http://docs.mongodb.org/manual/tutorial/analyze-query-plan
Sparse Index with Unique Constraint Consider a collection scores that contains the following documents:
{ "_id" : ObjectId("523b6e32fb408eea0eec2647"), "userid" : "newbie" }
{ "_id" : ObjectId("523b6e61fb408eea0eec2648"), "userid" : "abby", "score" : 82 }
{ "_id" : ObjectId("523b6e6ffb408eea0eec2649"), "userid" : "nina", "score" : 90 }
You could create an index with a unique constraint (page 26) and sparse filter on the score field using the following
operation:
db.scores.ensureIndex( { score: 1 } , { sparse: true, unique: true } )
This index would permit the insertion of documents that had unique values for the score field or did not include a
score field. Consider the following insert operation:
db.scores.insert( { "userid": "AAAAAAA", "score": 43 } )
db.scores.insert( { "userid": "BBBBBBB", "score": 34 } )
db.scores.insert( { "userid": "CCCCCCC" } )
db.scores.insert( { "userid": "DDDDDDD" } )
However, the index would not permit the addition of the following documents since documents already exist with
score values of 82 and 90:
db.scores.insert( { "userid": "AAAAAAA", "score": 82 } )
db.scores.insert( { "userid": "BBBBBBB", "score": 90 } )
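The interaction between the sparse and unique options can be modeled with a small in-memory sketch in plain JavaScript. The ModelIndex class is hypothetical and only mirrors the rules described in this section: a sparse index skips documents without the field, a non-sparse index stores null for them, and a unique index refuses a key that is already present.

```javascript
// Hypothetical in-memory model of a single-field index with the sparse and
// unique options. It illustrates the rules above, not MongoDB's implementation.
class ModelIndex {
  constructor(field, { sparse = false, unique = false } = {}) {
    this.field = field;
    this.sparse = sparse;
    this.unique = unique;
    this.keys = []; // one entry per indexed document
  }

  // Returns true if the document is accepted, false on a duplicate key.
  insert(doc) {
    const hasField = Object.prototype.hasOwnProperty.call(doc, this.field);
    // Sparse: a document without the field gets no index entry at all.
    if (!hasField && this.sparse) return true;
    // Non-sparse: a missing field is indexed as null.
    const key = hasField ? doc[this.field] : null;
    if (this.unique && this.keys.includes(key)) return false;
    this.keys.push(key);
    return true;
  }
}

const index = new ModelIndex("score", { sparse: true, unique: true });
index.insert({ userid: "newbie" });
index.insert({ userid: "abby", score: 82 });
index.insert({ userid: "nina", score: 90 });

console.log(index.insert({ userid: "CCCCCCC" }));            // true: no score field
console.log(index.insert({ userid: "DDDDDDD" }));            // true: no score field
console.log(index.insert({ userid: "AAAAAAA", score: 82 })); // false: 82 already indexed
```

Constructing the same model with sparse set to false makes the second document without a score field fail, because both documents would be indexed under null.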
2.3 Index Creation
MongoDB provides several options that only affect the creation of the index. Specify these options in a document as
the second argument to the db.collection.ensureIndex() method. This section describes the uses of these
creation options and their behavior.
Related
Some options that you can specify to ensureIndex() control the properties of the index (page 25) and are not
index creation options. For example, the unique (page 26) option affects the behavior of the index after creation.
For a detailed description of MongoDB's index types and properties, see Index Types (page 8) and Index Properties
(page 25).
Background Construction
By default, creating an index blocks all other operations on a database. When building an index on a collection, the
database that holds the collection is unavailable for read or write operations until the index build completes. Any
operation that requires a read or write lock on all databases (e.g. listDatabases) will wait for the foreground index
build to complete.
For potentially long running index building operations, consider the background option so that the MongoDB
database remains available during the index building operation. For example, to create an index in the background on
the zipcode field of the people collection, issue the following:
db.people.ensureIndex( { zipcode: 1}, {background: true} )
By default, background is false for building MongoDB indexes.
You can combine the background option with other options, as in the following:
db.people.ensureIndex( { zipcode: 1}, {background: true, sparse: true } )
Behavior
As of MongoDB version 2.4, a mongod instance can build more than one index in the background concurrently.
Changed in version 2.4: Before 2.4, a mongod instance could only build one background index per database at a time.
Changed in version 2.2: Before 2.2, a single mongod instance could only build one index at a time.
Background indexing operations run in the background so that other database operations can run while creating the
index. However, the mongo shell session or connection where you are creating the index will block until the index
build is complete. To continue issuing commands to the database, open another connection or mongo instance.
Queries will not use partially-built indexes: the index will only be usable once the index build is complete.
Note: If MongoDB is building an index in the background, you cannot perform other administrative
operations involving that collection, including running repairDatabase, dropping the collection (i.e.
db.collection.drop()), and running compact. These operations will return an error during background
index builds.
Performance
The background index operation uses an incremental approach that is slower than the normal foreground index
builds. If the index is larger than the available RAM, then the incremental process can take much longer than the
foreground build.
If your application includes ensureIndex() operations, and an index does not exist because of other operational
concerns, building the index can have a severe impact on the performance of the database.
To avoid performance issues, make sure that your application checks for the indexes at start up using the
getIndexes() method or the equivalent method for your driver [8] and terminates if the proper indexes do not
exist. Always build indexes in production instances using separate application code, during designated maintenance
windows.
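A start-up check of this kind could look like the following sketch. It assumes the array of index documents has the shape returned by getIndexes(); the function name missingIndexes and the sample data are hypothetical.

```javascript
// Returns the required key patterns that have no matching index.
// `existing` has the shape returned by db.collection.getIndexes().
// missingIndexes is a hypothetical helper, not part of the shell.
function missingIndexes(existing, required) {
  var have = existing.map(function (ix) { return JSON.stringify(ix.key); });
  return required.filter(function (key) {
    return have.indexOf(JSON.stringify(key)) === -1;
  });
}

// Sample data standing in for the output of db.people.getIndexes().
var existing = [
  { v: 1, key: { _id: 1 }, ns: "test.people", name: "_id_" },
  { v: 1, key: { zipcode: 1 }, ns: "test.people", name: "zipcode_1" }
];
var missing = missingIndexes(existing, [ { zipcode: 1 }, { name: 1, zipcode: 1 } ]);
// An application could report `missing` and exit here rather than
// calling ensureIndex() during normal operation.
```

The comparison is deliberately strict about key order, since { name: 1, zipcode: 1 } and { zipcode: 1, name: 1 } are different indexes.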
Building Indexes on Secondaries
Changed in version 2.6: Secondary members can now build indexes in the background. Previously all index builds on
secondaries were in the foreground.
Background index operations on replica set secondaries begin after the primary completes building the index. If
MongoDB builds an index in the background on the primary, the secondaries will then build that index in the
background.
To build large indexes on secondaries the best approach is to restart one secondary at a time in standalone mode and
build the index. After building the index, restart as a member of the replica set, allow it to catch up with the other
members of the set, and then build the index on the next secondary. When all the secondaries have the new index, step
down the primary, restart it as a standalone, and build the index on the former primary.
The amount of time required to build the index on a secondary must be within the window of the oplog, so that the
secondary can catch up with the primary.
[8] http://api.mongodb.org/
Indexes on secondary members in recovering mode are always built in the foreground to allow them to catch up as
soon as possible.
See Build Indexes on Replica Sets (page 37) for a complete procedure for building indexes on secondaries.
Drop Duplicates
MongoDB cannot create a unique index (page 26) on a field that has duplicate values. To force the creation of a unique
index, you can specify the dropDups option, which will only index the first occurrence of a value for the key, and
delete all subsequent documents that have the same value.
Important: As in all unique indexes, if a document does not have the indexed field, MongoDB will include it in the
index with a null value.
If subsequent documents do not have the indexed field, and you have set {dropDups: true}, MongoDB will remove
these documents from the collection when creating the index. If you combine dropDups with the sparse (page 26)
option, the index will only include documents that have the field, and the documents without the field
will remain in the database.
To create a unique index that drops duplicates on the username field of the accounts collection, use a command
in the following form:
db.accounts.ensureIndex( { username: 1 }, { unique: true, dropDups: true } )
Warning: Specifying { dropDups: true } will delete data from your database. Use with extreme caution.
By default, dropDups is false.
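To make the effect of dropDups concrete, the sketch below models which documents would survive a non-sparse { unique: true, dropDups: true } build. It is an illustration under stated assumptions: array order stands in for the order in which the build encounters documents, and simulateDropDups is a hypothetical helper.

```javascript
// Models which documents a { unique: true, dropDups: true } build keeps.
// A document without the field is treated as having the value null, so only
// the first such document survives. simulateDropDups is a hypothetical helper.
function simulateDropDups(docs, field) {
  var seen = {};
  return docs.filter(function (doc) {
    var value = doc.hasOwnProperty(field) ? doc[field] : null;
    var key = JSON.stringify(value);
    if (seen.hasOwnProperty(key)) {
      return false;  // duplicate value: the document would be deleted
    }
    seen[key] = true;
    return true;
  });
}

var accounts = [
  { _id: 1, username: "ann" },
  { _id: 2, username: "bob" },
  { _id: 3, username: "ann" },  // duplicate of _id 1
  { _id: 4 },                   // first document without username
  { _id: 5 }                    // second document without username
];
var keptIds = simulateDropDups(accounts, "username").map(function (doc) {
  return doc._id;
});
```

Documents 3 and 5 are dropped in this model: one repeats a username, the other is the second document to index as null.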
Index Names
The default name for an index is the concatenation of the indexed keys and each key's direction in the index, 1 or -1.
Example
Issue the following command to create an index on item and quantity:
db.products.ensureIndex( { item: 1, quantity: -1 } )
The resulting index is named: item_1_quantity_-1.
Optionally, you can specify a name for an index instead of using the default name.
Example
Issue the following command to create an index on item and quantity and specify inventory as the index
name:
db.products.ensureIndex( { item: 1, quantity: -1 } , { name: "inventory" } )
The resulting index has the name inventory.
To view the name of an index, use the getIndexes() method.
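The naming rule can be reproduced with a few lines of JavaScript. This is a sketch of the rule as described here, with a hypothetical function name; it does not cover the _id index, which the examples in this document show with the name _id_.

```javascript
// Builds the default index name from a key specification by joining each
// field and its direction with underscores. defaultIndexName is a
// hypothetical helper written for illustration.
function defaultIndexName(keys) {
  return Object.keys(keys).map(function (field) {
    return field + "_" + keys[field];
  }).join("_");
}

var compoundName = defaultIndexName({ item: 1, quantity: -1 });
var singleName = defaultIndexName({ name: 1 });
```

Here compoundName is "item_1_quantity_-1" and singleName is "name_1".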
2.4 Index Intersection
New in version 2.6.
MongoDB can use the intersection of multiple indexes to fulfill queries. [9] In general, each index intersection involves
two indexes; however, MongoDB can employ multiple/nested index intersections to resolve a query.
To illustrate index intersection, consider a collection orders that has the following indexes:
{ qty: 1 }
{ item: 1 }
MongoDB can use the intersection of the two indexes to support the following query:
db.orders.find( { item: "abc123", qty: { $gt: 15 } } )
For query plans that use index intersection, explain() returns the value Complex Plan in the cursor field.
Index Prefix Intersection
With index intersection, MongoDB can use an intersection of either the entire index or the index prefix. An index
prefix is a subset of a compound index, consisting of one or more keys starting from the beginning of the index.
Consider a collection orders with the following indexes:
{ qty: 1 }
{ status: 1, ord_date: -1 }
To fulfill the following query, which specifies a condition on both the qty field and the status field, MongoDB can
use the intersection of the two indexes:
db.orders.find( { qty: { $gt: 10 } , status: "A" } )
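As a small illustration of what counts as an index prefix, the sketch below lists the prefixes of a compound index specification. The function name indexPrefixes is hypothetical, and the full key pattern is deliberately left out of the result since it is the entire index rather than a prefix.

```javascript
// Lists the prefixes of a compound index: the leading one, two, ... keys,
// stopping before the full key pattern. indexPrefixes is a hypothetical helper.
function indexPrefixes(keys) {
  var fields = Object.keys(keys);
  var prefixes = [];
  for (var n = 1; n < fields.length; n++) {
    var prefix = {};
    fields.slice(0, n).forEach(function (field) {
      prefix[field] = keys[field];
    });
    prefixes.push(prefix);
  }
  return prefixes;
}

var twoKey = indexPrefixes({ status: 1, ord_date: -1 });
var threeKey = indexPrefixes({ item: 1, category: 1, price: 1 });
```

For { status: 1, ord_date: -1 } the only prefix is { status: 1 }, which is the part of that index the query above can intersect with { qty: 1 }.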
Index Intersection and Compound Indexes
Index intersection does not eliminate the need for creating compound indexes (page 11). However, because both the
list order (i.e. the order in which the keys are listed in the index) and the sort order (i.e. ascending or descending)
matter in compound indexes (page 11), a compound index may not support a query condition that does not include the
index prefix keys (page 13) or that specifies a different sort order.
For example, if a collection orders has the following compound index, with the status field listed before the
ord_date field:
{ status: 1, ord_date: -1 }
The compound index can support the following queries:
db.orders.find( { status: { $in: ["A", "P" ] } } )
db.orders.find(
{
ord_date: { $gt: new Date("2014-02-01") },
status: {$in:[ "P", "A" ] }
}
)
But not the following two queries:
db.orders.find( { ord_date: { $gt: new Date("2014-02-01") } } )
db.orders.find( { } ).sort( { ord_date: 1 } )
[9] In previous versions, MongoDB could use only a single index to fulfill most queries. The exception to this is queries with $or clauses, which
could use a single index for each $or clause.
However, if the collection has two separate indexes:
{ status: 1 }
{ ord_date: -1 }
The two indexes can, either individually or through index intersection, support all four aforementioned queries.
The choice between creating compound indexes that support your queries or relying on index intersection depends on
the specifics of your system.
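The prefix rule for query conditions can be expressed as a rough check. The sketch below only tests whether a query includes the first key of a compound index; it ignores sort order and everything else the query planner considers, and canUseCompoundIndex is a hypothetical name.

```javascript
// Rough check: a query condition can use a compound index only if it
// includes the index's leading key. Sort order is deliberately ignored.
// canUseCompoundIndex is a hypothetical helper, not the planner's logic.
function canUseCompoundIndex(indexKeys, queryFields) {
  var leadingKey = Object.keys(indexKeys)[0];
  return queryFields.indexOf(leadingKey) !== -1;
}

var compound = { status: 1, ord_date: -1 };
var checks = [
  canUseCompoundIndex(compound, [ "status" ]),
  canUseCompoundIndex(compound, [ "ord_date", "status" ]),
  canUseCompoundIndex(compound, [ "ord_date" ])
];
```

The first two checks pass and the third fails, matching the find() examples above that do and do not include the status field.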
See also:
compound indexes (page 11), Create Compound Indexes to Support Several Different Queries (page 60)
Index Intersection and Sort
Index intersection does not apply when the sort() operation requires an index completely separate from the query
predicate.
For example, the orders collection has the following indexes:
{ qty: 1 }
{ status: 1, ord_date: -1 }
{ status: 1 }
{ ord_date: -1 }
MongoDB cannot use index intersection for the following query with sort:
db.orders.find( { qty: { $gt: 10 } } ).sort( { status: 1 } )
That is, MongoDB does not use the { qty: 1 } index for the query, and the separate { status: 1 } or the
{ status: 1, ord_date: -1 } index for the sort.
However, MongoDB can use index intersection for the following query with sort since the index { status: 1,
ord_date: -1 } can fulfill part of the query predicate.
db.orders.find( { qty: { $gt: 10 } , status: "A" } ).sort( { ord_date: -1 } )
3 Indexing Tutorials
Indexes allow MongoDB to process and fulfill queries quickly by creating small and efficient representations of the
documents in a collection.
The documents in this section outline specific tasks related to building and maintaining indexes for data in MongoDB
collections and discuss strategies and practical approaches. For a conceptual overview of MongoDB indexing, see
the Index Concepts (page 8) document.
Index Creation Tutorials (page 33) Create and configure different types of indexes for different purposes.
Index Management Tutorials (page 40) Monitor and assess index performance and rebuild indexes as needed.
Geospatial Index Tutorials (page 43) Create indexes that support data stored as GeoJSON objects and legacy
coordinate pairs.
Text Search Tutorials (page 52) Build and configure indexes that support full-text searches.
Indexing Strategies (page 59) The factors that affect index performance and practical approaches to indexing in
MongoDB.
3.1 Index Creation Tutorials
Instructions for creating and configuring indexes in MongoDB and building indexes on replica sets and sharded
clusters.
Create an Index (page 33) Build an index for any field on a collection.
Create a Compound Index (page 34) Build an index of multiple fields on a collection.
Create a Unique Index (page 35) Build an index that enforces unique values for the indexed field or fields.
Create a Sparse Index (page 36) Build an index that omits references to documents that do not include the indexed
field. This saves space when indexing fields that are present in only some documents.
Create a Hashed Index (page 36) Compute a hash of the value of a field in a collection and index the hashed value.
These indexes permit equality queries and may be suitable shard keys for some collections.
Build Indexes on Replica Sets (page 37) To build indexes on a replica set, you build the indexes separately on the
primary and the secondaries, as described here.
Build Indexes in the Background (page 39) Background index construction allows read and write operations to
continue while building the index, but takes longer to complete and results in a larger index.
Build Old Style Indexes (page 39) A {v : 0} index is necessary if you need to roll back from MongoDB version
2.0 (or later) to MongoDB version 1.8.
Create an Index
Indexes allow MongoDB to process and fulfill queries quickly by creating small and efficient representations of the
documents in a collection. MongoDB creates an index on the _id field of every collection by default, but allows users
to create indexes on any field of the documents in any collection.
This tutorial describes how to create an index on a single field. MongoDB also supports compound indexes (page 11),
which are indexes on multiple fields. See Create a Compound Index (page 34) for instructions on building compound
indexes.
Create an Index on a Single Field
To create an index, use ensureIndex() or a similar method from your driver [10]. For example, the following creates
an index on the phone-number field of the people collection:
db.people.ensureIndex( { "phone-number": 1 } )
ensureIndex() only creates an index if an index of the same specification does not already exist.
The index supports and optimizes the performance of queries that select on this field. For queries that cannot use an
index, MongoDB must scan all documents in a collection for documents that match the query.
Tip
The value of the field in the index specification describes the kind of index for that field. For example, a value of 1
specifies an index that orders items in ascending order. A value of -1 specifies an index that orders items in descending
order.
[10] http://api.mongodb.org/
Examples
If you create an index on the user_id field in the records collection, the index will support the following
query:
db.records.find( { user_id: 2 } )
However, the following query on the profile_url field is not supported by this index:
db.records.find( { profile_url: 2 } )
Additional Considerations
If your collection holds a large amount of data, and your application needs to be able to access the data while building
the index, consider building the index in the background, as described in Background Construction (page 28). To build
indexes on replica sets, see the Build Indexes on Replica Sets (page 37) section for more information.
Note: To build or rebuild indexes for a replica set see Build Indexes on Replica Sets (page 37).
Some drivers may specify indexes using NumberLong(1) rather than 1 as the specification. This does not have any
effect on the resulting index.
See also:
Create a Compound Index (page 34), Indexing Tutorials (page 32) and Index Concepts (page 8) for more information.
Create a Compound Index
Indexes allow MongoDB to process and fulfill queries quickly by creating small and efficient representations of the
documents in a collection. MongoDB supports indexes that include content on a single field, as well as compound
indexes (page 11) that include content from multiple fields. Continue reading for instructions and examples of building
a compound index.
Build a Compound Index
To create a compound index (page 11) use an operation that resembles the following prototype:
db.collection.ensureIndex( { a: 1, b: 1, c: 1 } )
Example
The following operation will create an index on the item, category, and price fields of the products
collection:
db.products.ensureIndex( { item: 1, category: 1, price: 1 } )
Additional Considerations
If your collection holds a large amount of data, and your application needs to be able to access the data while building
the index, consider building the index in the background, as described in Background Construction (page 28). To build
indexes on replica sets, see the Build Indexes on Replica Sets (page 37) section for more information.
Note: To build or rebuild indexes for a replica set see Build Indexes on Replica Sets (page 37).
Some drivers may specify indexes using NumberLong(1) rather than 1 as the specification. This does not have any
effect on the resulting index.
Tip
The value of the field in the index specification describes the kind of index for that field. For example, a value of 1
specifies an index that orders items in ascending order. A value of -1 specifies an index that orders items in descending
order.
See also:
Create an Index (page 33), Indexing Tutorials (page 32) and Index Concepts (page 8) for more information.
Create a Unique Index
MongoDB allows you to specify a unique constraint (page 26) on an index. These constraints prevent applications
from inserting documents that have duplicate values for the indexed fields. Additionally, if you want to create an index
on a collection that has existing data that might have duplicate values for the indexed field, you may choose to combine
unique enforcement with duplicate dropping (page 30).
Unique Indexes
To create a unique index (page 26), consider the following prototype:
db.collection.ensureIndex( { a: 1 }, { unique: true } )
For example, you may want to create a unique index on the "tax-id" field of the accounts collection to prevent
storing multiple account records for the same legal entity:
db.accounts.ensureIndex( { "tax-id": 1 }, { unique: true } )
The _id index (page 10) is a unique index. In some situations you may consider using the _id field itself for this kind
of data rather than using a unique index on another field.
In many situations you will want to combine the unique constraint with the sparse option. When MongoDB
indexes a field, if a document does not have a value for a field, the index entry for that item will be null. Since
unique indexes cannot have duplicate values for a field, without the sparse option, MongoDB will reject the second
document and all subsequent documents without the indexed field. Consider the following prototype.
db.collection.ensureIndex( { a: 1 }, { unique: true, sparse: true } )
You can also enforce a unique constraint on compound indexes (page 11), as in the following prototype:
db.collection.ensureIndex( { a: 1, b: 1 }, { unique: true } )
These indexes enforce uniqueness for the combination of index keys and not for either key individually.
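The difference between per-key and per-combination uniqueness can be seen in a short model. This is an illustrative sketch, not MongoDB's implementation; the helper name makeCompoundUniqueIndex is hypothetical, and a missing field is treated as null.

```javascript
// Model of a unique compound index: the combination of values must be
// unique, while individual values may repeat. makeCompoundUniqueIndex is a
// hypothetical helper; a missing field is treated as null.
function makeCompoundUniqueIndex(fields) {
  var seen = {};
  return {
    insert: function (doc) {
      var key = JSON.stringify(fields.map(function (field) {
        return doc.hasOwnProperty(field) ? doc[field] : null;
      }));
      if (seen.hasOwnProperty(key)) {
        return false;  // this combination already exists
      }
      seen[key] = true;
      return true;
    }
  };
}

var compoundIndex = makeCompoundUniqueIndex([ "a", "b" ]);
var outcomes = [
  { a: 1, b: 1 },
  { a: 1, b: 2 },  // a repeats, but the pair is new
  { a: 2, b: 1 },  // b repeats, but the pair is new
  { a: 1, b: 1 }   // the pair repeats, rejected
].map(function (doc) { return compoundIndex.insert(doc); });
```

Only the final document is rejected, because it is the only one that repeats a full (a, b) combination.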
Drop Duplicates
To force the creation of a unique index (page 26) on a collection with duplicate values in the field you are indexing,
you can use the dropDups option. This will force MongoDB to create a unique index by deleting documents with
duplicate values when building the index. Consider the following prototype invocation of ensureIndex():
db.collection.ensureIndex( { a: 1 }, { unique: true, dropDups: true } )
See the full documentation of duplicate dropping (page 30) for more information.
Warning: Specifying { dropDups: true } may delete data from your database. Use with extreme caution.
Refer to the ensureIndex() documentation for additional index creation options.
Create a Sparse Index
Sparse indexes are like non-sparse indexes, except that they omit references to documents that do not include the
indexed field. For fields that are only present in some documents, sparse indexes may provide significant space
savings. See Sparse Indexes (page 26) for more information about sparse indexes and their use.
See also:
Index Concepts (page 8) and Indexing Tutorials (page 32) for more information.
Prototype
To create a sparse index (page 26) on a field, use an operation that resembles the following prototype:
db.collection.ensureIndex( { a: 1 }, { sparse: true } )
Example
The following operation creates a sparse index on the users collection that only includes a document in the index if
the twitter_name field exists in a document.
db.users.ensureIndex( { twitter_name: 1 }, { sparse: true } )
The index excludes all documents that do not include the twitter_name field.
Considerations
Note: Sparse indexes can affect the results returned by the query, particularly with respect to sorts on fields not
included in the index. See the sparse index (page 26) section for more information.
Create a Hashed Index
New in version 2.4.
Hashed indexes (page 24) compute a hash of the value of a field in a collection and index the hashed value. These
indexes permit equality queries and may be suitable shard keys for some collections.
Tip
MongoDB automatically computes the hashes when resolving queries using hashed indexes. Applications do not need
to compute hashes.
See sharding-hashed-sharding for more information about hashed indexes in sharded clusters, as well as Index Concepts
(page 8) and Indexing Tutorials (page 32) for more information about indexes.
Procedure
To create a hashed index (page 24), specify hashed as the value of the index key, as in the following example:
Example
Specify a hashed index on _id
db.collection.ensureIndex( { _id: "hashed" } )
Considerations
MongoDB supports hashed indexes of any single field. The hashing function collapses sub-documents and computes
the hash for the entire value, but does not support multi-key (i.e. arrays) indexes.
You may not create compound indexes that have hashed index fields.
Build Indexes on Replica Sets
In versions of MongoDB before 2.6, background index creation operations (page 28) become foreground indexing
operations on secondary members of replica sets. As of version 2.6, secondaries can build indexes in the background
(see Building Indexes on Secondaries, page 29). A foreground index build blocks all replication and read operations
on a secondary while it builds the index.
Secondaries will begin building indexes after the primary finishes building the index. In sharded clusters, the mongos
will send ensureIndex() to the primary members of the replica set for each shard, which then replicate to the
secondaries after the primary finishes building the index.
To minimize the impact of building an index on your replica set, use the following procedure to build indexes on
secondaries:
See
Indexing Tutorials (page 32) and Index Concepts (page 8) for more information.
Considerations
Ensure that your oplog is large enough to permit the indexing or re-indexing operation to complete without
falling too far behind to catch up. See the oplog sizing documentation for additional information.
This procedure does take one member out of the replica set at a time. However, this procedure will only affect
one member of the set at a time rather than all secondaries at the same time.
Do not use this procedure when building a unique index (page 26) with the dropDups option.
Procedure
Note: If you need to build an index in a sharded cluster, repeat the following procedure for each replica set that
provides each shard.
Stop One Secondary
Stop the mongod process on one secondary. Restart the mongod process without the
--replSet option and running on a different port. [11] This instance is now in standalone mode.
For example, if your mongod normally runs on the default port of 27017 with the --replSet option, you
would use the following invocation:
mongod --port 47017
Build the Index
Create the new index using the ensureIndex() method in the mongo shell, or a comparable method in
your driver. This operation will create or rebuild the index on this mongod instance.
For example, to create an ascending index on the username eld of the records collection, use the following
mongo shell operation:
db.records.ensureIndex( { username: 1 } )
See also:
Create an Index (page 33) and Create a Compound Index (page 34) for more information.
Restart the Program mongod
When the index build completes, start the mongod instance with the --replSet
option on its usual port:
mongod --port 27017 --replSet rs0
Modify the port number (e.g. 27017) or the replica set name (e.g. rs0) as needed.
Allow replication to catch up on this member.
Build Indexes on all Secondaries
For each secondary in the set, build an index according to the following steps:
1. Stop One Secondary (page 38)
2. Build the Index (page 38)
3. Restart the Program mongod (page 38)
Build the Index on the Primary
To build an index on the primary you can either:
1. Build the index in the background (page 39) on the primary.
2. Step down the primary using the rs.stepDown() method in the mongo shell to cause the current primary to
gracefully become a secondary and allow the set to elect another member as primary.
Then repeat the index building procedure, listed below, to build the index on the former primary:
(a) Stop One Secondary (page 38)
(b) Build the Index (page 38)
(c) Restart the Program mongod (page 38)
Building the index in the background takes longer than the foreground index build and results in a less compact index
structure. Additionally, the background index build may impact write performance on the primary. However, building
the index in the background allows the set to remain continuously available for write operations while MongoDB builds
the index.
[11] By running the mongod on a different port, you ensure that the other members of the replica set and all clients will not contact the member
while you are building the index.
Build Indexes in the Background
By default, MongoDB builds indexes in the foreground, which prevents all read and write operations to the database
while the index builds. Also, no operation that requires a read or write lock on all databases (e.g. listDatabases) can
occur during a foreground index build.
Background index construction (page 28) allows read and write operations to continue while building the index.
See also:
Index Concepts (page 8) and Indexing Tutorials (page 32) for more information.
Considerations
Background index builds take longer to complete and result in an index that is initially larger, or less compact, than an
index built in the foreground. Over time, the compactness of indexes built in the background will approach foreground-
built indexes.
After MongoDB finishes building the index, background-built indexes are functionally identical to any other index.
Procedure
To create an index in the background, add the background argument to the ensureIndex() operation, as in the
following example:
db.collection.ensureIndex( { a: 1 }, { background: true } )
See the section on background index construction (page 28) for more information about these indexes and their
implications.
Build Old Style Indexes
Important: Use this procedure only if you must have indexes that are compatible with a version of MongoDB earlier
than 2.0.
MongoDB version 2.0 introduced the {v:1} index format. MongoDB versions 2.0 and later support both the {v:1}
format and the earlier {v:0} format.
MongoDB versions prior to 2.0, however, support only the {v:0} format. If you need to roll back MongoDB to a
version prior to 2.0, you must drop and re-create your indexes.
To build pre-2.0 indexes, use the dropIndexes() and ensureIndex() methods. You cannot simply reindex the
collection. When you reindex on versions that only support {v:0} indexes, the v fields in the index definition still
hold values of 1, even though the indexes would now use the {v:0} format. If you were to upgrade again to version
2.0 or later, these indexes would not work.
Example
Suppose you rolled back from MongoDB 2.0 to MongoDB 1.8, and suppose you had the following index on the
items collection:
{ "v" : 1, "key" : { "name" : 1 }, "ns" : "mydb.items", "name" : "name_1" }
The v field tells you the index is a {v:1} index, which is incompatible with version 1.8.
To drop the index, issue the following command:
db.items.dropIndex( { name : 1 } )
To recreate the index as a {v:0} index, issue the following command:
db.items.ensureIndex( { name : 1 } , { v : 0 } )
See also:
2.0-new-index-format.
3.2 Index Management Tutorials
Instructions for managing indexes and assessing index performance and use.
Remove Indexes (page 40) Drop an index from a collection.
Rebuild Indexes (page 41) In a single operation, drop all indexes on a collection and then rebuild them.
Manage In-Progress Index Creation (page 41) Check the status of indexing progress, or terminate an ongoing index
build.
Return a List of All Indexes (page 42) Obtain a list of all indexes on a collection or of all indexes on all collections
in a database.
Measure Index Use (page 42) Study query operations and observe index use for your database.
Remove Indexes
To remove an index from a collection use the dropIndex() method and the following procedure. If you simply
need to rebuild indexes you can use the process described in the Rebuild Indexes (page 41) document.
See also:
Indexing Tutorials (page 32) and Index Concepts (page 8) for more information about indexes and indexing operations
in MongoDB.
Operations
To remove an index, use the db.collection.dropIndex() method, as in the following example:
db.accounts.dropIndex( { "tax-id": 1 } )
This will remove the index on the "tax-id" field in the accounts collection. The shell provides the following
document after completing the operation:
{ "nIndexesWas" : 3, "ok" : 1 }
Where the value of nIndexesWas reflects the number of indexes before removing this index. You can also use the
db.collection.dropIndexes() method to remove all indexes, except for the _id index (page 10), from a collection.
These shell helpers provide wrappers around the dropIndexes database command. Your client library may
have a different or additional interface for these operations.
Rebuild Indexes
If you need to rebuild indexes for a collection you can use the db.collection.reIndex() method to rebuild all
indexes on a collection in a single operation. This operation drops all indexes, including the _id index (page 10), and
then rebuilds all indexes.
See also:
Index Concepts (page 8) and Indexing Tutorials (page 32).
Process
The operation takes the following form:
db.accounts.reIndex()
MongoDB will return the following document when the operation completes:
{
"nIndexesWas" : 2,
"msg" : "indexes dropped for collection",
"nIndexes" : 2,
"indexes" : [
{
"key" : {
"_id" : 1,
"tax-id" : 1
},
"ns" : "records.accounts",
"name" : "_id_"
}
],
"ok" : 1
}
This shell helper provides a wrapper around the reIndex database command. Your client library may have
a different or additional interface for this operation.
Additional Considerations
Note: To build or rebuild indexes for a replica set see Build Indexes on Replica Sets (page 37).
Manage In-Progress Index Creation
To see the status of the indexing processes, you can use the db.currentOp() method in the mongo shell. The
value of the query field and the msg field will indicate if the operation is an index build. The msg field also indicates
the percent of the build that is complete.
To terminate an ongoing index build, use the db.killOp() method in the mongo shell.
For more information see db.currentOp().
Changed in version 2.4: Before MongoDB 2.4, you could only terminate background index builds. After 2.4, you can
terminate any index build, including foreground index builds.
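One way to pick out index builds from the db.currentOp() result is to filter its inprog array on the msg field. The sketch below assumes that the msg text of an index build contains the words "index build"; that wording, the sample document, and the function name indexBuildOps are assumptions to verify against the output of your own deployment.

```javascript
// Filters a db.currentOp()-shaped result down to operations whose msg field
// looks like an index build. The "index build" text match is an assumption;
// check it against real output. indexBuildOps is a hypothetical helper.
function indexBuildOps(currentOpResult) {
  return currentOpResult.inprog.filter(function (op) {
    return typeof op.msg === "string" && /index build/i.test(op.msg);
  }).map(function (op) {
    return { opid: op.opid, ns: op.ns, msg: op.msg };
  });
}

// Hypothetical sample standing in for the output of db.currentOp().
var sample = {
  inprog: [
    { opid: 101, op: "query", ns: "test.people", query: { zipcode: "63000" } },
    { opid: 102, op: "insert", ns: "test.system.indexes",
      msg: "Index Build (background) 5000/20000 25%" }
  ]
};
var builds = indexBuildOps(sample);
```

In the mongo shell, the opid of each returned entry is the value to pass to db.killOp() if the build needs to be terminated.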
Return a List of All Indexes
When performing maintenance you may want to check which indexes exist on a collection. Every index on a collection
has a corresponding document in the system.indexes collection, and you can use standard queries (i.e. find())
to list the indexes, or in the mongo shell, the getIndexes() method to return a list of the indexes on a collection,
as in the following examples.
See also:
Index Concepts (page 8) and Indexing Tutorials (page 32) for more information about indexes in MongoDB and
common index management operations.
List all Indexes on a Collection
To return a list of all indexes on a collection, use the db.collection.getIndexes() method or a similar
method for your driver [12].
For example, to view all indexes on the people collection:
db.people.getIndexes()
List all Indexes for a Database
To return a list of all indexes on all collections in a database, use the following operation in the mongo shell:
db.system.indexes.find()
See system.indexes for more information about these documents.
Measure Index Use
Synopsis
Query performance is a good general indicator of index use; however, for more precise insight into index use,
MongoDB provides a number of tools that allow you to study query operations and observe index use for your database.
See also:
Index Concepts (page 8) and Indexing Tutorials (page 32) for more information.
Operations
Return Query Plan with explain() Append the explain() method to any cursor (e.g. query) to return a
document with statistics about the query process, including the index used, the number of documents scanned, and the
time the query takes to process in milliseconds.
Footnote 12: http://api.mongodb.org/
Control Index Use with hint() Append the hint() method to any cursor (e.g. query) with the index as the argument to
force MongoDB to use a specific index to fulfill the query. Consider the following example:
db.people.find( { name: "John Doe", zipcode: { $gt: "63000" } } ).hint( { zipcode: 1 } )
You can use hint() and explain() in conjunction with each other to compare the effectiveness of a specific
index. Specify the $natural operator to the hint() method to prevent MongoDB from using any index:
db.people.find( { name: "John Doe", zipcode: { $gt: "63000" } } ).hint( { $natural: 1 } )
Instance Index Use Reporting MongoDB provides a number of metrics of index use and operation that you may
want to consider when analyzing index use for your database:
In the output of serverStatus:
indexCounters
scanned
scanAndOrder
In the output of collStats:
totalIndexSize
indexSizes
In the output of dbStats:
dbStats.indexes
dbStats.indexSize
3.3 Geospatial Index Tutorials
Instructions for creating and querying 2d, 2dsphere, and haystack indexes.
Create a 2dsphere Index (page 43) A 2dsphere index supports data stored as both GeoJSON objects and as legacy
coordinate pairs.
Query a 2dsphere Index (page 44) Search for locations within, near, or intersected by a GeoJSON shape, or within a
circle as defined by coordinate points on a sphere.
Create a 2d Index (page 46) Create a 2d index to support queries on data stored as legacy coordinate pairs.
Query a 2d Index (page 47) Search for locations using legacy coordinate pairs.
Create a Haystack Index (page 49) A haystack index is optimized to return results over small areas. For queries that
use spherical geometry, a 2dsphere index is a better option.
Query a Haystack Index (page 49) Search based on location and non-location data within a small area.
Calculate Distance Using Spherical Geometry (page 50) Convert distances to radians and back again.
Create a 2dsphere Index
To create a geospatial index for GeoJSON-formatted data, use the ensureIndex() method and set the value of the
location field for your collection to 2dsphere. A 2dsphere index can be a compound index (page 11) and does
not require the location field to be the first field indexed.
To create the index use the following syntax:
db.points.ensureIndex( { <location field> : "2dsphere" } )
The following are four example commands for creating a 2dsphere index:
db.points.ensureIndex( { loc : "2dsphere" } )
db.points.ensureIndex( { loc : "2dsphere" , type : 1 } )
db.points.ensureIndex( { rating : 1 , loc : "2dsphere" } )
db.points.ensureIndex( { loc : "2dsphere" , rating : 1 , category : -1 } )
The first example creates a simple geospatial index on the location field loc. The second example creates a compound
index where the second field contains non-location data. The third example creates an index where the location field
is not the primary field: the location field does not have to be the first field in a 2dsphere index. The fourth example
creates a compound index with three fields. You can include as many fields as you like in a 2dsphere index.
Query a 2dsphere Index
The following sections describe queries supported by the 2dsphere index. For an overview of recommended geospatial
queries, see geospatial-query-compatibility-chart.
GeoJSON Objects Bounded by a Polygon
The $geoWithin operator queries for location data found within a GeoJSON polygon. Your location data must be
stored in GeoJSON format. Use the following syntax:
db.<collection>.find( { <location field> :
{ $geoWithin :
{ $geometry :
{ type : "Polygon" ,
coordinates : [ <coordinates> ]
} } } } )
The following example selects all points and shapes that exist entirely within a GeoJSON polygon:
db.places.find( { loc :
{ $geoWithin :
{ $geometry :
{ type : "Polygon" ,
coordinates : [ [
[ 0 , 0 ] ,
[ 3 , 6 ] ,
[ 6 , 1 ] ,
[ 0 , 0 ]
] ]
} } } } )
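The polygon in the example above repeats its first coordinate pair as its last, because a GeoJSON polygon ring must be closed. As a rough sketch in plain JavaScript, the $geometry document could be assembled as follows. The helper name closedPolygon is hypothetical and is not part of MongoDB or the mongo shell.

```javascript
// Hypothetical helper (not a MongoDB API): builds a GeoJSON Polygon whose
// single ring is closed, i.e. its last position repeats the first one.
function closedPolygon(points) {
  if (points.length < 3) {
    throw new Error("a polygon ring needs at least three points");
  }
  // Copy the positions so the caller's array is not modified.
  const ring = points.map(function (p) { return [p[0], p[1]]; });
  const first = ring[0];
  const last = ring[ring.length - 1];
  // Append the first position again if the ring is still open.
  if (first[0] !== last[0] || first[1] !== last[1]) {
    ring.push([first[0], first[1]]);
  }
  return { type: "Polygon", coordinates: [ring] };
}

const triangle = closedPolygon([[0, 0], [3, 6], [6, 1]]);
```

The resulting object could then be used as the value of $geometry, for example db.places.find( { loc : { $geoWithin : { $geometry : triangle } } } ).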
Intersections of GeoJSON Objects
New in version 2.4.
The $geoIntersects operator queries for locations that intersect a specified GeoJSON object. A location intersects
the object if the intersection is non-empty. This includes documents that have a shared edge.
The $geoIntersects operator uses the following syntax:
db.<collection>.find( { <location field> :
{ $geoIntersects :
{ $geometry :
{ type : "<GeoJSON object type>" ,
coordinates : [ <coordinates> ]
} } } } )
The following example uses $geoIntersects to select all indexed points and shapes that intersect with the polygon
defined by the coordinates array.
db.places.find( { loc :
{ $geoIntersects :
{ $geometry :
{ type : "Polygon" ,
coordinates: [ [
[ 0 , 0 ] ,
[ 3 , 6 ] ,
[ 6 , 1 ] ,
[ 0 , 0 ]
] ]
} } } } )
Proximity to a GeoJSON Point
Proximity queries return the points closest to the defined point and sort the results by distance. A proximity query on
GeoJSON data requires a 2dsphere index.
To query for proximity to a GeoJSON point, use either the $near operator or geoNear command. Distance is in
meters.
The $near uses the following syntax:
db.<collection>.find( { <location field> :
{ $near :
{ $geometry :
{ type : "Point" ,
coordinates : [ <longitude> , <latitude> ] } ,
$maxDistance : <distance in meters>
} } } )
For examples, see $near.
The geoNear command uses the following syntax:
db.runCommand( { geoNear : <collection> ,
near : { type : "Point" ,
coordinates: [ <longitude>, <latitude> ] } ,
spherical : true } )
The geoNear command offers more options and returns more information than does the $near operator. To run the
command, see geoNear.
Points within a Circle Defined on a Sphere
To select all grid coordinates in a spherical cap on a sphere, use $geoWithin with the $centerSphere operator.
Specify an array that contains:
The grid coordinates of the circle's center point
The circle's radius measured in radians. To calculate radians, see Calculate Distance Using Spherical Geometry
(page 50).
Use the following syntax:
db.<collection>.find( { <location field> :
{ $geoWithin :
{ $centerSphere :
[ [ <x>, <y> ] , <radius> ] }
} } )
The following example queries grid coordinates and returns all documents within a 10 mile radius of longitude 88° W
and latitude 30° N. The example converts the distance, 10 miles, to radians by dividing by the approximate radius of
the earth, 3959 miles:
db.places.find( { loc :
{ $geoWithin :
{ $centerSphere :
[ [ -88 , 30 ] , 10 / 3959 ]
} } } )
Create a 2d Index
To build a geospatial 2d index, use the ensureIndex() method and specify 2d. Use the following syntax:
db.<collection>.ensureIndex( { <location field> : "2d" ,
<additional field> : <value> } ,
{ <index-specification options> } )
The 2d index uses the following optional index-specification options:
{ min : <lower bound> , max : <upper bound> ,
bits : <bit precision> }
Define Location Range for a 2d Index
By default, a 2d index assumes longitude and latitude and has boundaries of -180 inclusive and 180 non-inclusive
(i.e. [ -180 , 180 )). If documents contain coordinate data outside of the specified range, MongoDB returns an
error.
Important: The default boundaries allow applications to insert documents with invalid latitudes greater than 90 or
less than -90. The behavior of geospatial queries with such invalid points is not defined.
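The half-open range described above can be illustrated with a small client-side check. This is only a sketch in plain JavaScript: the function names are hypothetical, and MongoDB performs its own validation when documents are inserted.

```javascript
// Illustrative check only (hypothetical helpers, not a MongoDB API).
// Mirrors the default 2d bounds described above: [ -180 , 180 ).
function inLocationRange(value, min, max) {
  const lower = (min === undefined) ? -180 : min;
  const upper = (max === undefined) ? 180 : max;
  // Lower bound is inclusive, upper bound is exclusive.
  return value >= lower && value < upper;
}

// A coordinate pair passes when both members fall inside the range.
function pairInRange(pair, min, max) {
  return inLocationRange(pair[0], min, max) &&
         inLocationRange(pair[1], min, max);
}
```

Note that a pair such as [ 10 , 95 ] passes this check even though 95 is not a valid latitude, which is the situation the Important note warns about.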
On 2d indexes you can change the location range.
You can build a 2d geospatial index with a location range other than the default. Use the min and max options when
creating the index. Use the following syntax:
db.collection.ensureIndex( { <location field> : "2d" } ,
{ min : <lower bound> , max : <upper bound> } )
Define Location Precision for a 2d Index
By default, a 2d index on legacy coordinate pairs uses 26 bits of precision, which is roughly equivalent to 2 feet or 60
centimeters of precision using the default range of -180 to 180. Precision is measured by the size in bits of the geohash
values used to store location data. You can configure geospatial indexes with up to 32 bits of precision.
Index precision does not affect query accuracy. The actual grid coordinates are always used in the final query
processing. Advantages to lower precision are a lower processing overhead for insert operations and use of less space. An
advantage to higher precision is that queries scan smaller portions of the index to return results.
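The figure of roughly 60 centimeters for 26 bits can be checked with a back-of-the-envelope calculation. The sketch below is plain JavaScript and rests on two assumptions that are not stated in this document: that each axis of the index range is divided into 2^bits equal cells, and that one degree at the equator spans about 111,320 meters. It is an estimate, not the server's geohash implementation.

```javascript
// Assumption for illustration: each axis of the index range is split into
// 2^bits equal cells. Hypothetical helper, not a MongoDB API.
function cellSizeInDegrees(min, max, bits) {
  return (max - min) / Math.pow(2, bits);
}

// Assumed approximate length of one degree at the equator, in meters.
const METERS_PER_DEGREE = 111320;

// Default range of -180 to 180 with the default 26 bits of precision.
const degrees = cellSizeInDegrees(-180, 180, 26);
const meters = degrees * METERS_PER_DEGREE; // about 0.6 meters
```

Under the same assumptions, each additional bit halves the cell size, which is why higher precision lets queries scan smaller portions of the index.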
To configure a location precision other than the default, use the bits option when creating the index. Use the following
syntax:
db.<collection>.ensureIndex( {<location field> : "<index type>"} ,
{ bits : <bit precision> } )
For information on the internals of geohash values, see Calculation of Geohash Values for 2d Indexes (page 22).
Query a 2d Index
The following sections describe queries supported by the 2d index. For an overview of recommended geospatial
queries, see geospatial-query-compatibility-chart.
Points within a Shape Defined on a Flat Surface
To select all legacy coordinate pairs found within a given shape on a flat surface, use the $geoWithin operator along
with a shape operator. Use the following syntax:
db.<collection>.find( { <location field> :
{ $geoWithin :
{ $box|$polygon|$center : <coordinates>
} } } )
The following example queries for documents within a rectangle defined by [ 0 , 0 ] at the bottom left corner and by
[ 100 , 100 ] at the top right corner:
db.places.find( { loc :
{ $geoWithin :
{ $box : [ [ 0 , 0 ] ,
[ 100 , 100 ] ]
} } } )
The following queries for documents that are within the circle centered on [ -74 , 40.74 ] and with a radius of
10:
db.places.find( { loc: { $geoWithin :
{ $center : [ [-74, 40.74 ] , 10 ]
} } } )
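To make the flat-surface geometry behind $box and $center concrete, the following plain JavaScript sketch tests a coordinate pair against a rectangle and a circle using ordinary Euclidean arithmetic. The function names are hypothetical, and treating points exactly on a boundary as inside is an assumption of this sketch rather than a statement about the server's behavior.

```javascript
// Plain-JavaScript illustration of flat (Euclidean) containment tests.
// Hypothetical helpers; boundary points count as inside by assumption.
function withinBox(point, bottomLeft, topRight) {
  return point[0] >= bottomLeft[0] && point[0] <= topRight[0] &&
         point[1] >= bottomLeft[1] && point[1] <= topRight[1];
}

function withinCenter(point, center, radius) {
  const dx = point[0] - center[0];
  const dy = point[1] - center[1];
  // Straight-line distance in the same units as the coordinates.
  return Math.sqrt(dx * dx + dy * dy) <= radius;
}
```

Because the radius is measured in coordinate units on a plane, a radius of 10 around [ -74 , 40.74 ] covers 10 units of the coordinate system, not 10 miles or kilometers.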
For syntax and examples for each shape, see the following:
$box
$polygon
$center (defines a circle)
Points within a Circle Defined on a Sphere
MongoDB supports rudimentary spherical queries on flat 2d indexes for legacy reasons. In general, spherical calculations
should use a 2dsphere index, as described in 2dsphere Indexes (page 18).
To query for legacy coordinate pairs in a spherical cap on a sphere, use $geoWithin with the $centerSphere
operator. Specify an array that contains:
The grid coordinates of the circle's center point
The circle's radius measured in radians. To calculate radians, see Calculate Distance Using Spherical Geometry
(page 50).
Use the following syntax:
db.<collection>.find( { <location field> :
{ $geoWithin :
{ $centerSphere : [ [ <x>, <y> ] , <radius> ] }
} } )
The following example query returns all documents within a 10-mile radius of longitude 88° W and latitude 30° N.
The example converts distance to radians by dividing distance by the approximate radius of the earth, 3959 miles:
db.<collection>.find( { loc : { $geoWithin :
{ $centerSphere :
[ [ -88 , 30 ] , 10 / 3959 ]
} } } )
Proximity to a Point on a Flat Surface
Proximity queries return the 100 legacy coordinate pairs closest to the defined point and sort the results by distance.
Use either the $near operator or geoNear command. Both require a 2d index.
The $near operator uses the following syntax:
db.<collection>.find( { <location field> :
{ $near : [ <x> , <y> ]
} } )
For examples, see $near.
The geoNear command uses the following syntax:
db.runCommand( { geoNear: <collection>, near: [ <x> , <y> ] } )
The geoNear command offers more options and returns more information than does the $near operator. To run the
command, see geoNear.
Exact Matches on a Flat Surface
You can use the db.collection.find() method to query for an exact match on a location. These queries use
the following syntax:
db.<collection>.find( { <location field>: [ <x> , <y> ] } )
This query will return any documents with the value of [ <x> , <y> ].
Create a Haystack Index
To build a haystack index, use the bucketSize option when creating the index. A bucketSize of 5 creates an
index that groups location values that are within 5 units of the specified longitude and latitude. The bucketSize also
determines the granularity of the index. You can tune the parameter to the distribution of your data so that in general
you search only very small regions. The areas defined by buckets can overlap. A document can exist in multiple
buckets.
A haystack index can reference two fields: the location field and a second field. The second field is used for exact
matches. Haystack indexes return documents based on location and an exact match on a single additional criterion.
These indexes are not necessarily suited to returning the closest documents to a particular location.
To build a haystack index, use the following syntax:
db.coll.ensureIndex( { <location field> : "geoHaystack" ,
<additional field> : 1 } ,
{ bucketSize : <bucket value> } )
Example
If you have a collection with documents that contain fields similar to the following:
{ _id : 100, pos: { lng : 126.9, lat : 35.2 } , type : "restaurant"}
{ _id : 200, pos: { lng : 127.5, lat : 36.1 } , type : "restaurant"}
{ _id : 300, pos: { lng : 128.0, lat : 36.7 } , type : "national park"}
The following operations create a haystack index with buckets that store keys within 1 unit of longitude or latitude.
db.places.ensureIndex( { pos : "geoHaystack", type : 1 } ,
{ bucketSize : 1 } )
This index stores the document with an _id field that has the value 200 in two different buckets:
In a bucket that includes the document where the _id field has a value of 100
In a bucket that includes the document where the _id field has a value of 300
To query using a haystack index you use the geoSearch command. See Query a Haystack Index (page 49).
By default, queries that use a haystack index return 50 documents.
Query a Haystack Index
A haystack index is a special 2d geospatial index that is optimized to return results over small areas. To create a
haystack index see Create a Haystack Index (page 49).
To query a haystack index, use the geoSearch command. You must specify both the coordinates and the additional
field to geoSearch. For example, to return all documents with the value restaurant in the type field near the
example point, the command would resemble:
db.runCommand( { geoSearch : "places" ,
search : { type: "restaurant" } ,
near : [-74, 40.74] ,
maxDistance : 10 } )
Note: Haystack indexes are not suited to queries for the complete list of documents closest to a particular location.
The closest documents could be more distant compared to the bucket size.
Note: Spherical query operations (page 50) are not currently supported by haystack indexes.
The find() method and geoNear command cannot access the haystack index.
Calculate Distance Using Spherical Geometry
Note: While basic queries using spherical distance are supported by the 2d index, consider moving to a 2dsphere
index if your data is primarily longitude and latitude.
The 2d index supports queries that calculate distances on a Euclidean plane (flat surface). The index also supports the
following query operators and command that calculate distances using spherical geometry:
$nearSphere
$centerSphere
$near
geoNear command with the { spherical: true } option.
Important: These three queries use radians for distance. Other query types do not.
For spherical query operators to function properly, you must convert distances to radians, and convert from radians to
the distance units used by your application.
To convert:
distance to radians: divide the distance by the radius of the sphere (e.g. the Earth) in the same units as the
distance measurement.
radians to distance: multiply the radian measure by the radius of the sphere (e.g. the Earth) in the units system
that you want to convert the distance to.
The radius of the Earth is approximately 3,959 miles or 6,371 kilometers.
The following query would return documents from the places collection within the circle described by the center [
-74, 40.74 ] with a radius of 100 miles:
db.places.find( { loc: { $geoWithin: { $centerSphere: [ [ -74, 40.74 ] ,
100 / 3959 ] } } } )
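The two conversions described above amount to a division and a multiplication by the sphere's radius. A minimal sketch in plain JavaScript follows; the function and constant names are hypothetical, and the radius values are the approximations given in this section.

```javascript
// Approximate Earth radius values quoted in this section.
const EARTH_RADIUS_MILES = 3959;
const EARTH_RADIUS_KM = 6371;

// Hypothetical helpers: the radius must be in the same units as the distance.
function distanceToRadians(distance, sphereRadius) {
  return distance / sphereRadius;
}

function radiansToDistance(radians, sphereRadius) {
  return radians * sphereRadius;
}

// The radius argument used in the $centerSphere example: 100 miles.
const radius100Miles = distanceToRadians(100, EARTH_RADIUS_MILES);
```

Keeping the radius constant next to the conversion makes it harder to mix units by accident, for example dividing a distance in kilometers by the radius in miles.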
You may also use the distanceMultiplier option to the geoNear to convert radians in the mongod process,
rather than in your application code. See distance multiplier (page 51).
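The following spherical query returns documents in the places collection ordered by their distance from the point
[ -74, 40.74 ]. To restrict the results to documents within 100 miles, add a maxDistance option of 100 / 3959 radians.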
db.runCommand( { geoNear: "places",
near: [ -74, 40.74 ],
spherical: true
} )
The output of the above command would be:
{
// [ ... ]
"results" : [
{
"dis" : 0.01853688938212826,
"obj" : {
"_id" : ObjectId( ... )
"loc" : [
-73,
40
]
}
}
],
"stats" : {
// [ ... ]
"avgDistance" : 0.01853688938212826,
"maxDistance" : 0.01853714811400047
},
"ok" : 1
}
Warning: Spherical queries that wrap around the poles or at the transition from -180 to 180 longitude raise an
error.
Note: While the default Earth-like bounds for geospatial indexes are between -180 inclusive, and 180, valid values
for latitude are between -90 and 90.
Distance Multiplier
The distanceMultiplier option of the geoNear command returns distances only after multiplying the results
by an assigned value. This allows MongoDB to return converted values, and removes the requirement to convert units
in application logic.
Using distanceMultiplier in spherical queries provides results from the geoNear command that do not need
radian-to-distance conversion. The following example uses distanceMultiplier in the geoNear command
with a spherical (page 50) example:
db.runCommand( { geoNear: "places",
near: [ -74, 40.74 ],
spherical: true,
distanceMultiplier: 3959
} )
The output of the above operation would resemble the following:
{
// [ ... ]
"results" : [
{
"dis" : 73.46525170413567,
"obj" : {
"_id" : ObjectId( ... )
"loc" : [
-73,
40
]
}
}
],
"stats" : {
// [ ... ]
"avgDistance" : 0.01853688938212826,
"maxDistance" : 0.01853714811400047
},
"ok" : 1
}
3.4 Text Search Tutorials
Instructions for enabling MongoDB's text search feature, and for building and configuring text indexes.
Create a text Index (page 52) A text index allows searches on text strings in the index's specified fields.
Specify a Language for Text Index (page 53) The specified language determines the list of stop words and the rules
for Text Search's stemmer and tokenizer.
Create text Index with Long Name (page 55) Override the text index name limit for long index names.
Control Search Results with Weights (page 55) Give priority to certain search values by denoting the significance of
an indexed field relative to other indexed fields.
Limit the Number of Entries Scanned (page 56) Create an index to support queries that include $text expressions
and equality conditions.
Text Search in the Aggregation Pipeline (page 57) Perform various text searches in the aggregation pipeline.
Create a text Index
You can create a text index on the field or fields whose value is a string or an array of string elements. When creating
a text index on multiple fields, you can specify the individual fields or you can use the wildcard specifier ($**).
Index Specific Fields
The following example creates a text index on the fields subject and content:
db.collection.ensureIndex(
{
subject: "text",
content: "text"
}
)
This text index catalogs all string data in the subject field and the content field, where the field value is either
a string or an array of string elements.
Index All Fields
To allow for text search on all fields with string content, use the wildcard specifier ($**) to index all fields that contain
string content.
The following example indexes any string value in the data of every field of every document in collection and
names the index TextIndex:
db.collection.ensureIndex(
{ "$**": "text" },
{ name: "TextIndex" }
)
Specify a Language for Text Index
This tutorial describes how to specify the default language associated with the text index (page 53) and also how to
create text indexes for collections that contain documents in different languages (page 53).
Specify the Default Language for a text Index
The default language associated with the indexed data determines the rules to parse word roots (i.e. stemming) and
ignore stop words. The default language for the indexed data is english.
To specify a different language, use the default_language option when creating the text index. See Text Search
Languages (page 67) for the languages available for default_language.
The following example creates for the quotes collection a text index on the content field and sets the
default_language to spanish:
db.quotes.ensureIndex(
{ content : "text" },
{ default_language: "spanish" }
)
Create a text Index for a Collection in Multiple Languages
Changed in version 2.6: Added support for language overrides within sub-documents.
Specify the Index Language within the Document If a collection contains documents or sub-documents that are
in different languages, include a field named language in the documents or sub-documents and specify as its value
the language for that document or sub-document.
MongoDB will use the specified language for that document or sub-document when building the text index:
The specified language in the document overrides the default language for the text index.
The specified language in a sub-document overrides the language specified in an enclosing document or the
default language for the index.
See Text Search Languages (page 67) for a list of supported languages.
For example, a collection quotes contains multi-language documents that include the language field in the
document and/or the sub-document as needed:
{
_id: 1,
language: "portuguese",
original: "A sorte protege os audazes.",
translation:
[
{
language: "english",
quote: "Fortune favors the bold."
},
{
language: "spanish",
quote: "Suerte protege a los audaces."
}
]
}
{
_id: 2,
language: "spanish",
original: "Nada hay más surreal que la realidad.",
translation:
[
{
language: "english",
quote: "There is nothing more surreal than reality."
},
{
language: "french",
quote: "Il n'y a rien de plus surréaliste que la réalité."
}
]
}
{
_id: 3,
original: "is this a dagger which I see before me.",
translation:
{
language: "spanish",
quote: "Es este un puñal que veo delante de mí."
}
}
If you create a text index on the original and translation.quote fields with the default language of English:
db.quotes.ensureIndex( { original: "text", "translation.quote": "text" } )
Then, for the documents and sub-documents that contain the language field, the text index uses that language to
parse word stems and other linguistic characteristics.
For sub-documents that do not contain the language field,
If the enclosing document contains the language field, then the index uses the document's language for the
sub-document.
Otherwise, the index uses the default language for the sub-documents.
For documents that do not contain the language field, the index uses the default language, which is English.
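The precedence rules above (sub-document language first, then the enclosing document's language, then the index default) can be written out as a short plain JavaScript sketch. The function resolveLanguage and its languageField parameter are hypothetical names used only for illustration; the sketch models the rules as described here, not the server's implementation.

```javascript
// Hypothetical helper that mirrors the precedence rules described above.
// languageField defaults to "language", the field name used in this section.
function resolveLanguage(doc, subDoc, defaultLanguage, languageField) {
  const field = languageField || "language";
  // 1. A language set in the sub-document wins.
  if (subDoc && typeof subDoc[field] === "string") {
    return subDoc[field];
  }
  // 2. Otherwise fall back to the enclosing document's language.
  if (doc && typeof doc[field] === "string") {
    return doc[field];
  }
  // 3. Otherwise use the index's default language.
  return defaultLanguage;
}
```

Applied to the sample quotes collection, the sub-document in the document with _id 3 resolves to spanish, while that document's own original field falls back to english.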
Use any Field to Specify the Language for a Document To use a field with a name other than language, include
the language_override option when creating the index.
For example, give the following command to use idioma as the field name instead of language:
db.quotes.ensureIndex( { quote : "text" },
{ language_override: "idioma" } )
The documents of the quotes collection may specify a language with the idioma field:
{ _id: 1, idioma: "portuguese", quote: "A sorte protege os audazes" }
{ _id: 2, idioma: "spanish", quote: "Nada hay más surreal que la realidad." }
{ _id: 3, idioma: "english", quote: "is this a dagger which I see before me" }
Create text Index with Long Name
The default name for the index consists of each indexed field name concatenated with _text. For example, the
following command creates a text index on the fields content, users.comments, and users.profiles:
db.collection.ensureIndex(
{
content: "text",
"users.comments": "text",
"users.profiles": "text"
}
)
The default name for the index is:
"content_text_users.comments_text_users.profiles_text"
To avoid creating an index with a name that exceeds the index name length limit, you can pass the name
option to the db.collection.ensureIndex() method:
db.collection.ensureIndex(
{
content: "text",
"users.comments": "text",
"users.profiles": "text"
},
{
name: "MyTextIndex"
}
)
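To see why default names grow quickly, the naming rule described earlier in this section (each indexed field name concatenated with _text) can be sketched in plain JavaScript. The function defaultIndexName is a hypothetical illustration of that rule, not a mongo shell method.

```javascript
// Hypothetical helper: joins each key and its index type with underscores,
// following the naming rule described in this section.
function defaultIndexName(keySpec) {
  return Object.keys(keySpec).map(function (field) {
    return field + "_" + keySpec[field];
  }).join("_");
}

const name = defaultIndexName({
  content: "text",
  "users.comments": "text",
  "users.profiles": "text"
});
// name is "content_text_users.comments_text_users.profiles_text"
```

Each additional indexed field adds its full dotted path plus a suffix to the name, so passing an explicit name option keeps the name short regardless of how many fields are indexed.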
Note: To drop the text index, use the index name. To get the name of an index, use
db.collection.getIndexes().
Control Search Results with Weights
This document describes how to create a text index with specified weights for results fields.
For a text index, the weight of an indexed field denotes the significance of the field relative to the other indexed fields
in terms of the score. The score for a given word in a document is derived from the weighted sum of the frequency for
each of the indexed fields in that document. See $meta operator for details on returning and sorting by text scores.
The default weight is 1 for the indexed fields. To adjust the weights for the indexed fields, include the weights
option in the db.collection.ensureIndex() method.
Warning: Choose the weights carefully in order to prevent the need to reindex.
A collection blog has the following documents:
{ _id: 1,
content: "This morning I had a cup of coffee.",
about: "beverage",
keywords: [ "coffee" ]
}
{ _id: 2,
content: "Who doesn't like cake?",
about: "food",
keywords: [ "cake", "food", "dessert" ]
}
To create a text index with different field weights for the content field and the keywords field, include the
weights option to the ensureIndex() method. For example, the following command creates an index on three
fields and assigns weights to two of the fields:
db.blog.ensureIndex(
{
content: "text",
keywords: "text",
about: "text"
},
{
weights: {
content: 10,
keywords: 5,
},
name: "TextIndex"
}
)
The text index has the following fields and weights:
content has a weight of 10,
keywords has a weight of 5, and
about has the default weight of 1.
These weights denote the relative significance of the indexed fields to each other. For instance, a term match in the
content field has:
2 times (i.e. 10:5) the impact as a term match in the keywords field and
10 times (i.e. 10:1) the impact as a term match in the about field.
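The idea of a weighted sum of term frequencies can be illustrated with a deliberately simplified plain JavaScript sketch. This is not MongoDB's scoring formula: the real text score also involves stemming, stop words, and other adjustments that are not modeled here, and the function names are hypothetical.

```javascript
// Simplified illustration only; not MongoDB's actual text score.
// Splits a string (or an array of strings) into lowercase word tokens.
function tokens(value) {
  const text = Array.isArray(value) ? value.join(" ") : String(value || "");
  return text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
}

// Sums weight * (number of exact occurrences of term) over the weighted fields.
function weightedTermFrequency(doc, term, weights) {
  let sum = 0;
  Object.keys(weights).forEach(function (field) {
    const count = tokens(doc[field]).filter(function (t) {
      return t === term;
    }).length;
    sum += weights[field] * count;
  });
  return sum;
}

const weights = { content: 10, keywords: 5, about: 1 };
const doc1 = { _id: 1, content: "This morning I had a cup of coffee.",
               about: "beverage", keywords: [ "coffee" ] };
const doc2 = { _id: 2, content: "Who doesn't like cake?",
               about: "food", keywords: [ "cake", "food", "dessert" ] };
```

In this toy model, coffee contributes 10 from content plus 5 from keywords for the first document, while food contributes only 5 from keywords plus 1 from about for the second, which reflects the 10:5:1 ratios listed above.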
Limit the Number of Entries Scanned
This tutorial describes how to create indexes to limit the number of index entries scanned for queries that include a
$text expression and equality conditions.
A collection inventory contains the following documents:
{ _id: 1, dept: "tech", description: "lime green computer" }
{ _id: 2, dept: "tech", description: "wireless red mouse" }
{ _id: 3, dept: "kitchen", description: "green placemat" }
{ _id: 4, dept: "kitchen", description: "red peeler" }
{ _id: 5, dept: "food", description: "green apple" }
{ _id: 6, dept: "food", description: "red potato" }
Consider the common use case that performs text searches by individual departments, such as:
db.inventory.find( { dept: "kitchen", $text: { $search: "green" } } )
To limit the text search to scan only those documents within a specific dept, create a compound index that first
specifies an ascending/descending index key on the field dept and then a text index key on the field description:
db.inventory.ensureIndex(
{
dept: 1,
description: "text"
}
)
Then, the text search (see footnote 13) within a particular department will limit the scan of indexed documents. For
example, the following query scans only those documents with dept equal to kitchen:
db.inventory.find( { dept: "kitchen", $text: { $search: "green" } } )
A compound text index cannot include any other special index types, such as multi-key (page 13) or geospatial
(page 17) index fields.
If the compound text index includes keys preceding the text index key, to perform a $text search, the query
predicate must include equality match conditions on the preceding keys.
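The requirement above can be expressed as a small client-side check. The plain JavaScript sketch below is hypothetical and much simpler than anything the server does: it treats any condition written as an operator document (keys starting with $) as not being an equality match.

```javascript
// Hypothetical pre-flight check, not a MongoDB API: returns true when every
// index key that precedes the text key has a plain equality condition.
function hasEqualityOnPrefixKeys(indexKeys, query) {
  const fields = Object.keys(indexKeys);
  for (let i = 0; i < fields.length; i++) {
    const field = fields[i];
    if (indexKeys[field] === "text") {
      return true; // reached the text key; all preceding keys passed
    }
    const condition = query[field];
    if (condition === undefined) {
      return false; // preceding key is missing from the query
    }
    const isOperatorDoc = condition !== null &&
      typeof condition === "object" &&
      !Array.isArray(condition) &&
      Object.keys(condition).some(function (k) { return k.charAt(0) === "$"; });
    if (isOperatorDoc) {
      return false; // e.g. { $gt: ... } is a range, not an equality match
    }
  }
  return true;
}

const indexKeys = { dept: 1, description: "text" };
```

With this index, a query such as { dept: "kitchen", $text: { $search: "green" } } passes the check, while a query that contains only the $text expression does not.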
See also:
Text Indexes (page 23)
Text Search in the Aggregation Pipeline
New in version 2.6. In the aggregation pipeline, text search is available via the use of the $text query operator in
the $match stage.
Restrictions
Text search in the aggregation pipeline has the following restrictions:
The $match stage that includes a $text must be the first stage in the pipeline.
A text operator can only occur once in the stage.
The text operator expression cannot appear in $or or $not expressions.
The text search, by default, does not return the matching documents in order of matching scores. Use the $meta
aggregation expression in the $sort stage.
Text Score
The $text operator assigns a score to each document that contains the search term in the indexed fields. The score
represents the relevance of a document to a given text search query. The score can be part of a $sort pipeline
specification as well as part of the projection expression. The { $meta: "textScore" } expression provides
information on the processing of the $text operation. See $meta aggregation for details on accessing the score for
projection or sort.
The metadata is only available after the $match stage that includes the $text operation.
Footnote 13: If using the deprecated text command, the text command must include the filter option that specifies an
equality condition for the prefix fields.
Examples The following examples assume a collection articles that has a text index on the field subject:
db.articles.ensureIndex( { subject: "text" } )
Calculate the Total Views for Articles that Contain a Word
The following aggregation searches for the term cake in the $match stage and calculates the total views for the
matching documents in the $group stage.
db.articles.aggregate(
[
{ $match: { $text: { $search: "cake" } } },
{ $group: { _id: null, views: { $sum: "$views" } } }
]
)
Return Results Sorted by Text Search Score
To sort by the text search score, include a $meta expression in the $sort stage. The following example matches on
either the term cake or tea, sorts by the textScore in descending order, and returns only the title field in the
results set.
db.articles.aggregate(
[
{ $match: { $text: { $search: "cake tea" } } },
{ $sort: { score: { $meta: "textScore" } } },
{ $project: { title: 1, _id: 0 } }
]
)
The specified metadata determines the sort order. For example, the "textScore" metadata sorts in descending
order. See $meta for more information on metadata as well as an example of overriding the default sort order of the
metadata.
Match on Text Score
The "textScore" metadata is available for projections, sorts, and conditions subsequent to the $match stage that
includes the $text operation.
The following example matches on either the term cake or tea, projects the title and the score fields, and then
returns only those documents with a score greater than 1.0.
db.articles.aggregate(
[
{ $match: { $text: { $search: "cake tea" } } },
{ $project: { title: 1, _id: 0, score: { $meta: "textScore" } } },
{ $match: { score: { $gt: 1.0 } } }
]
)
Specify a Language for Text Search
The following aggregation searches in spanish for documents that contain the term saber but not the term claro in
the $match stage and calculates the total views for the matching documents in the $group stage.
db.articles.aggregate(
[
{ $match: { $text: { $search: "saber -claro", $language: "es" } } },
{ $group: { _id: null, views: { $sum: "$views" } } }
]
)
3.5 Indexing Strategies
The best indexes for your application must take a number of factors into account, including the kinds of queries you
expect, the ratio of reads to writes, and the amount of free memory on your system.
When developing your indexing strategy you should have a deep understanding of your application's queries. Before
you build indexes, map out the types of queries you will run so that you can build indexes that reference those fields.
Indexes come with a performance cost, but are more than worth the cost for frequent queries on large data sets. Consider
the relative frequency of each query in the application and whether the query justifies an index.
The best overall strategy for designing indexes is to profile a variety of index configurations with data sets similar to
the ones you'll be running in production to see which configurations perform best. Inspect the current indexes created
for your collections to ensure they are supporting your current and planned queries. If an index is no longer used, drop
the index.
MongoDB generally uses only one index to support a given operation. However, each clause of an $or query may use a
different index, and starting in version 2.6, MongoDB can use index intersection (page 31).
The following documents introduce indexing strategies:
Create Indexes to Support Your Queries (page 59) An index supports a query when the index contains all the fields
scanned by the query. Creating indexes that support queries results in greatly increased query performance.
Use Indexes to Sort Query Results (page 61) To support efficient queries, use the strategies here when you specify
the sequential order and sort order of index fields.
Ensure Indexes Fit in RAM (page 63) When your index fits in RAM, the system can avoid reading the index from
disk and you get the fastest processing.
Create Queries that Ensure Selectivity (page 64) Selectivity is the ability of a query to narrow results using the index.
Selectivity allows MongoDB to use the index for a larger portion of the work associated with fulfilling the query.
Create Indexes to Support Your Queries
An index supports a query when the index contains all the fields scanned by the query. The query scans the index and
not the collection. Creating indexes that support queries results in greatly increased query performance.
This document describes strategies for creating indexes that support queries.
Create a Single-Key Index if All Queries Use the Same, Single Key
If you only ever query on a single key in a given collection, then you need to create just one single-key index for that
collection. For example, you might create an index on category in the product collection:
db.products.ensureIndex( { "category": 1 } )
Create Compound Indexes to Support Several Different Queries
If you sometimes query on only one key and at other times query on that key combined with a second key, then creating
a compound index is more efficient than creating a single-key index. MongoDB will use the compound index for both
queries. For example, you might create an index on both category and item.
db.products.ensureIndex( { "category": 1, "item": 1 } )
This gives you both options. You can query on just category, and you can also query on category combined
with item. A single compound index (page 11) on multiple fields can support all the queries that search a prefix
subset of those fields.
Example
The following index on a collection:
{ x: 1, y: 1, z: 1 }
Can support queries that the following indexes support:
{ x: 1 }
{ x: 1, y: 1 }
There are some situations where the prefix indexes may offer better query performance: for example, if z is a large
array.
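The prefix relationship in the example above can be sketched in plain JavaScript. This is only an illustration: indexPrefixes and isIndexPrefix are hypothetical helper names, not mongo shell methods, and the sketch assumes key patterns whose field names are ordinary strings, so that object key order matches index order.

```javascript
// Hypothetical helpers for illustration; they are not part of the mongo shell.

// Lists the proper prefixes of a compound index key pattern, in index order.
function indexPrefixes(keyPattern) {
  const fields = Object.keys(keyPattern); // insertion order for non-numeric keys
  const prefixes = [];
  for (let n = 1; n < fields.length; n++) {
    const prefix = {};
    for (const field of fields.slice(0, n)) {
      prefix[field] = keyPattern[field];
    }
    prefixes.push(prefix);
  }
  return prefixes;
}

// True when `candidate` is a leading subset of `index` with the same fields,
// order, and directions. The full key pattern also counts.
function isIndexPrefix(candidate, index) {
  const candidateFields = Object.keys(candidate);
  const indexFields = Object.keys(index);
  if (candidateFields.length === 0 || candidateFields.length > indexFields.length) {
    return false;
  }
  return candidateFields.every(
    (field, i) => field === indexFields[i] && candidate[field] === index[field]
  );
}

const xyzPrefixes = indexPrefixes({ x: 1, y: 1, z: 1 });
const yAloneIsPrefix = isIndexPrefix({ y: 1 }, { x: 1, y: 1, z: 1 });
console.log(JSON.stringify(xyzPrefixes), yAloneIsPrefix);
```

Here xyzPrefixes holds { x: 1 } and { x: 1, y: 1 }, the two indexes listed in the example, while { y: 1 } is not a prefix because it skips the leading field x.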
The { x: 1, y: 1, z: 1 } index can also support many of the same queries as the following index:
{ x: 1, z: 1 }
Also, { x: 1, z: 1 } has an additional use. Given the following query:
db.collection.find( { x: 5 } ).sort( { z: 1} )
The { x: 1, z: 1 } index supports both the query and the sort operation, while the { x: 1, y: 1,
z: 1 } index only supports the query. For more information on sorting, see Use Indexes to Sort Query Results
(page 61).
Starting in version 2.6, MongoDB can use index intersection (page 31) to fulfill queries. The choice between creating
compound indexes that support your queries or relying on index intersection depends on the specifics of your system.
See Index Intersection and Compound Indexes (page 31) for more details.
Create Indexes that Support Covered Queries
A covered query is a query in which:
all the fields in the query are part of an index, and
all the fields returned in the results are in the same index.
Because the index covers the query, MongoDB can both match the query conditions and return the results using
only the index; MongoDB does not need to look at the documents, only the index, to fulfill the query. An index can
also cover an aggregation pipeline operation on unsharded collections.
Querying only the index can be much faster than querying documents outside of the index. Index keys are typically
smaller than the documents they catalog, and indexes are typically available in RAM or located sequentially on disk.
MongoDB automatically uses an index that covers a query when possible. To ensure that an index can cover a query,
create an index that includes all the fields listed in the query document and in the query result. You can specify the
fields to return in the query results with a projection document. By default, MongoDB includes the _id field in the
query result. So, if the index does not include the _id field, then you must exclude the _id field (i.e. _id: 0)
from the query results.
Example
Given a collection users with an index on the fields status and user, as created by the following operation:
db.users.ensureIndex( { status: 1, user: 1 } )
Then, this index will cover the following query, which selects on the status field and returns only the user field:
db.users.find( { status: "A" }, { user: 1, _id: 0 } )
In the operation, the projection document explicitly specifies _id: 0 to exclude the _id field from the result since
the index is only on the status and the user fields.
If the projection document does not specify the exclusion of the _id field, the query returns the _id field. The
following query is not covered by the index on the status and the user fields because with the projection document
{ user: 1 }, the query returns both the user field and the _id field:
db.users.find( { status: "A" }, { user: 1 } )
An index cannot cover a query if:
any of the indexed fields in any of the documents in the collection includes an array. If an indexed field is an
array, the index becomes a multi-key index (page 13) and cannot support a covered query.
any of the indexed fields are fields in subdocuments. To index fields in subdocuments, use dot notation. For
example, consider a collection users with documents of the following form:
{ _id: 1, user: { login: "tester" } }
The collection has the following indexes:
{ user: 1 }
{ "user.login": 1 }
The { user: 1 } index covers the following query:
db.users.find( { user: { login: "tester" } }, { user: 1, _id: 0 } )
However, the { "user.login": 1 } index does not cover the following query:
db.users.find( { "user.login": "tester" }, { "user.login": 1, _id: 0 } )
The query, however, does use the { "user.login": 1 } index to find matching documents.
To determine whether a query is a covered query, use the explain() method. If the explain() output displays
true for the indexOnly field, the query is covered by an index, and MongoDB queries only that index to match
the query and return the results.
For more information see Measure Index Use (page 42).
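The covered-query rules in this section can be approximated with a small check in plain JavaScript. This is a simplified sketch, not MongoDB's query planner: isCoveredQuery is a hypothetical helper, it only looks at top-level field names in the query document, and because it never sees the data it cannot detect multi-key (array) indexes. The explain() output remains the authoritative test.

```javascript
// Simplified model of the covered-query rules described above.
// Hypothetical helper; it does not talk to a MongoDB server.
function isCoveredQuery(indexKeys, query, projection) {
  const indexed = Object.keys(indexKeys);

  // Per the rules above, an index on a field in a subdocument cannot cover a query.
  if (indexed.some((field) => field.includes("."))) return false;

  // Every field in the query document must be part of the index.
  if (!Object.keys(query).every((field) => indexed.includes(field))) return false;

  // Without an inclusion projection, whole documents are returned.
  const included = Object.keys(projection).filter((field) => projection[field]);
  if (included.length === 0) return false;

  // MongoDB returns _id by default unless the projection excludes it.
  const returned = new Set(included);
  if (projection._id === undefined) returned.add("_id");

  // Every returned field must be in the same index.
  return [...returned].every((field) => indexed.includes(field));
}

const statusUserIndex = { status: 1, user: 1 };
const coveredWithoutId = isCoveredQuery(statusUserIndex, { status: "A" }, { user: 1, _id: 0 });
const coveredWithId = isCoveredQuery(statusUserIndex, { status: "A" }, { user: 1 });
console.log(coveredWithoutId, coveredWithId);
```

With the index from the example, the first call reports a covered query and the second does not, because the default _id field is not in the index.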
Use Indexes to Sort Query Results
In MongoDB, sort operations that sort documents based on an indexed field provide the greatest performance. Indexes
in MongoDB, as in other databases, have an order: as a result, using an index to access documents returns results in the same
order as the index.
To sort on multiple fields, create a compound index (page 11). With compound indexes, the results can be in the sorted
order of either the full index or an index prefix. An index prefix is a subset of a compound index; the subset consists
of one or more fields at the start of the index, in order. For example, given an index { a: 1, b: 1, c: 1, d:
1 }, the following subsets are index prefixes:
{ a: 1 }
{ a: 1, b: 1 }
{ a: 1, b: 1, c: 1 }
For more information on sorting by index prefixes, see Sort Subset Starts at the Index Beginning (page 62).
If the query includes equality match conditions on an index prefix, you can sort on a subset of the index that starts
after or overlaps with the prefix. For example, given an index { a: 1, b: 1, c: 1, d: 1 }, if the
query condition includes equality match conditions on a and b, you can specify a sort on the subsets { c: 1 } or
{ c: 1, d: 1 }:
db.collection.find( { a: 5, b: 3 } ).sort( { c: 1 } )
db.collection.find( { a: 5, b: 3 } ).sort( { c: 1, d: 1 } )
In these operations, the equality match and the sort documents together cover the index prefixes { a: 1, b: 1,
c: 1 } and { a: 1, b: 1, c: 1, d: 1 } respectively.
You can also specify a sort order that includes the prefix; however, since the query condition specifies equality matches
on these fields, they are constant in the resulting documents and do not contribute to the sort order:
db.collection.find( { a: 5, b: 3 } ).sort( { a: 1, b: 1, c: 1 } )
db.collection.find( { a: 5, b: 3 } ).sort( { a: 1, b: 1, c: 1, d: 1 } )
For more information on sorting by index subsets that are not prefixes, see Sort Subset Does Not Start at the Index
Beginning (page 63).
Note: For in-memory sorts that do not use an index, the sort() operation is significantly slower. The sort()
operation will abort when it uses 32 megabytes of memory.
Sort With a Subset of Compound Index
If the sort document contains a subset of the compound index fields, the subset can determine whether MongoDB can
use the index efficiently to both retrieve and sort the query results. If MongoDB can efficiently use the index to both
retrieve and sort the query results, the output from explain() will display scanAndOrder as false or 0.
If MongoDB can only use the index for retrieving documents that meet the query criteria, MongoDB must manually
sort the resulting documents without the use of the index. For in-memory sort operations, explain() will display
scanAndOrder as true or 1.
Sort Subset Starts at the Index Beginning
If the sort document is a subset of a compound index and starts from
the beginning of the index, MongoDB can use the index to both retrieve and sort the query results.
For example, the collection collection has the following index:
{ a: 1, b: 1, c: 1, d: 1 }
The following operations include a sort with a subset of the index. Because the sort subset starts at the beginning of the
index, the operations can use the index for both the query retrieval and sort:
db.collection.find().sort( { a:1 } )
db.collection.find().sort( { a:1, b:1 } )
db.collection.find().sort( { a:1, b:1, c:1 } )
db.collection.find( { a: 4 } ).sort( { a: 1, b: 1 } )
db.collection.find( { a: { $gt: 4 } } ).sort( { a: 1, b: 1 } )
db.collection.find( { b: 5 } ).sort( { a: 1, b: 1 } )
db.collection.find( { b: { $gt:5 }, c: { $gt: 1 } } ).sort( { a: 1, b: 1 } )
The last two operations include query conditions on the field b but do not include a query condition on the field a:
db.collection.find( { b: 5 } ).sort( { a: 1, b: 1 } )
db.collection.find( { b: { $gt:5 }, c: { $gt: 1 } } ).sort( { a: 1, b: 1 } )
Consider the case where the collection has the index { b: 1 } in addition to the { a: 1, b: 1, c: 1,
d: 1 } index. Because of the query condition on b, it is not immediately obvious which index MongoDB may
select as the best index. To explicitly specify the index to use, see hint().
Sort Subset Does Not Start at the Index Beginning
The sort document can be a subset of a compound index that
does not start from the beginning of the index. For instance, { c: 1 } is a subset of the index { a: 1, b:
1, c: 1, d: 1 } that omits the preceding index fields a and b. MongoDB can use the index efficiently if the
query document includes all the preceding fields of the index, in this case a and b, in equality conditions. In other
words, the equality conditions in the query document and the subset in the sort document contiguously cover a prefix
of the index.
For example, the collection collection has the following index:
{ a: 1, b: 1, c: 1, d: 1 }
Then the following operations can use the index efficiently:
db.collection.find( { a: 5 } ).sort( { b: 1, c: 1 } )
db.collection.find( { a: 5, c: 4, b: 3 } ).sort( { d: 1 } )
In the first operation, the query document { a: 5 } with the sort document { b: 1, c: 1 } covers
the prefix { a: 1, b: 1, c: 1 } of the index.
In the second operation, the query document { a: 5, c: 4, b: 3 } with the sort document { d:
1 } covers the full index.
Only the index fields preceding the sort subset must have the equality conditions in the query document. The other
index fields may have other conditions. The following operations can efficiently use the index since the equality
conditions in the query document and the subset in the sort document contiguously cover a prefix of the index:
db.collection.find( { a: 5, b: 3 } ).sort( { c: 1 } )
db.collection.find( { a: 5, b: 3, c: { $lt: 4 } } ).sort( { c: 1 } )
The following operations specify a sort document of { c: 1 }, but the query documents do not contain equality
matches on the preceding index fields a and b:
db.collection.find( { a: { $gt: 2 } } ).sort( { c: 1 } )
db.collection.find( { c: 5 } ).sort( { c: 1 } )
These operations will not efficiently use the index { a: 1, b: 1, c: 1, d: 1 } and may not even use
the index to retrieve the documents.
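The rule that equality conditions plus the sort subset must contiguously cover an index prefix can be sketched as a plain JavaScript check. This is a simplified model under stated assumptions, not MongoDB's query planner: canSortWithIndex and equalityFields are hypothetical helpers, sort directions are assumed to match the index, and any query value that is an operator document such as { $gt: 4 } is treated as a non-equality condition.

```javascript
// Simplified model of the sort rules above; hypothetical helpers only.
// Assumption: sort directions match the index, so only field names are compared.

// Fields that the query matches with a plain value rather than an
// operator document such as { $gt: 4 }.
function equalityFields(query) {
  return Object.keys(query).filter((field) => {
    const value = query[field];
    const isPlainObject =
      value !== null && typeof value === "object" && !Array.isArray(value);
    const isOperatorDoc =
      isPlainObject && Object.keys(value).some((key) => key.startsWith("$"));
    return !isOperatorDoc;
  });
}

function canSortWithIndex(indexKeys, query, sort) {
  const indexFields = Object.keys(indexKeys);
  const sortFields = Object.keys(sort);

  // Position in the index where the sort subset begins (-1 if absent or empty).
  const start = indexFields.indexOf(sortFields[0]);
  if (start === -1) return false;

  // The sort fields must follow the index order without gaps.
  const contiguous = sortFields.every((field, i) => indexFields[start + i] === field);
  if (!contiguous) return false;

  // Every index field before the sort subset needs an equality condition.
  const equal = equalityFields(query);
  return indexFields.slice(0, start).every((field) => equal.includes(field));
}

const abcdIndex = { a: 1, b: 1, c: 1, d: 1 };
const sortChecks = [
  canSortWithIndex(abcdIndex, { a: 5 }, { b: 1, c: 1 }),    // equality on a, sort on b and c
  canSortWithIndex(abcdIndex, { a: { $gt: 2 } }, { c: 1 }), // range on a, no condition on b
];
console.log(sortChecks);
```

The first call returns true and the second returns false, matching the examples in this section. In practice, the scanAndOrder field of the explain() output shows what MongoDB actually did.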
Ensure Indexes Fit in RAM
For the fastest processing, ensure that your indexes fit entirely in RAM so that the system can avoid reading the index
from disk.
To check the size of your indexes, use the db.collection.totalIndexSize() helper, which returns data in
bytes:
> db.collection.totalIndexSize()
4294976499
The above example shows an index size of almost 4.3 gigabytes. To ensure this index fits in RAM, you must not only
have more than that much RAM available but also must have RAM available for the rest of the working set. Also
remember:
If you have and use multiple collections, you must consider the size of all indexes on all collections. The indexes and
the working set must be able to fit in memory at the same time.
There are some limited cases where indexes do not need to fit in memory. See Indexes that Hold Only Recent Values
in RAM (page 64).
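The arithmetic behind this check can be sketched in plain JavaScript. This is a back-of-the-envelope illustration only: indexesFitInRam is a hypothetical helper, and the working set size (2 GB) and available RAM (8 GB) below are made-up figures. Only the index size comes from the example above.

```javascript
// Hypothetical helper for illustration; all sizes are in bytes.
// indexSizes: one entry per collection, e.g. values reported by
// db.collection.totalIndexSize() for each collection you use.
function indexesFitInRam(indexSizes, otherWorkingSetBytes, availableRamBytes) {
  const totalIndexBytes = indexSizes.reduce((sum, size) => sum + size, 0);
  return {
    totalIndexBytes,
    totalIndexGB: totalIndexBytes / 1e9,
    // Indexes and the rest of the working set must fit at the same time.
    fits: totalIndexBytes + otherWorkingSetBytes <= availableRamBytes,
  };
}

// 4294976499 is the totalIndexSize() value from the example above;
// 2e9 and 8e9 are assumed values for the working set and available RAM.
const ramCheck = indexesFitInRam([4294976499], 2e9, 8e9);
console.log(ramCheck.totalIndexGB.toFixed(2), ramCheck.fits);
```

With these assumed figures the indexes and working set fit. With only 6 GB available they would not, even though the index alone is smaller than 6 GB.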
See also:
collStats and db.collection.stats()
Indexes that Hold Only Recent Values in RAM
Indexes do not have to fit entirely into RAM in all cases. If the value of the indexed field increments with every insert,
and most queries select recently added documents, then MongoDB only needs to keep the parts of the index that hold
the most recent or right-most values in RAM. This allows for efficient index use for read and write operations and
minimizes the amount of RAM required to support the index.
Create Queries that Ensure Selectivity
Selectivity is the ability of a query to narrow results using the index. Effective indexes are more selective and allow
MongoDB to use the index for a larger portion of the work associated with fulfilling the query.
To ensure selectivity, write queries that limit the number of possible documents with the indexed field. Write queries
that are appropriately selective relative to your indexed data.
Example
Suppose you have a field called status where the possible values are new and processed. If you add an index
on status, you've created a low-selectivity index. The index will be of little help in locating records.
A better strategy, depending on your queries, would be to create a compound index (page 11) that includes the
low-selectivity field and another field. For example, you could create a compound index on status and created_at.
Another option, again depending on your use case, might be to use separate collections, one for each status.
Example
Consider an index { a : 1 } (i.e. an index on the key a sorted in ascending order) on a collection where a has
three values evenly distributed across the collection:
{ _id: ObjectId(), a: 1, b: "ab" }
{ _id: ObjectId(), a: 1, b: "cd" }
{ _id: ObjectId(), a: 1, b: "ef" }
{ _id: ObjectId(), a: 2, b: "jk" }
{ _id: ObjectId(), a: 2, b: "lm" }
{ _id: ObjectId(), a: 2, b: "no" }
{ _id: ObjectId(), a: 3, b: "pq" }
{ _id: ObjectId(), a: 3, b: "rs" }
{ _id: ObjectId(), a: 3, b: "tv" }
If you query for { a: 2, b: "no" }, MongoDB must scan 3 documents in the collection to return the one
matching result. Similarly, a query for { a: { $gt: 1 }, b: "tv" } must scan 6 documents, also to
return one result.
Consider the same index on a collection where a has nine values evenly distributed across the collection:
{ _id: ObjectId(), a: 1, b: "ab" }
{ _id: ObjectId(), a: 2, b: "cd" }
{ _id: ObjectId(), a: 3, b: "ef" }
{ _id: ObjectId(), a: 4, b: "jk" }
{ _id: ObjectId(), a: 5, b: "lm" }
{ _id: ObjectId(), a: 6, b: "no" }
{ _id: ObjectId(), a: 7, b: "pq" }
{ _id: ObjectId(), a: 8, b: "rs" }
{ _id: ObjectId(), a: 9, b: "tv" }
If you query for { a: 2, b: "cd" }, MongoDB must scan only one document to fulfill the query. The index
and query are more selective because the values of a are evenly distributed and the query can select a specific document
using the index.
However, although the index on a is more selective, a query such as { a: { $gt: 5 }, b: "tv" } would
still need to scan 4 documents.
If overall selectivity is low, and if MongoDB must read a number of documents to return results, then some queries
may perform faster without indexes. To determine performance, see Measure Index Use (page 42).
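The scan counts in this example can be reproduced with a small simulation in plain JavaScript. This is a simplified model, not a measurement: it assumes that with an index on a only, MongoDB uses the index to locate every document whose a value matches and then reads each of those documents to test b. The helper documentsScanned is hypothetical.

```javascript
// Simplified model: the index on `a` narrows the candidates, and every
// candidate document is then read to check the condition on `b`.
function documentsScanned(docs, aMatches) {
  return docs.filter((doc) => aMatches(doc.a)).length;
}

const bValues = ["ab", "cd", "ef", "jk", "lm", "no", "pq", "rs", "tv"];

// First collection: a takes three values (1, 1, 1, 2, 2, 2, 3, 3, 3).
const threeValues = bValues.map((b, i) => ({ a: Math.floor(i / 3) + 1, b }));

// Second collection: a takes nine values (1 through 9).
const nineValues = bValues.map((b, i) => ({ a: i + 1, b }));

const scanned = {
  lowEquality: documentsScanned(threeValues, (a) => a === 2), // { a: 2, b: "no" }
  lowRange: documentsScanned(threeValues, (a) => a > 1),      // { a: { $gt: 1 }, b: "tv" }
  highEquality: documentsScanned(nineValues, (a) => a === 2), // { a: 2, b: "cd" }
  highRange: documentsScanned(nineValues, (a) => a > 5),      // { a: { $gt: 5 }, b: "tv" }
};
console.log(scanned);
```

The four counts come out as 3, 6, 1, and 4, which are the figures quoted in the example. On a real deployment, use the explain() output rather than a model like this.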
For a conceptual introduction to indexes in MongoDB see Index Concepts (page 8).
4 Indexing Reference
4.1 Indexing Methods in the mongo Shell
Name                            Description
db.collection.createIndex()     Builds an index on a collection. Use db.collection.ensureIndex().
db.collection.dropIndex()       Removes a specified index on a collection.
db.collection.dropIndexes()     Removes all indexes on a collection.
db.collection.ensureIndex()     Creates an index if it does not currently exist. If the index exists, ensureIndex() does nothing.
db.collection.getIndexes()      Returns an array of documents that describe the existing indexes on a collection.
db.collection.getIndexStats()   Renders a human-readable view of the data collected by indexStats which reflects B-tree utilization.
db.collection.indexStats()      Renders a human-readable view of the data collected by indexStats which reflects B-tree utilization.
db.collection.reIndex()         Rebuilds all existing indexes on a collection.
db.collection.totalIndexSize()  Reports the total size used by the indexes on a collection. Provides a wrapper around the totalIndexSize field of the collStats output.
cursor.explain()                Reports on the query execution plan, including index use, for a cursor.
cursor.hint()                   Forces MongoDB to use a specific index for a query.
cursor.max()                    Specifies an exclusive upper index bound for a cursor. For use with cursor.hint().
cursor.min()                    Specifies an inclusive lower index bound for a cursor. For use with cursor.hint().
cursor.snapshot()               Forces the cursor to use the index on the _id field. Ensures that the cursor returns each document, with regards to the value of the _id field, only once.
4.2 Indexing Database Commands
Name                Description
createIndexes       Builds one or more indexes for a collection.
dropIndexes         Removes indexes from a collection.
compact             Defragments a collection and rebuilds the indexes.
reIndex             Rebuilds all indexes on a collection.
validate            Internal command that scans a collection's data and indexes for correctness.
indexStats          Experimental command that collects and aggregates statistics on all indexes.
geoNear             Performs a geospatial query that returns the documents closest to a given point.
geoSearch           Performs a geospatial query that uses MongoDB's haystack index functionality.
geoWalk             An internal command to support geospatial queries.
checkShardingIndex  Internal command that validates index on shard key.
4.3 Geospatial Query Selectors
Name            Description
$geoWithin      Selects geometries within a bounding GeoJSON geometry.
$geoIntersects  Selects geometries that intersect with a GeoJSON geometry.
$near           Returns geospatial objects in proximity to a point.
$nearSphere     Returns geospatial objects in proximity to a point on a sphere.
4.4 Indexing Query Modifiers
Name        Description
$explain    Forces MongoDB to report on query execution plans. See explain().
$hint       Forces MongoDB to use a specific index. See hint().
$max        Specifies an exclusive upper limit for the index to use in a query. See max().
$min        Specifies an inclusive lower limit for the index to use in a query. See min().
$returnKey  Forces the cursor to only return fields included in the index.
$snapshot   Forces the query to use the index on the _id field. See snapshot().
4.5 Other Index References
Text Search Languages (page 67) Supported languages for text indexes (page 23) and $text query operations.
Text Search Languages
The text index (page 23), the $text operator, and the text command [14] support the following languages:
Changed in version 2.6: MongoDB introduces version 2 of the text search feature. With version 2, the text search feature
supports using the two-letter language codes defined in ISO 639-1. Version 1 of text search only supported the long
form of each language name.
da or danish
nl or dutch
en or english
fi or finnish
fr or french
de or german
hu or hungarian
it or italian
no or norwegian
pt or portuguese
ro or romanian
ru or russian
es or spanish
sv or swedish
tr or turkish
Note: If you specify a language value of "none", then the text search has no list of stop words, and the text search
does not stem or tokenize the search terms.
[14] The text command is deprecated in MongoDB 2.6.
Index
Symbols
_id, 10
_id index, 10
C
compound index, 11
G
geospatial queries, 48
exact, 48
I
index
_id, 10
background creation, 28
compound, 11, 34
create, 33, 34
create in background, 39
drop duplicates, 30, 35
duplicates, 30, 35
embedded fields, 10
hashed, 24, 36
list indexes, 42
measure use, 42
monitor index building, 41
multikey, 13
name, 30
options, 28
overview, 3
rebuild, 41
remove, 40
replica set, 37
sort order, 12
sparse, 26, 36
subdocuments, 11
TTL index, 25
unique, 26, 35
index types, 8
primary key, 10
R
replica set
index, 37
T
TTL index, 25