Module: Database Design & Implementation
Qualification: Advanced Certificate In Web Development
Introduction to MongoDB
By the end of this tutorial you will be able to explain what MongoDB is, use it from Python, and perform basic administration.
Contents
S. No. Topic Description Required / Optional
01 Introduction Required
03 MongoDB Installation Required
04 CRUD Operations Using MongoDB Required
05 Transactions in MongoDB Required
06 PyMongo Introduction Required
07 Authentication & User Management Required
08 Backup & Restore Required
What is MongoDB?
Document database.
Stores data in flexible, JSON-like documents.
Fields can vary from document to document.
Data structure can be changed over time.
Document model maps to the objects in your application code.
Ad hoc queries, indexing, and real time aggregation.
High availability, horizontal scaling, and geographic distribution.
Free and open-source, published under the GNU Affero General Public License (releases from October 16, 2018 onward use the Server Side Public License instead).
Install MongoDB
Download the MongoDB Community Server from
https://www.mongodb.com/download-center#community
Double-click the downloaded .msi file.
Click Next for each screen & complete the install
Uncheck Compass in the Selection while installing.
Download and Install Compass Separately
Installs MongoDB in C:\Program Files\MongoDB\Server\3.6\
Setting Up MongoDB
MongoDB requires a data directory to store all data.
Create the folder C:\data\db
To start MongoDB, run C:\Program Files\MongoDB\Server\3.6\bin\mongod.exe from an Administrator Command Prompt.
If MongoDB starts successfully, it waits for connections on port 27017.
Connect to MongoDB
Open MongoDB Compass Community from the Start menu.
Leave the defaults as they are and click Connect.
Setup & Start a Windows Service
Instead of starting from Command Prompt you can install
MongoDB as a Service
Create a folder C:\data\log
Create the file C:\Program Files\MongoDB\Server\3.6\mongod.cfg with the content below
systemLog:
  destination: file
  path: c:\data\log\mongod.log
storage:
  dbPath: c:\data\db
Create the above file outside the folder and copy it there.
Run the command "C:\Program Files\MongoDB\Server\3.6\bin\mongod.exe" --config "C:\Program Files\MongoDB\Server\3.6\mongod.cfg" --install
Start the MongoDB service from Services panel.
Document Database
A record in MongoDB is a document.
Data structure composed of field and value pairs.
MongoDB documents are similar to JSON objects.
The values of fields may include other documents, arrays, and
arrays of documents.
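As a rough illustration, the structure described above can be written as a Python dictionary, which is also how PyMongo represents documents. The field names and values below (Address, Tags, Orders) are invented for this sketch and are not part of the tutorial's CRM data.

```python
# A hypothetical customer document: field/value pairs whose values may be
# scalars, embedded documents (dicts), arrays (lists) or arrays of documents.
customer = {
    "FirstName": "Shrinivas",
    "Address": {"City": "Singapore", "PostalCode": "123456"},  # embedded document
    "Tags": ["trial", "newsletter"],                           # array
    "Orders": [                                                # array of documents
        {"OrderNo": 1, "Total": 25.0},
        {"OrderNo": 2, "Total": 40.5},
    ],
}

# Nested values are reached by walking the structure.
city = customer["Address"]["City"]
order_total = sum(order["Total"] for order in customer["Orders"])
print(city, order_total)
```

Because the orders live inside the customer document, reading them needs no second lookup, which is the point made about joins in the next slide.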
Advantages of Storing as Documents
Correspond to native data types in many programming languages.
Embedded documents and arrays reduce need for expensive joins.
Dynamic schema supports fluent polymorphism.
MongoDB Databases & Collections
Databases hold collections of documents.
Data records are stored in collections.
Collections are stored in databases.
Collections are analogous to tables in relational databases.
_id Field
Each document stored in a collection requires a unique _id field.
The _id field acts as a primary key.
If an inserted document omits the _id field, the MongoDB driver automatically generates an ObjectId for it.
The _id field is always the first field in the document.
Opening Mongo Shell
Open a Command Prompt
Change directory to C:\Program Files\MongoDB\Server\3.6\bin
Run mongo
Database Example
Create a Database for storing Customer Information
Customer Information consist of
First Name
Last Name
Email
Password
Example JSON
{
  FirstName: "Shrinivas",
  LastName: "K R",
  Email: "shrinivas@lithan.com",
  Password: "test"
}
Creating Database & Collections
Create the database CRM
Create a Collection Customer to store Customer information in the
form of documents
Run the following command in mongo shell
use CRM
db.createCollection("Customer", { size: 2147483648 } )
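The size option sets a maximum size in bytes (here 2 GB), but MongoDB applies it only to capped collections. Without capped: true it is ignored, so this command creates a regular Customer collection.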
Insert Single Document
Use db.<collection name>.insert() to insert a single document
db.Customer.insert(
{
FirstName: "Raymond",
LastName: "Chang",
Email: "raymond@lithan.com",
Password: "test"
}
)
Insert Multiple Documents
Insert two documents into the collection
Use insertMany to insert both documents in one call
db.Customer.insertMany(
[
{ FirstName: "Jeyashree", LastName: "Rajkumar", Email: "jeyashree@lithan.com", Password: "test" },
{ FirstName: "Shrinivas", LastName: "K R", Email: "shrinivas@lithan.com", Password: "test" }
]
)
Query All Documents
List all documents in a Collection
db.Customer.find( {} )
This is similar to select * from tablename in MySQL
Find Document by Condition
Find documents where Email address is shrinivas@lithan.com
db.Customer.find( { Email: "shrinivas@lithan.com" } )
Update a Document
Update the Password of a Customer with
Email=shrinivas@lithan.com to 123
db.Customer.updateOne(
{ "Email": "shrinivas@lithan.com" },
{
$set: { "Password": "123"}
}
)
Because this is updateOne, only the first document that matches the email is updated.
To update all matching documents, use updateMany.
Replace / Update Document(s)
Replace the entire content of a document except for the _id field
db.Customer.replaceOne(
{ Email: "raymond@lithan.com" },
{ FirstName: "Raymond", LastName: "Tan", Email: "raymond@lithan.com", Password: "test"}
)
replaceOne replaces the first document that matches the condition.
Use updateOne to change selected fields in a single document.
Use updateMany to change selected fields in multiple documents.
Delete Document(s)
Delete all documents in the collection
db.Customer.deleteMany({})
Delete a document based on condition
db.Customer.deleteOne({ Email : "raymond@lithan.com" })
Use deleteMany with a condition to delete all documents that match it.
Operators
Query and Projection Operators
Provide ways to locate data within the database.
Projection operators modify how data is presented.
Update Operators
Enable you to modify the data in your database or add additional data.
Aggregation Pipeline Stages
Available aggregation stages for Aggregation Pipeline.
Aggregation Pipeline Operators
Collection of operators available to define and manipulate documents in
pipeline stages.
Query Modifiers
Query modifiers determine the way that queries will be executed.
Refer to MongoDB documentation for whole list of Operators
https://docs.mongodb.com/manual/reference/operator/
Comparison Operators
Operator     Description
$eq          Equal to a specified value
$gt, $gte    Greater than; greater than or equal to
$lt, $lte    Less than; less than or equal to
$ne          Not equal to
$in, $nin    Matches any of the specified values; matches none of the specified values
Example :
db.inventory.find( { qty: { $eq: 20 } } )
db.inventory.find( { qty: { $gt: 20 } } )
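To make the meaning of these operators concrete, here is a minimal pure-Python sketch that evaluates them against dictionaries. It only mimics the semantics for simple scalar fields and is not how MongoDB evaluates queries; the names COMPARISONS and field_matches and the sample inventory are assumptions made for this example.

```python
import operator

# Illustrative mapping from comparison operators to Python checks.
# Covers simple scalar fields only; real MongoDB has extra rules
# (arrays, missing fields, type ordering) that are ignored here.
COMPARISONS = {
    "$eq": operator.eq,
    "$ne": operator.ne,
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$lt": operator.lt,
    "$lte": operator.le,
    "$in": lambda value, options: value in options,
    "$nin": lambda value, options: value not in options,
}

def field_matches(document, field, condition):
    """Return True if document[field] satisfies every operator in condition."""
    if field not in document:
        return False  # simplification for this sketch
    return all(COMPARISONS[op](document[field], operand)
               for op, operand in condition.items())

inventory = [{"item": "pen", "qty": 20}, {"item": "ink", "qty": 35}]
equal_20 = [d["item"] for d in inventory if field_matches(d, "qty", {"$eq": 20})]
above_20 = [d["item"] for d in inventory if field_matches(d, "qty", {"$gt": 20})]
print(equal_20, above_20)
```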
Logical Operators
Operator   Description
$and       Joins query clauses with a logical AND
$not       Inverts the effect of a query expression
$nor       Joins query clauses with a logical NOR
$or        Joins query clauses with a logical OR
Example :
db.inventory.find( { $and: [ { price: { $ne: 1.99 } }, { price: { $exists: true } } ] } )
db.inventory.find( {
$and : [
{ $or : [ { price : 0.99 }, { price : 1.99 } ] },
{ $or : [ { sale : true }, { qty : { $lt : 20 } } ] }
]
})
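The nested $and / $or query above can be read as ordinary boolean logic. The sketch below evaluates a small subset of the query syntax ($and, $or, $nor, $lt and plain equality) against Python dictionaries. The function name matches and the sample items are assumptions for illustration; MongoDB's real matcher handles far more cases.

```python
def matches(document, query):
    """Evaluate a tiny subset of query syntax against one document."""
    for key, value in query.items():
        if key == "$and":
            ok = all(matches(document, sub) for sub in value)
        elif key == "$or":
            ok = any(matches(document, sub) for sub in value)
        elif key == "$nor":
            ok = not any(matches(document, sub) for sub in value)
        elif isinstance(value, dict) and "$lt" in value:
            ok = key in document and document[key] < value["$lt"]
        else:
            ok = document.get(key) == value  # plain equality on a field
        if not ok:
            return False
    return True

# Same shape as the second shell query above.
query = {"$and": [
    {"$or": [{"price": 0.99}, {"price": 1.99}]},
    {"$or": [{"sale": True}, {"qty": {"$lt": 20}}]},
]}

items = [
    {"price": 0.99, "sale": True, "qty": 50},
    {"price": 1.99, "sale": False, "qty": 10},
    {"price": 1.99, "sale": False, "qty": 30},
    {"price": 2.99, "sale": True, "qty": 5},
]
results = [matches(item, query) for item in items]
print(results)
```

A document passes only when both $or groups are satisfied: the price must be one of the two listed values, and the item must be on sale or low in stock.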
Field Update Operators
Operator       Description
$currentDate   Sets the value of a field to the current date.
$inc           Increments the value of the field by the specified amount.
$min           Updates the field only if the specified value is less than the existing field value.
$max           Updates the field only if the specified value is greater than the existing field value.
$mul           Multiplies the value of the field by the specified amount.
$rename        Renames a field.
$set           Sets the value of a field in a document.
Example :
db.users.update(
{ _id: 1 },
{
$currentDate: {lastModified: true}
}
)
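The effect of several of these operators can be sketched in plain Python by applying an update specification to a dictionary. This is only an approximation of the documented behaviour for top-level fields; apply_update and the sample document are hypothetical names and data for this example, and $currentDate is left out to keep the result deterministic.

```python
def apply_update(document, update):
    """Apply a small subset of field update operators to a copy of document."""
    result = dict(document)
    for field, value in update.get("$set", {}).items():
        result[field] = value
    for field, amount in update.get("$inc", {}).items():
        result[field] = result.get(field, 0) + amount   # missing field starts at 0
    for field, factor in update.get("$mul", {}).items():
        result[field] = result.get(field, 0) * factor
    for field, value in update.get("$min", {}).items():
        result[field] = min(result.get(field, value), value)
    for field, value in update.get("$max", {}).items():
        result[field] = max(result.get(field, value), value)
    for old_name, new_name in update.get("$rename", {}).items():
        if old_name in result:
            result[new_name] = result.pop(old_name)
    return result

doc = {"_id": 1, "qty": 10, "price": 4.0, "lowScore": 200, "nmae": "pen"}
update = {
    "$inc": {"qty": 5},
    "$mul": {"price": 2},
    "$min": {"lowScore": 150},
    "$rename": {"nmae": "name"},
}
updated = apply_update(doc, update)
print(updated)
```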
Aggregation Pipeline Stages
$sort
Sorts all input documents and returns them to the pipeline in sorted order.
db.Customer.aggregate(
[
{ $sort : { Email : -1} }
]
)
$limit
Limits the number of documents passed to the next stage in the pipeline.
db.Customer.aggregate(
[ { $limit : 1 } ]
);
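For readers more familiar with Python, the two stages above behave roughly like sorted() followed by a slice. The sketch chains them the way a pipeline would; the in-memory customers list is an assumption standing in for the Customer collection.

```python
customers = [
    {"Email": "jeyashree@lithan.com"},
    {"Email": "shrinivas@lithan.com"},
    {"Email": "raymond@lithan.com"},
]

# { $sort : { Email : -1 } } orders by Email, descending.
sorted_desc = sorted(customers, key=lambda doc: doc["Email"], reverse=True)

# { $limit : 1 } passes only the first document to the next stage.
limited = sorted_desc[:1]
print(limited)
```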
$project
Passes along the documents with the requested fields to the next stage in the
pipeline.
Sample Document
{
"_id" : 1,
title: "abc123",
isbn: "0001122223334",
author: { last: "zzz", first: "aaa" },
copies: 5
}
Projection Operation
db.books.aggregate( [ { $project : { title : 1 , author : 1 } } ] )
Output
{ "_id" : 1, "title" : "abc123", "author" : { "last" : "zzz", "first" : "aaa" } }
Aggregation Pipeline Operators
$slice
Returns a subset of an array.
{ $slice: [ [ 1, 2, 3 ], 1, 1 ] } returns [ 2 ]
Sample Document
{ "_id" : 1, "name" : "dave123", favorites: [ "chocolate", "cake", "butter", "apples" ] }
{ "_id" : 2, "name" : "li", favorites: [ "apples", "pudding", "pie" ] }
{ "_id" : 3, "name" : "ahn", favorites: [ "pears", "pecans", "chocolate", "cherries" ] }
{ "_id" : 4, "name" : "ty", favorites: [ "ice cream" ] }
Slice Operation
db.users.aggregate([
{ $project: { name: 1, threeFavorites: { $slice: [ "$favorites", 3 ] } } }
])
Output
{ "_id" : 1, "name" : "dave123", "threeFavorites" : [ "chocolate", "cake", "butter" ] }
{ "_id" : 2, "name" : "li", "threeFavorites" : [ "apples", "pudding", "pie" ] }
{ "_id" : 3, "name" : "ahn", "threeFavorites" : [ "pears", "pecans", "chocolate" ] }
{ "_id" : 4, "name" : "ty", "threeFavorites" : [ "ice cream" ] }
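The two forms of $slice used above (array plus count, and array plus position plus count) map closely onto Python list slicing. The helper below is a sketch of that correspondence as I understand the documented behaviour; slice_array is a hypothetical name, not a MongoDB or PyMongo function.

```python
def slice_array(array, *args):
    """Mimic $slice: (array, n) or (array, position, n)."""
    if len(args) == 1:
        (n,) = args
        # Positive n: first n elements. Negative n: last |n| elements.
        return array[:n] if n >= 0 else array[n:]
    position, n = args
    # A negative position counts back from the end of the array.
    start = position if position >= 0 else max(len(array) + position, 0)
    return array[start:start + n]

favorites = ["chocolate", "cake", "butter", "apples"]
three_favorites = slice_array(favorites, 3)
middle = slice_array([1, 2, 3], 1, 1)
print(three_favorites, middle)
```

As in the output above, an array shorter than the requested count is returned whole rather than raising an error.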
Atomicity & Transactions
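A write operation is atomic at the level of a single document, even when it modifies embedded documents.
In MongoDB 3.6, operations that touch multiple documents are not atomic as a whole (multi-document transactions were introduced in MongoDB 4.0).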
Scenario
Transfer money from one account A to another B
Subtract the funds from A
Add the funds to B
In a relational database both updates can run inside a single transaction, but in MongoDB 3.6 a failure between the two writes can leave the accounts inconsistent.
Solution
Use a collection named accounts to store account information.
Use a collection named transactions to store information on the fund transfer transactions.
Transactions Example
Create Accounts
db.accounts.insert(
[
{ _id: "A", balance: 1000, pendingTransactions: [] },
{ _id: "B", balance: 1000, pendingTransactions: [] }
]
)
Create Transactions
Initialize the transactions collection
db.transactions.insert(
{ _id: 1, source: "A", destination: "B", value: 100, state: "initial",
lastModified: new Date() }
)
Step 1 – Get a Transaction
Get a transaction in the initial state
var t = db.transactions.findOne( { state: "initial" } )
Step 2 – Set State
Set the state to pending
db.transactions.update(
{ _id: t._id, state: "initial" },
{
$set: { state: "pending" },
$currentDate: { lastModified: true }
}
)
Step 3 - Apply Transactions
db.accounts.update(
{ _id: t.source, pendingTransactions: { $ne: t._id } },
{ $inc: { balance: -t.value }, $push: { pendingTransactions: t._id } }
)
db.accounts.update(
{ _id: t.destination, pendingTransactions: { $ne: t._id } },
{ $inc: { balance: t.value }, $push: { pendingTransactions: t._id } }
)
Step 4 – Update Transaction
Update transaction state to applied
db.transactions.update(
{ _id: t._id, state: "pending" },
{
$set: { state: "applied" },
$currentDate: { lastModified: true }
}
)
Step 5 - Update Pending Transactions in Both Accounts
db.accounts.update(
{ _id: t.source, pendingTransactions: t._id },
{ $pull: { pendingTransactions: t._id } }
)
db.accounts.update(
{ _id: t.destination, pendingTransactions: t._id },
{ $pull: { pendingTransactions: t._id } }
)
Step 6 – Update Transaction State
Update transaction state to done
db.transactions.update(
{ _id: t._id, state: "applied" },
{
$set: { state: "done" },
$currentDate: { lastModified: true }
}
)
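Steps 1 to 6 form a small state machine: initial, pending, applied, done. The following sketch simulates that flow on in-memory Python dictionaries so the ordering and the idempotent guards can be followed without a server. It is an illustration only; the function names are invented here, and a real implementation would issue the shell updates shown above.

```python
accounts = {
    "A": {"balance": 1000, "pendingTransactions": []},
    "B": {"balance": 1000, "pendingTransactions": []},
}
transaction = {"_id": 1, "source": "A", "destination": "B",
               "value": 100, "state": "initial"}

def apply_to_account(account, txn_id, amount):
    """Step 3: adjust the balance once; skip if this transaction is already pending."""
    if txn_id not in account["pendingTransactions"]:   # mirrors the $ne guard
        account["balance"] += amount                    # mirrors $inc
        account["pendingTransactions"].append(txn_id)   # mirrors $push

def clear_pending(account, txn_id):
    """Step 5: remove the transaction id from the pending list."""
    if txn_id in account["pendingTransactions"]:
        account["pendingTransactions"].remove(txn_id)   # mirrors $pull

def run_transfer(txn, accounts):
    """Drive a transaction from its current state to done."""
    if txn["state"] == "initial":
        txn["state"] = "pending"                        # step 2
    if txn["state"] == "pending":
        apply_to_account(accounts[txn["source"]], txn["_id"], -txn["value"])
        apply_to_account(accounts[txn["destination"]], txn["_id"], txn["value"])
        txn["state"] = "applied"                        # step 4
    if txn["state"] == "applied":
        clear_pending(accounts[txn["source"]], txn["_id"])
        clear_pending(accounts[txn["destination"]], txn["_id"])
        txn["state"] = "done"                           # step 6

run_transfer(transaction, accounts)
print(transaction["state"], accounts["A"]["balance"], accounts["B"]["balance"])
```

Because every step checks the current state or the pending list first, calling run_transfer again on the same transaction changes nothing, which is what makes resuming after a crash safe.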
Recovering from Failure
The two-phase commit pattern makes it possible to recover interrupted transactions.
Run the recovery operation at application startup and at regular intervals.
Transactions in Pending State
Find transactions in pending state
If a transaction has not completed within 30 minutes, it needs to be recovered.
var dateThreshold = new Date();
dateThreshold.setMinutes(dateThreshold.getMinutes() - 30);
var t = db.transactions.findOne( { state: "pending", lastModified: { $lt: dateThreshold } } );
Continue the transaction from Step 3
Recovering Transactions
Transactions in Applied State
Find the transaction in applied state
var dateThreshold = new Date();
dateThreshold.setMinutes(dateThreshold.getMinutes() - 30);
var t = db.transactions.findOne( { state: "applied", lastModified: { $lt: dateThreshold } } );
Continue from Step 5
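In Python the same 30-minute threshold can be computed with datetime and timedelta. The sketch below only builds the filter dictionary. Passing it to find_one on a transactions collection would be the PyMongo counterpart of the shell calls above. The function name stale_transaction_filter and the fixed timestamp are assumptions for this example.

```python
from datetime import datetime, timedelta, timezone

def stale_transaction_filter(state, max_age_minutes=30, now=None):
    """Build a query filter for transactions stuck in `state` for too long."""
    now = now or datetime.now(timezone.utc)
    threshold = now - timedelta(minutes=max_age_minutes)
    return {"state": state, "lastModified": {"$lt": threshold}}

# A fixed "now" keeps the example deterministic.
fixed_now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
pending_filter = stale_transaction_filter("pending", now=fixed_now)
applied_filter = stale_transaction_filter("applied", now=fixed_now)
print(pending_filter)

# With a live connection (not run here), something like:
#   t = db["transactions"].find_one(pending_filter)
```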
Rolling Back Transactions
A transaction can be rolled back after the "Update transaction state to pending" step but before the "Update transaction state to applied" step.
Step 1 – Update Transaction state from pending to canceling
db.transactions.update(
{ _id: t._id, state: "pending" },
{
$set: { state: "canceling" },
$currentDate: { lastModified: true }
}
)
Step 2 - Undo the transactions in both accounts
db.accounts.update(
{ _id: t.destination, pendingTransactions: t._id },
{
$inc: { balance: -t.value },
$pull: { pendingTransactions: t._id }
}
)
db.accounts.update(
{ _id: t.source, pendingTransactions: t._id },
{
$inc: { balance: t.value},
$pull: { pendingTransactions: t._id }
}
)
Step 3 - Update the transaction state from canceling to cancelled.
db.transactions.update(
{ _id: t._id, state: "canceling" },
{
$set: { state: "cancelled" },
$currentDate: { lastModified: true }
}
)
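The three rollback steps can be simulated in the same in-memory style. The sketch assumes a transaction that reached the pending state and had its amounts applied to both accounts. roll_back is a hypothetical helper, not part of any MongoDB API.

```python
def roll_back(txn, accounts):
    """Undo a pending transaction: pending -> canceling -> cancelled."""
    if txn["state"] == "pending":
        txn["state"] = "canceling"                      # step 1
    if txn["state"] == "canceling":
        # Step 2: reverse the amounts, but only where the transaction was applied.
        reversals = ((txn["destination"], -txn["value"]),
                     (txn["source"], txn["value"]))
        for account_id, amount in reversals:
            account = accounts[account_id]
            if txn["_id"] in account["pendingTransactions"]:
                account["balance"] += amount
                account["pendingTransactions"].remove(txn["_id"])
        txn["state"] = "cancelled"                      # step 3

# State after "Apply Transactions" ran but before the state became applied.
accounts = {
    "A": {"balance": 900, "pendingTransactions": [1]},
    "B": {"balance": 1100, "pendingTransactions": [1]},
}
txn = {"_id": 1, "source": "A", "destination": "B", "value": 100, "state": "pending"}

roll_back(txn, accounts)
print(txn["state"], accounts["A"]["balance"], accounts["B"]["balance"])
```

The membership check on pendingTransactions also covers the case where only one of the two accounts had been updated before the failure.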
PyMongo - Python with MongoDB
PyMongo is a client to connect and access features of MongoDB
from Python
Run pip to install
python -m pip install pymongo
Connecting to a Database & Get Collection
Sample Code
from pymongo import MongoClient
client = MongoClient('mongodb://localhost:27017/')
db = client['CRM']
customers = db['Customer']
from pymongo import MongoClient
Import the Library
client = MongoClient('mongodb://localhost:27017/')
Create a Mongo Client
db = client['CRM']
Get the CRM database
customers = db['Customer']
Get the Customer Collection
Inserting a Document
Data in MongoDB is represented (and stored) using JSON-style
documents.
PyMongo uses dictionaries to represent documents.
Sample Code
from pymongo import MongoClient
client = MongoClient('mongodb://localhost:27017/')
db = client['CRM']
customers = db['Customer']
customer = {
"FirstName": "From",
"LastName": "PyMongo",
"Email": "pymongo@lithan.com",
"Password": "test"
}
customer_id = customers.insert_one(customer).inserted_id
customers.insert_one(customer).inserted_id
insert_one inserts the customer; the inserted_id attribute of its result holds the _id of the created document
Printing One & All Customers
Sample Code (printall.py)
from pymongo import MongoClient
import pprint
client = MongoClient('mongodb://localhost:27017/')
db = client['CRM']
customers = db['Customer']
pprint.pprint(customers.find_one())
customers.find_one()
Returns 1 Record, which is printed by pprint.pprint
Printing all customer documents
for customer in customers.find():
    pprint.pprint(customer)
Counting Records
print(customers.count_documents({}))  # count() is deprecated since PyMongo 3.7 and removed in 4.0
Updating a Document
Sample Code
from pymongo import MongoClient
import pprint
client = MongoClient('mongodb://localhost:27017/')
db = client['CRM']
customers = db['Customer']
customers.update_one(
{"Email": "shrinivas@lithan.com"},
{"$set": {"Password": "password"}})
Use update_one to update the Password field based on the
condition
You can also use the following methods
update_many()
replace_one()
The legacy update() method is deprecated in PyMongo 3 and was removed in PyMongo 4.
Deleting a Document
Sample Code
from pymongo import MongoClient
import pprint
client = MongoClient('mongodb://localhost:27017/')
db = client['CRM']
customers = db['Customer']
customers.delete_one({"Email": "pymongo@lithan.com"})
Use delete_one method to delete the 1st record which matches
the condition
You can also use delete_many to delete multiple records
Indexes
Support the efficient execution of queries in MongoDB.
Without indexes, MongoDB must perform a collection scan.
Indexes are special data structures.
An index stores the value of a specific field or set of fields, ordered by the value of the field.
MongoDB creates a unique index on the _id field during the creation of a collection.
MongoDB supports single-field and multikey indexes.
Sample Code
db.collection.createIndex( { name: -1 } )
Creates an Index on name field with descending order
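To see why an ordered structure helps, the toy example below contrasts a full scan with a lookup in a sorted list using Python's bisect module. It only illustrates the idea of "values kept in order"; MongoDB's actual index implementation is different, and every name and value here is made up for the sketch.

```python
import bisect

# Values of the "name" field; the list position stands in for the document.
names = ["ty", "li", "dave123", "ahn"]

# A toy index: (value, position) pairs kept in sorted order.
index = sorted((name, pos) for pos, name in enumerate(names))
keys = [key for key, _ in index]

def find_with_scan(value):
    """Collection scan: look at every document."""
    return [pos for pos, name in enumerate(names) if name == value]

def find_with_index(value):
    """Index lookup: binary search in the ordered keys."""
    lo = bisect.bisect_left(keys, value)
    hi = bisect.bisect_right(keys, value)
    return [pos for _, pos in index[lo:hi]]

print(find_with_scan("li"), find_with_index("li"))
```

Both functions return the same positions; the difference is that the scan touches every entry while the binary search touches only a logarithmic number of keys.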
Enabling Authentication
By default, anyone who can reach the server's host name and port can connect to a MongoDB server.
Enabling access control on a MongoDB deployment enforces
authentication.
Users can only perform actions as determined by their roles.
Connect to MongoDB server using mongo shell
use admin
db.createUser(
{
user: "admin",
pwd: "abc123",
roles: [ { role: "userAdminAnyDatabase", db: "admin" } ]
}
)
Create a User for CRM Database
In the mongo shell create a user with your name and a password
With Read & Write Access
use admin
db.createUser(
{
user: "shrini",
pwd: "test123",
roles: [ { role: "readWrite", db: "CRM" }]
}
)
Updating Config File
When using a Windows Service, update the file C:\Program Files\MongoDB\Server\3.6\mongod.cfg as below
systemLog:
  destination: file
  path: c:\data\log\mongod.log
storage:
  dbPath: c:\data\db
security:
  authorization: enabled
Restart the server from Services for authorization to take effect
If starting MongoDB server from command line, use the below
mongod --auth --port 27017 --dbpath /data/db
Connecting using Authentication
To Start Mongo Shell with authentication, use
mongo --port 27017 -u "shrini" -p "test123" --authenticationDatabase "admin"
For Python use the below URI
client = MongoClient('mongodb://shrini:test123@localhost:27017')
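If the user name or password contains characters such as @, : or /, they need to be percent-escaped before being placed in the URI. Below is a small sketch using urllib.parse.quote_plus from the standard library. The helper name build_mongo_uri and the second password are invented for the example, and authSource=admin is spelled out to match the --authenticationDatabase "admin" option used with the shell.

```python
from urllib.parse import quote_plus

def build_mongo_uri(user, password, host="localhost", port=27017, auth_db="admin"):
    """Build a MongoDB connection URI with escaped credentials."""
    return "mongodb://%s:%s@%s:%d/?authSource=%s" % (
        quote_plus(user), quote_plus(password), host, port, quote_plus(auth_db))

uri = build_mongo_uri("shrini", "test123")
tricky_uri = build_mongo_uri("shrini", "p@ss/word")  # hypothetical password
print(uri)
print(tricky_uri)

# With PyMongo installed and a server running (not executed here):
#   from pymongo import MongoClient
#   client = MongoClient(uri)
```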
Backup & Restore
Develop a Strategy to Backup & Restore in case of Hardware &
Software Failure
Use mongodump & mongorestore in C:\Program Files\MongoDB\Server\3.6\bin
Command to backup
mongodump /authenticationDatabase:admin /username:shrini /password:test123 /db:CRM /out:/data/1
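For scheduled backups, the same command can be assembled from Python and handed to subprocess. The sketch below only builds the argument list and does not run anything. It uses mongodump's double-dash long options rather than the slash style shown above, and the function name mongodump_command is an assumption for this example.

```python
def mongodump_command(db, out_dir, username, password, auth_db="admin"):
    """Return the mongodump argument list for backing up one database."""
    return [
        "mongodump",
        "--authenticationDatabase", auth_db,
        "--username", username,
        "--password", password,
        "--db", db,
        "--out", out_dir,
    ]

command = mongodump_command("CRM", "/data/1", "shrini", "test123")
print(" ".join(command))

# To execute it (requires mongodump on the PATH and a running server):
#   import subprocess
#   subprocess.run(command, check=True)
```

Passing the arguments as a list avoids shell quoting problems when paths or passwords contain spaces.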
Command to restore from backup