An Introduction to
MongoDB
Anuj Jain
Equal Experts India
MongoDB
NoSQL

Key-value

Graph database

Document-oriented

Column family
The Great Divide
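Not an RDBMS
• MongoDB is not a relational database like MySQL or Oracle.
• No multi-document transactions (added only later, in MongoDB 4.0).
• No referential integrity.
• No joins in the relational sense (the $lookup aggregation stage, added in MongoDB 3.2, offers only a left outer join).
• No fixed schema, so no columns or rows.
• NoSQL.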
What is MongoDB?
• A scalable, high-performance, open-source,
document-oriented database.
• Built for speed.
• Rich document-based queries for easy readability.
• Full index support for high performance.
• Replication and failover for high availability.
• Auto-sharding for easy scalability.
• Map/Reduce for aggregation.
Quiz ?
Which of the following statements are true about MongoDB?
1. MongoDB is document oriented.
2. MongoDB supports Joins.
3. MongoDB has dynamic schema.
4. MongoDB supports SQL.
What is MongoDB great for?
• Semi-structured content management.
• Real-time analytics and high-speed logging.
• Caching and high availability.
• Mobile and social infrastructure, big data, etc.
Some considerations while designing
a schema in MongoDB
• Combine objects into one document if you will use them together.
Otherwise, separate them (but make sure joins will not be
needed).
• Duplicate data (within limits), because disk space is cheap
compared to compute time.
• Do joins at write time, not at read time.
• Optimize your schema for the most frequent use cases.
• Do complex aggregation in the schema.
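The advice to join at write time can be sketched in plain JavaScript, without a database. The names users, post, and addComment are hypothetical and exist only for this illustration; a real application would send an update to a collection rather than mutate an in-memory object.

```javascript
// Hypothetical in-memory data standing in for two collections.
const users = { u1: { name: "MongoUser" } };
const post = { title: "MongoDB Tutorial for beginner", likes: 200, comments: [] };

// "Join" at write time: copy the user's name into the embedded comment,
// so reading the post later needs no second lookup.
function addComment(targetPost, userId, message, createdAt = Date.now()) {
  const user = users[userId];
  if (!user) throw new Error("unknown user: " + userId);
  targetPost.comments.push({
    user: user.name, // duplicated (denormalized) on purpose
    message: message,
    dateCreated: createdAt,
    like: false,
  });
  return targetPost;
}

addComment(post, "u1", "Very Nice Tutorial", 1437725960469);
console.log(JSON.stringify(post.comments[0]));
```

The trade-off is the one the list names: the user's name is stored twice, and in exchange a read of the post needs only one document.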
Example – Blog Post
• Every post has a unique title, description, and URL.
• Every post can have one or more tags.
• Every post has the name of its publisher and the total number of
likes.
• Every post has comments given by users, along with their name,
message, date-time, and likes.
• Each post can have zero or more comments.
RDBMS
Mongo Schema
{
  "_id" : ObjectId("55b1f50899708bec87f96edc"),
  "title" : "MongoDB Tutorial for beginner",
  "description" : "How to start using mongodb",
  "by" : "Anuj Jain",
  "url" : "http://mongodbtutorial.com/blog/mongodb",
  "tags" : ["mongodb", "nosql"],
  "likes" : 200,
  "comments" : [
    {
      "user" : "MongoUser",
      "message" : "Very Nice Tutorial",
      "dateCreated" : NumberLong(1437725960469),
      "like" : true
    }
  ]
}
Quiz ?
How many different data types are there in JSON ?
1. 4
2. 5
3. 6
4. 7
Answer
Ans: 6
1. String
2. Number
3. Boolean
4. null
5. Array
6. Object/document
CRUD
Create
db.collection.insert( <document> )
db.collection.save( <document> )
db.collection.update( <query>, <update>, { upsert: true } )
Read
db.collection.find( <query>, <projection> )
db.collection.findOne( <query>, <projection> )
Update
db.collection.update( <query>, <update>, <options> )
Delete
db.collection.remove( <query>, <justOne> )
Some Other Operators
1. $and
2. $or
3. $in
4. $nin
5. $exists
6. $push
7. $pop
8. $addToSet
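The three array update operators at the end of this list differ only in small ways, which a plain-JavaScript sketch can make concrete. The functions below are illustrative stand-ins, not MongoDB APIs; they model the operators on an ordinary array of primitive values.

```javascript
// Each function returns a new array and leaves its input untouched.

// $push: always append the value.
function push(arr, value) {
  return [...arr, value];
}

// $addToSet: append only if the value is not already present.
// (Uses strict equality, so this sketch only suits primitive values.)
function addToSet(arr, value) {
  return arr.includes(value) ? [...arr] : [...arr, value];
}

// $pop: direction 1 removes the last element, -1 removes the first.
function pop(arr, direction) {
  return direction === -1 ? arr.slice(1) : arr.slice(0, -1);
}

const tags = ["mongodb", "nosql"];
console.log(push(tags, "nosql"));     // duplicate is kept
console.log(addToSet(tags, "nosql")); // duplicate is ignored
console.log(pop(tags, 1));            // drops "nosql"
console.log(pop(tags, -1));           // drops "mongodb"
```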
Indexes
1. Indexes are special data structures that store a small portion of
the data set in an easy-to-traverse form.
2. An index stores the value of a specific field or set of fields.
3. Entries are ordered by the value of the field, as specified in the index.
4. Indexes can speed up read operations but slow down write
operations.
5. MongoDB uses a B-tree data structure to store indexes.
6. Building an index blocks the mongod process unless you specify the background option.
7. null is assumed if the field is missing.
When To Index ?
1. Frequently Queried Fields
2. Low response time
3. Sorting
4. Avoid full collection scans.
Index Types
1. Default (_id)
2. Single Field
3. Compound Index
4. Multikey Index
5. Geospatial Index
6. Sparse Index
7. TTL Index
Quiz ?
A MongoDB index can have keys of different types
(ints, dates, strings, for example) in it?
1. True
2. False
Covered Indexes
Queries that can be resolved using only the index, without needing to
fetch the original document.
Example: { "name": "Anuj",
  "age": 28,
  "gender": "Male",
  "skills": ["Java", "Mongo"]
}
db.people.createIndex({"name": 1, "age": 1})
db.people.find({"name": "Anuj"}, {"_id": 0, "age": 1})
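The covering rule can be checked by hand with a small sketch. The helper isCovered is hypothetical (MongoDB exposes this information through explain, not through such a function), and it assumes simple top-level field names, an inclusion-style projection, and no array (multikey) fields.

```javascript
// Returns true when every queried field and every returned field
// is part of the index, so no document has to be fetched.
function isCovered(indexKeys, query, projection) {
  const indexed = new Set(Object.keys(indexKeys));
  const included = Object.keys(projection).filter((f) => projection[f] === 1);
  // No inclusion list means the whole document is returned: never covered.
  if (included.length === 0) return false;
  // _id comes back by default unless the projection sets it to 0.
  if (projection._id !== 0 && !included.includes("_id")) included.push("_id");
  const needed = [...Object.keys(query), ...included];
  return needed.every((field) => indexed.has(field));
}

const index = { name: 1, age: 1 };
console.log(isCovered(index, { name: "Anuj" }, { _id: 0, age: 1 }));    // true
console.log(isCovered(index, { name: "Anuj" }, { age: 1 }));            // false: _id is not in the index
console.log(isCovered(index, { name: "Anuj" }, { _id: 0, skills: 1 })); // false: skills is not in the index
```

This is why the example projection sets "_id" to 0: leaving it out would pull _id from the document and the query would no longer be covered.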
TTL Indexes ("time to live")
1. mongod automatically removes documents from the collection after the specified
number of seconds.
2. The indexed field should be either a BSON date or an array of BSON
date objects.
E.g. db.log_events.createIndex( { "createdAt": 1 }, { expireAfterSeconds: 3600 } )
where createdAt is a date field.
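The expiry rule can be sketched as a small predicate. The function isExpired is a hypothetical illustration, not part of MongoDB; it assumes the documented behaviour that the earliest date in an array decides and that a document without a date value never expires. The real deletion is done by a background task that runs roughly once a minute, so documents may outlive their expiry time slightly.

```javascript
// A document becomes eligible for removal once
// (earliest date in the field) + expireAfterSeconds has passed.
function isExpired(doc, field, expireAfterSeconds, now = new Date()) {
  const value = doc[field];
  const dates = (Array.isArray(value) ? value : [value]).filter((d) => d instanceof Date);
  if (dates.length === 0) return false; // no date value: the document never expires
  const earliest = Math.min(...dates.map((d) => d.getTime()));
  return earliest + expireAfterSeconds * 1000 <= now.getTime();
}

const checkTime = new Date("2015-07-24T10:00:00Z");
const oldEvent = { createdAt: new Date("2015-07-24T08:30:00Z") };
const newEvent = { createdAt: new Date("2015-07-24T09:30:00Z") };
console.log(isExpired(oldEvent, "createdAt", 3600, checkTime)); // true
console.log(isExpired(newEvent, "createdAt", 3600, checkTime)); // false
```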
Quiz ?
Suppose we run:
db.foo.createIndex({a: 1, b: 1, c: 1})
db.foo.find({a: "sports", b: {$gt: 100}})
Then
1. Only the index needs to be touched to fully execute the
query.
2. The index and some documents need to be examined.
Why Replication?
• To keep your data safe.
• High (24x7) availability of data.
• Disaster Recovery.
• Read scaling (extra copies to read from).
• No downtime for maintenance (like backups, index
rebuilds, compaction).
Replica set features
• A cluster of N nodes.
• Any one node can be primary.
• All write operations go to the primary.
• Automatic failover.
• Automatic recovery.
• Consensus election of the primary.
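Consensus election means a primary needs the votes of a strict majority of the voting members. A minimal sketch of the arithmetic follows; the function names are made up for this illustration.

```javascript
// Votes needed to elect a primary among n voting members: a strict majority.
function majority(n) {
  return Math.floor(n / 2) + 1;
}

// How many members can be lost while a primary can still be elected.
function faultTolerance(n) {
  return n - majority(n);
}

for (const n of [3, 4, 5]) {
  console.log(`members=${n} majority=${majority(n)} tolerated failures=${faultTolerance(n)}`);
}
```

A 4-node set tolerates no more failures than a 3-node set, which is why replica sets are usually sized with an odd number of voting members.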
Capped Collection
• Fixed-size circular queues that preserve insertion order.
• The fixed size is preallocated; when it is exhausted, the oldest
documents are automatically deleted to make room.
• We cannot delete documents from a capped collection.
• Before MongoDB 2.2, a capped collection had no default indexes, not
even on the _id field; from 2.2 onward an _id index is created by default.
Capped Collection Commands:
1. db.createCollection("cappedcollection", {capped: true, size: 10000})
2. db.createCollection("cappedcollection", {capped: true, size: 10000, max: 1000})
3. db.cappedLogCollection.isCapped()
4. db.runCommand({"convertToCapped": "posts", size: 10000})
5. db.cappedLogCollection.find().sort({$natural: -1})
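The circular-queue behaviour behind these commands can be modelled in a few lines. CappedList is a hypothetical in-memory class, not a MongoDB API, and it models only the max (document count) limit; a real capped collection is also bounded by size in bytes.

```javascript
// In-memory model of a capped collection limited by document count.
class CappedList {
  constructor(max) {
    if (!Number.isInteger(max) || max < 1) {
      throw new RangeError("max must be a positive integer");
    }
    this.max = max;
    this.docs = [];
  }

  // Insert a document; once the cap is exceeded, evict the oldest one.
  insert(doc) {
    this.docs.push(doc);
    if (this.docs.length > this.max) this.docs.shift();
  }

  // Documents in insertion order (the natural order).
  find() {
    return [...this.docs];
  }

  // Newest first, comparable to sort({$natural: -1}).
  findReverse() {
    return [...this.docs].reverse();
  }
}

const log = new CappedList(3);
for (const n of [1, 2, 3, 4, 5]) log.insert({ n });
console.log(log.find());        // the three newest documents, oldest first
console.log(log.findReverse()); // the same documents, newest first
```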
Query Limitations:
Indexes cannot be used, or are used inefficiently, by queries that use:
1. Regular expressions or negation operators like $nin, $not, etc.
2. Arithmetic operators like $mod, etc.
3. The $where clause.
Hence, it is always advisable to check the index usage of your
queries.
Index Limitation
Maximum Ranges:
• A collection cannot have more than 64 indexes.
• The length of the index name cannot be longer than 125
characters.
• A compound index can have at most 31 indexed fields.
$explain
The $explain operator provides information and statistics on the
query, for example:
1. The indexes used by the query.
2. The number of documents scanned in serving the query.
3. Whether the index alone is enough to serve the query, i.e. whether it is a covered
query.
Usage :
db.users.find({gender:"M"}, {user_name:1,_id:0} ).explain()
$hint
The $hint operator forces the query optimizer to use the specified
index to run a query:
db.users.find({gender:"M"},{user_name:1,_id:0})
  .hint({gender:1,user_name:1})
Backup & Restore
Backup Utilities
1. mongodump (used to dump the complete data directory or a database)
2. mongoexport (used to dump a certain collection to an output file such as
JSON or CSV).
Restore Utilities
1. mongorestore
2. mongoimport
ObjectId
An ObjectId is a 12-byte BSON type with the following
structure:
The first 4 bytes represent the seconds since the Unix epoch.
The next 3 bytes are the machine identifier.
The next 2 bytes are the process id.
The last 3 bytes are a counter, starting from a random value.
(Newer MongoDB versions replace the machine identifier and process id with a 5-byte random value.)
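The byte layout can be verified by slicing the 24-character hex form of an ObjectId. The sketch below uses the _id from the blog post example and follows the 4 + 3 + 2 + 3 layout described on this slide; parseObjectId is a hypothetical helper, not a driver function.

```javascript
// Split a 24-hex-character ObjectId string into the fields of the
// legacy layout: 4-byte timestamp, 3-byte machine, 2-byte pid, 3-byte counter.
function parseObjectId(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error("expected 24 hex characters");
  }
  const seconds = parseInt(hex.slice(0, 8), 16);
  return {
    seconds: seconds,                       // seconds since the Unix epoch
    timestamp: new Date(seconds * 1000),    // same value as a Date
    machine: hex.slice(8, 14),              // 3 bytes, kept as hex
    pid: parseInt(hex.slice(14, 18), 16),   // 2 bytes
    counter: parseInt(hex.slice(18, 24), 16), // 3 bytes
  };
}

const parts = parseObjectId("55b1f50899708bec87f96edc");
console.log(parts.seconds);
console.log(parts.timestamp.toISOString());
```

The decoded seconds value lines up with the dateCreated field of the example comment (1437725960469 milliseconds), which shows that creation time is already embedded in the _id.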
Thank you
References:
https://www.mongodb.org/
