An introduction to MongoDB
Sathish Ravikumar
Agenda
• SQL vs NoSQL
• Introduction to MongoDB
• MongoDB Features
• Replication / High Availability
• Sharding / Scaling
SQL vs NoSQL
A NoSQL (often interpreted as "Not only SQL") database provides a
mechanism for storing and retrieving data that is modeled by means
other than the tabular relations used in relational databases.
SQL                                            | NoSQL
Relational database management system (RDBMS)  | Non-relational or distributed database system
Fixed (static, predefined) schema              | Dynamic schema
Well suited to complex queries                 | Less well suited to complex queries
Vertically scalable                            | Horizontally scalable
Follows ACID properties                        | Follows BASE properties
NoSQL Types
Graph database
Document-oriented
Column family
What is MongoDB?
MongoDB is an open-source, document-oriented database designed with both
scalability and developer agility in mind.
Instead of storing your data in tables and rows as you would with a relational database,
in MongoDB you store JSON-like documents with dynamic schemas (schema-free,
schemaless).
{
  "_id" : ObjectId("5114e0bd42…"),
  "FirstName" : "John",
  "LastName" : "Doe",
  "Age" : 39,
  "Interests" : [ "Reading", "Mountain Biking" ],
  "Favorites" : {
    "color" : "Blue",
    "sport" : "Soccer"
  }
}
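To make the nesting concrete, here is a minimal sketch that models the example document as a plain Python dict and reads a nested field through a dotted path. It does not use a MongoDB driver; `get_path` is a hypothetical helper written for this illustration, and the `_id` field is left out because its value is elided on the slide.

```python
from functools import reduce

# The example document as a plain Python dict.
# "_id" is omitted because the ObjectId value is elided in the slide.
person = {
    "FirstName": "John",
    "LastName": "Doe",
    "Age": 39,
    "Interests": ["Reading", "Mountain Biking"],
    "Favorites": {"color": "Blue", "sport": "Soccer"},
}


def get_path(document, dotted_path):
    """Hypothetical helper: follow a dotted path such as 'Favorites.color'."""
    return reduce(lambda node, key: node[key], dotted_path.split("."), document)


print(get_path(person, "Favorites.color"))  # Blue
print(person["Interests"][1])               # Mountain Biking
```

Arrays and embedded documents live inside the same record, so related data can be read without a separate lookup.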
MongoDB is Easy to Use
Schema Free
MongoDB does not need any pre-defined data schema
Every document could have different data!
{name: "will", eyes: "blue", birthplace: "NY", aliases: ["bill", "ben"], loc: [32.7, 63.4], boss: "ben"}
{name: "jeff", eyes: "blue", loc: [40.7, 73.4], boss: "ben"}
{name: "brendan", boss: "will"}
{name: "matt", weight: 60, height: 72, loc: [44.6, 71.3]}
{name: "ben", age: 25}
RDBMS vs MongoDB
RDBMS      | MongoDB
Database   | Database
Table      | Collection
Row        | Document (JSON, BSON)
Column     | Field
Index      | Index
Join       | Embedded Document
Partition  | Shard
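As a rough illustration of a schema-free collection, the sketch below stores the five example documents in a Python list and filters them by field values. No MongoDB driver is involved; `find` is a hypothetical helper that only imitates the idea of matching documents against a criteria document, and documents lacking a field simply do not match.

```python
# A "collection" is modeled as a list; each document has its own set of fields.
people = [
    {"name": "will", "eyes": "blue", "birthplace": "NY",
     "aliases": ["bill", "ben"], "loc": [32.7, 63.4], "boss": "ben"},
    {"name": "jeff", "eyes": "blue", "loc": [40.7, 73.4], "boss": "ben"},
    {"name": "brendan", "boss": "will"},
    {"name": "matt", "weight": 60, "height": 72, "loc": [44.6, 71.3]},
    {"name": "ben", "age": 25},
]


def find(collection, criteria):
    """Hypothetical helper: return documents whose fields equal every criterion."""
    return [
        doc for doc in collection
        if all(key in doc and doc[key] == value for key, value in criteria.items())
    ]


names = [doc["name"] for doc in find(people, {"boss": "ben"})]
print(names)  # ['will', 'jeff']
```

No table definition had to be changed to store `weight` for one person and `aliases` for another.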
Features Of MongoDB
• Document-Oriented storage
• Full Index Support
• Replication & High Availability
• Auto-Sharding
• Aggregation
• MongoDB Atlas
• Various APIs
  • JavaScript, Python, Ruby, Perl, Java, Scala, C#, C++, Haskell, Erlang
• Community
Replication
• Replication provides redundancy and increases data
availability.
• With multiple copies of data on different database servers,
replication provides a level of fault tolerance against the loss of
a single database server.
[Figure: Replication, showing copies of the database on separate servers]
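The fault-tolerance idea can be sketched with a toy model in which every write is copied to all live members, so a read still succeeds after one server is lost. `ToyReplicaSet` and the server names are hypothetical, and the sketch leaves out everything a real replica set does to coordinate its members.

```python
class ToyReplicaSet:
    """Toy model: every write is copied to all live members."""

    def __init__(self, member_names):
        # Each member holds its own full copy of the data.
        self.members = {name: {} for name in member_names}
        self.down = set()

    def write(self, key, value):
        # Copy the write to every member that is still up.
        for name, data in self.members.items():
            if name not in self.down:
                data[key] = value

    def fail(self, name):
        # Simulate losing one database server.
        self.down.add(name)

    def read(self, key):
        # Serve the read from the first member that is still up.
        for name, data in self.members.items():
            if name not in self.down:
                return data[key]
        raise RuntimeError("no members available")


replica_set = ToyReplicaSet(["server-a", "server-b", "server-c"])
replica_set.write("user:1", {"name": "John"})
replica_set.fail("server-a")        # lose a single database server
print(replica_set.read("user:1"))   # {'name': 'John'}
```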
Sharding
• Sharding is a method for
distributing data across multiple
machines.
• MongoDB uses sharding to
support deployments with very
large data sets and high
throughput operations.
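One way to picture distributing data across machines is to hash a key and use the result to pick a shard. The sketch below uses a simple hash-and-modulo scheme chosen for illustration; it is not MongoDB's own placement logic, and the shard names and `shard_for` function are assumptions.

```python
import hashlib

SHARDS = ["shard-0", "shard-1", "shard-2"]  # hypothetical shard names


def shard_for(shard_key_value, shards=SHARDS):
    """Pick a shard by hashing the key value (stable across runs)."""
    digest = hashlib.md5(str(shard_key_value).encode("utf-8")).hexdigest()
    return shards[int(digest, 16) % len(shards)]


# Spread 1000 hypothetical user ids across the shards.
placement = {}
for user_id in range(1000):
    placement.setdefault(shard_for(user_id), []).append(user_id)

for shard, ids in sorted(placement.items()):
    print(shard, len(ids))
```

Each shard ends up holding only part of the data set, so reads and writes for different keys can be served by different machines.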
Sharding Architecture
• A shard is a MongoDB instance that
holds a subset of the original data.
• mongos is a query router that directs operations to the shards.
• A config server is a MongoDB instance
that stores the metadata and
configuration details of the cluster.
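The three roles above can be sketched in one toy class: metadata that maps key ranges to shards (the config server role), a lookup that picks the owning shard (the query router role), and per-shard storage (the shard role). `ToyCluster`, the range boundaries, and the shard names are all hypothetical, and real clusters run these roles as separate processes.

```python
import bisect


class ToyCluster:
    def __init__(self, boundaries, shard_names):
        # "Config server" role: metadata mapping key ranges to shards.
        # boundaries [100, 200] with 3 shards means: <100, 100..199, >=200.
        assert len(shard_names) == len(boundaries) + 1
        self.boundaries = boundaries
        self.shard_names = shard_names
        # "Shard" role: each shard stores only its subset of the data.
        self.shards = {name: {} for name in shard_names}

    def route(self, key):
        # "Query router" role: consult the metadata to find the owning shard.
        return self.shard_names[bisect.bisect_right(self.boundaries, key)]

    def insert(self, key, document):
        self.shards[self.route(key)][key] = document

    def find(self, key):
        return self.shards[self.route(key)].get(key)


cluster = ToyCluster([100, 200], ["shard-a", "shard-b", "shard-c"])
cluster.insert(42, {"name": "will"})
cluster.insert(150, {"name": "jeff"})
cluster.insert(200, {"name": "matt"})
print(cluster.route(150))  # shard-b
print(cluster.find(200))   # {'name': 'matt'}
```

The client only talks to the router; it never needs to know which shard holds a given key.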
Sharding/Replication
• Replication keeps copies of each
data set on multiple data nodes for
high availability.
• Sharding splits data across nodes,
scaling out or in horizontally when
required for high throughput.