2-Spring24 NoSQL Systems

Uploaded by

sabakabha

NoSQL Systems

RDBMS Databases

● Good for handling transactional workloads that involve small amounts of data with random read/write properties.
● ACID-compliant: they guarantee atomicity, consistency, isolation, and durability.
○ They are generally restricted to a single node.
○ They do not provide out-of-the-box redundancy and fault tolerance.
● To handle large volumes of data, RDBMSs employ vertical scaling, which is a more costly scaling strategy.
○ This makes RDBMSs less than ideal for long-term storage of data that accumulates over time.
● Relational databases need to be manually sharded, mostly using application logic.
○ This means that the application logic needs to know which shard to query in order to get the required data.
○ This further complicates data processing when data from multiple shards is required.
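A minimal sketch of what application-level shard routing can look like, assuming a hash-based scheme with three shards. The names `SHARDS`, `shard_for`, `put`, and `get` are hypothetical, and plain dictionaries stand in for separate database servers.

```python
import hashlib

# Hypothetical setup: three in-memory dicts stand in for three database servers.
SHARDS = [{}, {}, {}]

def shard_for(key: str) -> int:
    """Map a key to a shard index using a stable hash of the key."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % len(SHARDS)

def put(key: str, row: dict) -> None:
    """The application, not the database, decides where the row is written."""
    SHARDS[shard_for(key)][key] = row

def get(key: str):
    """The application must recompute the shard before it can read the row."""
    return SHARDS[shard_for(key)].get(key)

put("customer:1", {"name": "Ali"})
put("customer:2", {"name": "Zaid"})
print(get("customer:1"))
```

Every read and write has to pass through `shard_for`, which is the extra burden that manual sharding places on application code.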
● Application logic must then be used to join the data retrieved from multiple shards.
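As an illustration of joining in application code, the sketch below assumes customers live on one shard and orders on another. The data and the function `join_orders_with_customers` are hypothetical; lists of dictionaries stand in for the rows returned by each shard.

```python
# Hypothetical rows returned by two different shards.
customers_shard = [
    {"customer_id": 1, "name": "Ali"},
    {"customer_id": 2, "name": "Zaid"},
]
orders_shard = [
    {"order_id": 10, "customer_id": 1, "total": 25.0},
    {"order_id": 11, "customer_id": 2, "total": 40.0},
    {"order_id": 12, "customer_id": 1, "total": 15.0},
]

def join_orders_with_customers(customers, orders):
    """Hash join performed by the application instead of the database."""
    by_id = {c["customer_id"]: c for c in customers}
    joined = []
    for order in orders:
        customer = by_id.get(order["customer_id"])
        if customer is not None:
            # Attach the customer's name to each matching order.
            joined.append({**order, "name": customer["name"]})
    return joined

result = join_orders_with_customers(customers_shard, orders_shard)
print(result)
```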
● Relational databases generally require data to adhere to a schema.
○ Semi-structured and unstructured data are not directly supported.
● A traditional RDBMS is therefore generally not useful as the primary storage device in a Big Data solution environment.
Types of NoSQL Systems

1. Key-value Database

2. Document-oriented Database

3. Column-oriented Database

4. Graph Database
Key-value Database

● One of the simplest types of NoSQL database.
● Data is represented as a collection of <key,value> pairs.
● It works by storing buckets of <key,value> pairs in a logical way, so that all relevant data relating to an item is stored within that item.
● A key can have a dynamic set of attributes attached to it.
● Fast response time.
● Ability to store an enormous number of records with extremely low latency.
● Provides maintenance and failover services.
● Examples of this type of database include Redis, Riak, Amazon DynamoDB, and Voldemort.
DOCUMENT-ORIENTED DATABASE

● A document-oriented database extends the concept of a key-value database by employing flexible data structures.
● Records are stored as "documents".
● Documents can be nested and have complex structure, which allows subcategories of information to be defined.
● The data values in a key-value database are opaque to the store, whereas the data values in a document-oriented database are transparent to the store.
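The difference between opaque and transparent values can be sketched in plain Python. This is an illustration only: `kv_store`, `doc_store`, and `find_by_field` are hypothetical names, and dictionaries stand in for the two kinds of database.

```python
import json

# Key-value style: the store holds bytes it cannot interpret.
kv_store = {"user:1": json.dumps({"name": "Ali", "age": 25}).encode("utf-8")}

# Document style: the store understands the fields inside each value.
doc_store = {
    "user:1": {"name": "Ali", "age": 25},
    "user:2": {"name": "Zaid", "age": 31},
}

def find_by_field(store, field, value):
    """Query by content, which is only possible when values are transparent."""
    return [doc for doc in store.values() if doc.get(field) == value]

# With the key-value store, lookup is by key and decoding is the application's job.
raw = kv_store["user:1"]
decoded = json.loads(raw.decode("utf-8"))

# With the document store, the store itself can filter on a field.
matches = find_by_field(doc_store, "age", 31)
print(decoded, matches)
```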
DOCUMENT-ORIENTED DATABASE

● Strengths
○ Lower cost of scaling out compared to a SQL database.
○ Fields of documents can be indexed, which allows the user to query not only by the primary key but also by a document's contents.
○ Schemaless: the contents of a document can be defined freely.
● Limitations
○ Generally not suitable for business transaction applications.
○ Does not offer any referential integrity support.
○ Generally offers limited or no support for joins across collections (MongoDB, for example, provides only the $lookup aggregation stage).
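To make the indexing strength concrete, here is a minimal sketch of a secondary index over one document field. The documents, the field values, and `build_index` are hypothetical; real document stores maintain such indexes internally, typically with tree structures rather than a dictionary.

```python
from collections import defaultdict

# Hypothetical documents keyed by their primary key.
documents = {
    1: {"name": "Ali", "city": "AMMAN"},
    2: {"name": "Zaid", "city": "IRBID"},
    3: {"name": "Maryam", "city": "AMMAN"},
}

def build_index(docs, field):
    """Map each value of `field` to the primary keys of the documents holding it."""
    index = defaultdict(list)
    for doc_id, doc in docs.items():
        if field in doc:
            index[doc[field]].append(doc_id)
    return index

city_index = build_index(documents, "city")

# Query by content without scanning every document.
amman_ids = city_index["AMMAN"]
print([documents[i]["name"] for i in amman_ids])
```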
MongoDB

● Data Representation:
○ MongoDB is a document-style database.
○ A document is analogous to a row in an RDBMS.
○ A collection is a group of documents, analogous to a table in an RDBMS.
○ Documents are represented in a JSON-like format (stored internally as BSON, a binary form of JSON).
● Indexing and Sharding
○ Documents can be indexed on their fields for faster access and retrieval.
○ Sharding is the process of splitting a database across multiple machines.
○ MongoDB incorporates auto-sharding, through which a MongoDB cluster can split data and re-balance automatically.
MongoDB

● Automatic sharding benefits:
○ Automatic balancing of data.
○ Scaling out with minimal downtime, i.e., new hosts can be added.
○ Replication to avoid a single point of failure.
MongoDB

● A shard consists of one or more servers that contain the subset of data that the shard is responsible for.
● If a shard has more than one server, it may also contain replicated data.
Example
Mongo DB + Python

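# Install the driver (notebook shell command)
! python -m pip install pymongo==3.7.2

import pymongo
from pymongo import MongoClient

# Connect to a MongoDB server; with no arguments this targets localhost:27017
client = MongoClient()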

# Select the database "db1" (it is created when data is first written)
mydb = client["db1"]

# Create the collection explicitly (raises CollectionInvalid if it already exists)
mydb.create_collection('addressbook')

# Set the collection to work with
collection = mydb.addressbook

# Insert one document
collection.insert_one({'name': 'Ali'})

# Show the documents in the collection
list(collection.find())

# Insert a document with nested and array fields
data = {
    'name': "Ali",                    # String
    'age': 25,                        # Integer
    'gender': "M",                    # String
    'address': {
        'street': "ahmad tarawnwh",   # String
        'number': 77,                 # Integer
        'city': "AMMAN",              # String
        'floor': None,                # Null
        'postalcode': "11910",        # String containing a number
    },
    'favouriteFruits': ['banana', 'pineapple', 'orange']  # Array
}
collection.insert_one(data)

# All documents
list(collection.find())
# Documents whose name is "Ali"
list(collection.find({'name': "Ali"}))
# Projection: select only some fields (_id is returned unless excluded)
list(collection.find({}, {'name': 1, 'age': 1}))
# Projection: exclude some fields
list(collection.find({}, {'name': 0, 'age': 0}))
# Projection: select only some fields, including a nested one, and omit _id
list(collection.find({}, {'name': 1, 'address.city': 1, '_id': 0}))
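The projection rules used above can be emulated in plain Python to make them explicit. This is a simplified sketch, not MongoDB's implementation: the function `project` is hypothetical and handles top-level fields only (no dotted paths such as 'address.city').

```python
def project(doc, spec):
    """Emulate a MongoDB-style projection on top-level fields only."""
    include_id = spec.get("_id", 1) == 1
    fields = {k: v for k, v in spec.items() if k != "_id"}
    if fields and all(v == 1 for v in fields.values()):
        # Inclusion mode: keep only the listed fields.
        out = {k: doc[k] for k in fields if k in doc}
    elif all(v == 0 for v in fields.values()):
        # Exclusion mode: keep everything except the listed fields.
        out = {k: v for k, v in doc.items() if k not in fields and k != "_id"}
    else:
        # Apart from _id, inclusion and exclusion cannot be mixed.
        raise ValueError("cannot mix inclusion and exclusion in one projection")
    if include_id and "_id" in doc:
        out = {"_id": doc["_id"], **out}
    return out

doc = {"_id": 1, "name": "Ali", "age": 25, "gender": "M"}
print(project(doc, {"name": 1, "age": 1}))
print(project(doc, {"name": 0, "age": 0}))
print(project(doc, {"name": 1, "_id": 0}))
```

The `_id` field is the one exception to the no-mixing rule, which is why `{'name': 1, '_id': 0}` is a valid projection.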
Comparison Query Operators

# Comparison operators: age less than 30
list(collection.find({'age': {'$lt': 30}}))

# Same filter, returning only name and age
list(collection.find({'age': {'$lt': 30}}, {'name': 1, 'age': 1, '_id': 0}))

# Age greater than or equal to 25
list(collection.find({'age': {'$gte': 25}}, {'name': 1, 'age': 1, '_id': 0}))
# $in operator: age equal to 20 or 30 (a set of values, not a range)
list(collection.find({'age': {'$in': [20, 30]}}, {'name': 1, 'age': 1, '_id': 0}))
# $nin operator: age neither 20 nor 30
list(collection.find({'age': {'$nin': [20, 30]}}, {'name': 1, 'age': 1, '_id': 0}))
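The semantics of these operators can be sketched without a database. The matcher below is a hypothetical, simplified emulation that assumes every document contains the queried field; the sample data, `OPS`, `matches`, and `find_names` are illustrative names only.

```python
import operator

# Simplified emulation of MongoDB comparison operators.
OPS = {
    "$eq": operator.eq, "$ne": operator.ne,
    "$gt": operator.gt, "$gte": operator.ge,
    "$lt": operator.lt, "$lte": operator.le,
    "$in": lambda value, options: value in options,
    "$nin": lambda value, options: value not in options,
}

def matches(doc, query):
    """Return True if doc satisfies every condition in query."""
    for field, condition in query.items():
        value = doc[field]  # assumption: the field is always present
        if isinstance(condition, dict):
            if not all(OPS[op](value, arg) for op, arg in condition.items()):
                return False
        elif value != condition:
            return False
    return True

def find_names(docs, query):
    return [d["name"] for d in docs if matches(d, query)]

people = [
    {"name": "Ali", "age": 25},
    {"name": "Zaid", "age": 20},
    {"name": "Maryam", "age": 30},
]

print(find_names(people, {"age": {"$lt": 30}}))
print(find_names(people, {"age": {"$in": [20, 30]}}))
print(find_names(people, {"age": {"$nin": [20, 30]}}))
```

Note that `$in` tests membership in a list of values, so `[20, 30]` does not select ages between 20 and 30.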
Logical Query Operators

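# Explicit $and: name is "Ali" and age is below 30
list(collection.find(
    {'$and': [{'name': "Ali"}, {'age': {'$lt': 30}}]},
    {'name': 1, 'age': 1, '_id': 0}))
# Explicit $and on the same field: age above 15 and below 30
list(collection.find(
    {'$and': [{'age': {'$gt': 15}}, {'age': {'$lt': 30}}]},
    {'name': 1, 'age': 1, '_id': 0}))
# Same query using the implicit AND form: several operators on one field
list(collection.find(
    {'age': {'$gt': 15, '$lt': 30}},
    {'name': 1, 'age': 1, '_id': 0}))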

Sorting
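# Sort by age, descending
list(collection.find(
    {},
    {'name': 1, 'age': 1, '_id': 0}
).sort('age', -1))

# Sort by name ascending, then by age descending
list(collection.find(
    {},
    {'name': 1, 'age': 1, '_id': 0}
).sort([('name', pymongo.ASCENDING), ('age', pymongo.DESCENDING)]))

# Equivalent form using the numeric constants:
# .sort([('name', 1), ('age', -1)])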
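The effect of a multi-key sort specification can be reproduced in plain Python. This sketch assumes the same `(field, direction)` pair convention, with 1 for ascending and -1 for descending; `sort_docs` and the sample data are hypothetical.

```python
people = [
    {"name": "Ali", "age": 25},
    {"name": "Zaid", "age": 20},
    {"name": "Ali", "age": 31},
]

def sort_docs(docs, spec):
    """Order docs by a list of (field, direction) pairs, most significant first."""
    result = list(docs)
    # Python's sort is stable, so sorting by the least significant key first
    # and the most significant key last yields the combined ordering.
    for field, direction in reversed(spec):
        result.sort(key=lambda d: d[field], reverse=(direction == -1))
    return result

ordered = sort_docs(people, [("name", 1), ("age", -1)])
print(ordered)
```

Documents with the same name stay grouped together, and within each group the older entry comes first.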

Aggregation Operations

● You can use aggregation operations to:

○ Group values from multiple documents together.
○ Perform operations on the grouped data to return a single result.
○ Analyze data changes over time.
● Aggregation can be performed with:
○ Aggregation pipelines
○ Single-purpose aggregation methods
Aggregation Pipeline

A pipeline consists of one or more stages that process documents. Each stage takes the output of the previous stage as its input.

Sample stage operators:
● $project – select fields for the output documents.
● $match – select documents to be processed.
● $sort – sort documents.
● $group – group documents by a specified key.
….
Example

# Create a collection of student records
mydb.create_collection('stdinfo')
std_collection = mydb.stdinfo
data = [
    {'name': 'ali', 'gpa': 90, 'prog': "CS"},
    {'name': 'zaid', 'gpa': 88, 'prog': "DS"},
    {'name': 'ahmed', 'gpa': 70, 'prog': "SE"},
    {'name': 'maryam', 'gpa': 68, 'prog': "SE"},
    {'name': 'fatema', 'gpa': 87, 'prog': "DS"},
    {'name': 'kareem', 'gpa': 77, 'prog': "CS"}
]

std_collection.insert_many(data)
list(std_collection.find())
# Average GPA per program
list(std_collection.aggregate([
    {
        '$group': {
            '_id': '$prog',
            'avgGPA': {'$avg': "$gpa"}
        }
    }
]))
# Average GPA per program, highest average first
list(std_collection.aggregate([
    {
        '$group': {
            '_id': '$prog',
            'avgGPA': {'$avg': "$gpa"}
        }
    },
    {
        '$sort': {'avgGPA': -1}
    }
]))
# Group first, then keep only programs whose average GPA exceeds 70
list(std_collection.aggregate([
    {
        '$group': {
            '_id': '$prog',
            'avgGPA': {'$avg': "$gpa"}
        }
    },
    {
        '$match': {'avgGPA': {'$gt': 70}}
    },
    {
        '$sort': {'avgGPA': -1}
    }
]))
# Filter documents first (CS and SE only), then group and sort
list(std_collection.aggregate([
    {
        '$match': {'prog': {'$in': ['CS', 'SE']}}
    },
    {
        '$group': {
            '_id': '$prog',
            'avgGPA': {'$avg': "$gpa"}
        }
    },
    {
        '$sort': {'avgGPA': -1}
    }
]))
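To check what the last pipeline should return, its three stages can be reproduced in plain Python over the same student data. This is a sketch of the stage semantics only, not of how MongoDB executes a pipeline; `match_prog`, `group_avg`, and `sort_desc` are hypothetical helper names, and the output field is named `avgGPA` here.

```python
from collections import defaultdict

students = [
    {"name": "ali", "gpa": 90, "prog": "CS"},
    {"name": "zaid", "gpa": 88, "prog": "DS"},
    {"name": "ahmed", "gpa": 70, "prog": "SE"},
    {"name": "maryam", "gpa": 68, "prog": "SE"},
    {"name": "fatema", "gpa": 87, "prog": "DS"},
    {"name": "kareem", "gpa": 77, "prog": "CS"},
]

def match_prog(docs, allowed):
    """Stage 1, like $match with $in: keep documents whose prog is allowed."""
    return [d for d in docs if d["prog"] in allowed]

def group_avg(docs, key, field):
    """Stage 2, like $group with $avg: one output document per distinct key."""
    buckets = defaultdict(list)
    for d in docs:
        buckets[d[key]].append(d[field])
    return [{"_id": k, "avgGPA": sum(v) / len(v)} for k, v in buckets.items()]

def sort_desc(docs, field):
    """Stage 3, like $sort with -1: highest value first."""
    return sorted(docs, key=lambda d: d[field], reverse=True)

# Each stage consumes the output of the previous one.
result = sort_desc(group_avg(match_prog(students, ["CS", "SE"]), "prog", "gpa"), "avgGPA")
print(result)  # [{'_id': 'CS', 'avgGPA': 83.5}, {'_id': 'SE', 'avgGPA': 69.0}]
```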
COLUMN-ORIENTED DATABASE
A column-oriented database stores its content by column, as opposed to by row, and serializes all of the values of a column together. A columnar database aims to efficiently retrieve or write data from hard disk storage in order to speed up the time it takes to return a query.
Strengths

● High data compression, which helps storage capacity to be used more efficiently.
● Can achieve high query performance on aggregation queries such as AVG, SUM, MAX, MIN, and COUNT.
● More efficient when writing the values of a single column at once, since these can be written without affecting any other columns of the rows.
● The quick searching, scanning, and aggregation abilities of column-oriented database storage are highly efficient for analytics.
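The row and column layouts can be contrasted with a small sketch. The data and names below are hypothetical, and run-length encoding is shown only as one common compression technique that benefits from storing similar values together; it is not claimed to be what any particular product uses.

```python
from itertools import groupby

# Row-oriented layout: one record per row.
rows = [
    {"name": "ali", "gpa": 90, "prog": "CS"},
    {"name": "kareem", "gpa": 77, "prog": "CS"},
    {"name": "zaid", "gpa": 88, "prog": "DS"},
]

# Column-oriented layout: all values of a column are stored together.
columns = {field: [row[field] for row in rows] for field in rows[0]}

# An aggregate over one column touches every row in the row layout...
avg_row_store = sum(row["gpa"] for row in rows) / len(rows)
# ...but only a single list in the column layout.
avg_col_store = sum(columns["gpa"]) / len(columns["gpa"])

def run_length_encode(values):
    """Compress runs of repeated values into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]

encoded_prog = run_length_encode(columns["prog"])
print(avg_col_store, encoded_prog)
```

Both layouts give the same average, but the columnar one reads only the `gpa` values, and the repeated `prog` values collapse into fewer entries.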
GRAPH DATABASE
