Introduction to NoSQL Databases
Content
• History of database technology
• Motivation for NoSQL databases
• Benefits of Relational databases
• Limitations of RDBMS
• What is NoSQL
• CAP Theorem
• Types of NoSQL
• Research Challenges
• Conclusion
• References
Summary of early database systems
Flat file systems
• Advantage: keeps data about a single entity together in a single record.
• Limitations: can lead to duplicated data and inefficient retrieval; it is difficult to implement security controls to protect confidential data.
Hierarchical data management systems
• Advantage: follows parent-child relationships; records are stored in a tree of parent and child records.
• Limitation: searches can be inefficient; for example, searching for a customer in a loan database might require scanning all customer records.
Network data management systems
• Advantage: can represent both parent-child and many-to-many relationships.
• Limitations: duplicate data, difficulty implementing security, inefficient searching, and difficulty maintaining the program code that accesses the databases.
Relational database management systems
• Advantages: data is stored in tables; normalization; ACID properties; SQL.
• Main components: storage management programs, memory management programs, a data dictionary, and a query language.
Motivations for NoSQL Databases
• Big data applications must store and query huge amounts of semi-structured and unstructured data.
• For example, Facebook and Google store and process data volumes measured in exabytes.
• So we need databases that can provide:
  • Scalability (sharding)
  • Cost
  • Flexibility
  • Availability
Source: http://programming4.us/enterprise/18762.aspx
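The "Scalability (sharding)" point can be illustrated with a minimal sketch, assuming simple hash-based sharding over in-memory dictionaries. The names `shard_for` and `ShardedStore` are hypothetical and do not come from any particular database; real systems add replication and rebalancing on top of this idea.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a stable hash (illustrative scheme)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedStore:
    """Toy store that spreads keys across several in-memory shards."""

    def __init__(self, num_shards: int):
        # Each dict stands in for a separate server.
        self.shards = [dict() for _ in range(num_shards)]

    def put(self, key: str, value) -> None:
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key: str, default=None):
        return self.shards[shard_for(key, len(self.shards))].get(key, default)


store = ShardedStore(num_shards=4)
for user_id in range(100):
    store.put(f"user:{user_id}", {"id": user_id})
```

Because the shard is computed from the key alone, any client can find the right server without a central lookup, which is what lets capacity grow by adding machines.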
Benefits of Relational Databases
• Based on ACID properties
• Strong consistency, concurrency, recovery
• Normalization
• Standard query language (SQL)
• Vertical scaling (scaling up)
Limitations of RDBMS
• Relational databases were not built for distributed applications, because:
  • Joins are expensive
  • They are hard to scale horizontally
  • They are expensive (product cost, hardware)
• The rise of big data (volume, variety)
Source: https://www.slideshare.net/ramakantsoni/presentation-on-no-sql
What is NoSQL
• It stands for "Not Only SQL".
• #NoSQL was a Twitter hashtag used by Eric Evans for a conference in 2009.
• There is no strict definition of a NoSQL database.
• It is a non-relational database.
• It is mainly designed for big data and real-time web applications.
Advantages of NoSQL over RDBMS
• Can handle semi-structured and unstructured data.
• Data models: no predefined schema.
• Scaling: scaling out (horizontal scaling).
• Avoids the overhead of ACID transactions.
• Avoids the complexity of SQL queries.
Source: https://deavid.wordpress.com/2018/08/29/nosql-databases
CAP Theorem
A distributed data store can guarantee at most two of the following three properties at the same time:
• Consistency: all clients read the same data.
• Availability: data is available at all times.
• Partition tolerance: the system keeps working when network failures split it into segments that cannot communicate.
Source: https://www.researchgate.net/figure/CAP-theorem
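The trade-off can be sketched with a toy two-replica cluster. This is only an illustration under simplifying assumptions (two in-memory replicas, a manual `partitioned` flag); the class names and the "CP"/"AP" mode strings are hypothetical and do not model any real database.

```python
class Replica:
    def __init__(self, name):
        self.name = name
        self.data = {}


class TwoNodeCluster:
    """Toy cluster that chooses consistency (CP) or availability (AP) under a partition."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.a = Replica("a")
        self.b = Replica("b")
        self.partitioned = False  # True when the replicas cannot reach each other

    def write(self, replica, key, value):
        """Write through one replica; return True if the write was accepted."""
        other = self.b if replica is self.a else self.a
        if self.partitioned:
            if self.mode == "CP":
                return False  # refuse the write: stay consistent, lose availability
            replica.data[key] = value  # AP: accept locally, replicas may diverge
            return True
        # No partition: the write reaches both replicas.
        replica.data[key] = value
        other.data[key] = value
        return True

    def consistent(self):
        return self.a.data == self.b.data


cp = TwoNodeCluster("CP")
ap = TwoNodeCluster("AP")
for cluster in (cp, ap):
    cluster.write(cluster.a, "x", 1)
    cluster.partitioned = True

cp_accepted = cp.write(cp.a, "x", 2)  # rejected, replicas still agree
ap_accepted = ap.write(ap.a, "x", 2)  # accepted, replicas now disagree
```

During the partition the CP cluster stays consistent but rejects the write, while the AP cluster stays available but its replicas diverge.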
NoSQL Types
NoSQL databases can be classified into four types:
• Key-value pair based
• Column based
• Document based
• Graph based
Source: https://deavid.wordpress.com/2018/08/29/nosql-databases
Key-Value Pair Based
• Data model: (key, value) pairs.
• Designed for processing dictionaries.
• A dictionary contains a collection of records whose fields hold the data.
• Records are stored and retrieved using a key that uniquely identifies the record and is used to quickly find the data within the database.
• Examples: Oracle NoSQL Database, Riak, etc.
• We use it for storing session information, user profiles, preferences, and shopping cart data.
Figure: Key-value pair based store [1].
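The key-value model above can be sketched with an in-memory dictionary. This is a minimal illustration, not the API of Oracle NoSQL Database or Riak; the class name `KeyValueStore` and the key format `session:42` are assumptions made for the example.

```python
class KeyValueStore:
    """Toy in-memory key-value store (illustrative only)."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        # The store treats the value as opaque; only the key is used for lookup.
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)

    def delete(self, key):
        """Remove the record; return True if the key existed."""
        if key in self._data:
            del self._data[key]
            return True
        return False


store = KeyValueStore()
store.put("session:42", {"user": "alice", "cart": ["book", "pen"]})
session = store.get("session:42")
```

Every operation goes through the key, which is why this model suits session and shopping-cart data that is always fetched as a whole.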
Column Based
• Stores data as column families containing rows that have many columns associated with a row key. Each row can have different columns.
• Column families are groups of related data that are accessed together.
• Examples: Cassandra, HBase, Hypertable, and Amazon DynamoDB.
• We use it for content management systems, blogging platforms, and log aggregation.
Figure: Column based store [1].
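A rough sketch of the column-family layout, assuming nested dictionaries of the form row key -> column family -> column -> value. The class and the family names (`profile`, `activity`) are hypothetical and do not reflect the storage format or API of Cassandra or HBase.

```python
from collections import defaultdict


class ColumnFamilyStore:
    """Toy column-family store: row key -> column family -> column -> value."""

    def __init__(self):
        self._rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row_key, family, column, value):
        self._rows[row_key][family][column] = value

    def get_family(self, row_key, family):
        """Return a copy of all columns in one family for one row."""
        if row_key not in self._rows:
            return {}
        return dict(self._rows[row_key].get(family, {}))


store = ColumnFamilyStore()
# Rows in the same family do not need the same set of columns.
store.put("user:1", "profile", "name", "Alice")
store.put("user:1", "profile", "email", "alice@example.com")
store.put("user:2", "profile", "name", "Bob")
store.put("user:2", "activity", "last_login", "2018-04-01")

alice_profile = store.get_family("user:1", "profile")
bob_profile = store.get_family("user:2", "profile")
```

Reading one family for one row returns the related columns together, which mirrors the idea that a column family groups data that is accessed together.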
Document Based
• The database stores and retrieves documents. It stores documents in the value part of a key-value store.
• Documents are self-describing, hierarchical tree data structures consisting of maps, collections, and scalar values.
• Examples: Lotus Notes, MongoDB, CouchDB, OrientDB, RavenDB.
• We use it for content management systems, blogging platforms, web analytics, real-time analytics, and e-commerce applications.
Figure: Document based store [1].
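As a minimal sketch of the document model, the example below stores JSON-like documents by id and matches on a nested field. The class `DocumentStore`, its `find` method, and the dotted-path syntax are assumptions made for illustration, not the query API of MongoDB or CouchDB.

```python
import json

_MISSING = object()  # sentinel for "path not present in this document"


class DocumentStore:
    """Toy document store: documents are nested maps, lists, and scalars."""

    def __init__(self):
        self._docs = {}

    def insert(self, doc_id, document):
        # Round-trip through JSON to copy the document and check it is serializable.
        self._docs[doc_id] = json.loads(json.dumps(document))

    def get(self, doc_id):
        return self._docs.get(doc_id)

    def find(self, path, value):
        """Return ids of documents whose dotted path (e.g. 'author.name') equals value."""
        matches = []
        for doc_id, doc in self._docs.items():
            node = doc
            for part in path.split("."):
                if isinstance(node, dict) and part in node:
                    node = node[part]
                else:
                    node = _MISSING
                    break
            if node is not _MISSING and node == value:
                matches.append(doc_id)
        return matches


store = DocumentStore()
# The two documents have different fields; no schema is declared up front.
store.insert("post-1", {"title": "Hello", "author": {"name": "Ada"}, "tags": ["intro"]})
store.insert("post-2", {"title": "Notes", "author": {"name": "Bob"}, "views": 10})

ada_posts = store.find("author.name", "Ada")
```

Unlike the plain key-value case, the store can look inside the value, so queries on nested fields are possible.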
Graph Based
• Stores entities and the relationships between them as the nodes and edges of a graph, respectively. Entities have properties.
• Traversing relationships is very fast because a relationship between nodes is not calculated at query time but is persisted as a relationship.
• Examples: Neo4j, InfiniteGraph, OrientDB, FlockDB.
• It is well suited to connected data such as social networks, spatial data, and routing information for goods and supply.
Figure: Graph based store [1].
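The point about persisted relationships can be sketched with an adjacency list: edges are stored with each node, so a traversal follows them directly instead of computing joins. The `Graph` class, the `FRIEND` relationship name, and the sample people are hypothetical and are not the API of Neo4j or any other product.

```python
from collections import defaultdict, deque


class Graph:
    """Toy property graph with stored, directed, labeled edges."""

    def __init__(self):
        self.properties = {}            # node id -> property map
        self.edges = defaultdict(list)  # node id -> list of (relationship, neighbor)

    def add_node(self, node_id, **props):
        self.properties[node_id] = props

    def add_edge(self, src, relationship, dst):
        self.edges[src].append((relationship, dst))

    def reachable(self, start, relationship, max_hops):
        """Breadth-first traversal along one relationship type, up to max_hops."""
        seen = {start}
        queue = deque([(start, 0)])
        found = []
        while queue:
            node, hops = queue.popleft()
            if hops == max_hops:
                continue
            for rel, neighbor in self.edges.get(node, []):
                if rel == relationship and neighbor not in seen:
                    seen.add(neighbor)
                    found.append(neighbor)
                    queue.append((neighbor, hops + 1))
        return found


g = Graph()
for name in ("alice", "bob", "carol", "dave"):
    g.add_node(name, kind="person")
g.add_edge("alice", "FRIEND", "bob")
g.add_edge("bob", "FRIEND", "carol")
g.add_edge("carol", "FRIEND", "dave")

friends_within_two_hops = g.reachable("alice", "FRIEND", max_hops=2)
```

Each hop only touches the edges stored on the current node, so the cost depends on the neighborhood visited rather than on the total number of records.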
Research Challenges
• Transaction processing: NoSQL databases do not strictly follow ACID properties.
• Query processing: there is no user-friendly, unified query language for NoSQL.
• Security: NoSQL databases store unstructured data and are geographically distributed, so security is very difficult to apply.
Conclusion
• An RDBMS is a great tool for problems that need ACID guarantees:
  • When data validity is super important
  • When you need to support dynamic queries
• NoSQL is a great tool for solving data availability problems:
  • When it's more important to have fast data than right data
  • When you need to scale based on changing requirements
References
1. Ali Davoudian, Liu Chen, and Mengchi Liu. 2018. A Survey on NoSQL Stores. ACM Comput. Surv. 51, 2, Article 40 (April 2018), 43 pages.
2. Dan Sullivan. 2015. NoSQL for Mere Mortals, 1st Edition. Pearson Education, United States of America.
3. Xiangdong Huang, Jianmin Wang, Yu Zhong, Shaoxu Song, and Philip S. Yu. 2015. Optimizing data partition for scaling out NoSQL cluster. Concurrency and Computation: Practice and Experience 27, 18, 5793–5809.
4. Katarina Grolinger, Wilson A. Higashino, Abhinav Tiwari, and Miriam A. M. Capretz. 2013. Data management in cloud environments: NoSQL and NewSQL data stores. Journal of Cloud Computing: Advances, Systems and Applications. Springer.
Thank You
