Lecture 2
Data Modeling
Data Architecture
Data Modeling
“Data Modeling is an abstraction that organizes
elements of data and how they will relate to each
other”
– Wikipedia
Example: a household spreadsheet
• You define rows and columns
• You structure your data
Process of Data Modeling
The process of data modeling aims to
• Organize data into databases.
• Ensure that your data is persistent.
• Ensure that it is easily usable by you and your
organization.
Data Modeling is also called database modeling.
Data Modeling
• Process to support business and
user applications
• Gather requirements
• Conceptual Data Modeling
• Logical Data Modeling
• Physical Data Modeling
Conceptual Data Modeling
• Offers a big-picture view of the
business structure
• Created as part of the process of
gathering initial project
requirements
• Typically includes entity classes,
their characteristics and
constraints and the relationships
between them
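A conceptual model can be sketched in a few lines of Python. This is only an illustration: the household-budget entities (`Member`, `Expense`, `Category`), their characteristics, and the relationship verbs are hypothetical examples, not part of the lecture.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Entity:
    """An entity class with its high-level characteristics (no data types yet)."""
    name: str
    characteristics: List[str] = field(default_factory=list)


@dataclass
class Relationship:
    """A named relationship between two entity classes."""
    source: str
    verb: str
    target: str
    cardinality: str  # a constraint such as "1:N"


# Hypothetical household-budget domain.
entities = [
    Entity("Member", ["name"]),
    Entity("Expense", ["amount", "date"]),
    Entity("Category", ["label"]),
]
relationships = [
    Relationship("Member", "records", "Expense", "1:N"),
    Relationship("Category", "classifies", "Expense", "1:N"),
]


def describe(rel: Relationship) -> str:
    """Render a relationship as a plain sentence for review with stakeholders."""
    return f"{rel.source} {rel.verb} {rel.target} ({rel.cardinality})"


summary = [describe(rel) for rel in relationships]
for line in summary:
    print(line)
```

At this level there are no data types, keys, or DBMS details, only the entities and how they relate.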
Logical Data Modeling
• Adds greater detail about the system
• More concerned with system
implementation than the conceptual model
• Data attributes in each entity are
defined
• Attribute properties, such as data
types and lengths, and
relationships between entities are
indicated
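One way to picture a logical model is as a DBMS-independent description of each entity's attributes. The sketch below is a minimal example under assumptions: the `Attribute` tuple, the generic type names (`integer`, `string`, `decimal`, `date`), and the household entities are all hypothetical.

```python
from typing import NamedTuple, Optional


class Attribute(NamedTuple):
    """One attribute of an entity, described without any DBMS-specific syntax."""
    name: str
    data_type: str
    length: Optional[int] = None
    references: Optional[str] = None  # "Entity.attribute" for a relationship


# Hypothetical logical model: attributes, types, lengths, and relationships.
logical_model = {
    "Member": [
        Attribute("member_id", "integer"),
        Attribute("name", "string", 100),
    ],
    "Expense": [
        Attribute("expense_id", "integer"),
        Attribute("member_id", "integer", references="Member.member_id"),
        Attribute("amount", "decimal", 10),
        Attribute("spent_on", "date"),
    ],
}


def dangling_references(model):
    """Return references that point at an attribute missing from the model."""
    known = {
        f"{entity}.{attr.name}"
        for entity, attrs in model.items()
        for attr in attrs
    }
    return [
        attr.references
        for attrs in model.values()
        for attr in attrs
        if attr.references and attr.references not in known
    ]


problems = dangling_references(logical_model)
print(problems)
```

A check like `dangling_references` is the kind of validation a logical model allows before any table is created.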
Physical Data Modeling
• Demonstrates the low-level
implementation details
• A finalized design is offered
containing data types, primary
and foreign keys
• Can include DBMS-specific
properties, including performance
tuning.
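A physical model can be sketched with Python's built-in `sqlite3` module. SQLite is used here only because it needs no server. The `member` and `expense` tables and the index are hypothetical, and a PostgreSQL design would use its own types and tuning options.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# DBMS-specific detail: SQLite enforces foreign keys only when this is enabled.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE member (
    member_id INTEGER PRIMARY KEY,
    name      VARCHAR(100) NOT NULL  -- SQLite accepts but does not enforce the length
);
CREATE TABLE expense (
    expense_id INTEGER PRIMARY KEY,
    member_id  INTEGER NOT NULL REFERENCES member(member_id),
    amount     NUMERIC NOT NULL,
    spent_on   DATE NOT NULL
);
-- Performance tuning: index the foreign key used in lookups and joins.
CREATE INDEX idx_expense_member ON expense(member_id);
""")

conn.execute("INSERT INTO member (member_id, name) VALUES (1, 'Alex')")
conn.execute(
    "INSERT INTO expense (member_id, amount, spent_on) VALUES (1, 42.5, '2024-01-15')"
)

# An expense for a member that does not exist violates the foreign key.
try:
    conn.execute(
        "INSERT INTO expense (member_id, amount, spent_on) VALUES (99, 10, '2024-01-16')"
    )
    fk_enforced = False
except sqlite3.IntegrityError:
    fk_enforced = True

print(fk_enforced)
```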
Types of Data Modeling
• Hierarchical Data Models
• Relationships represented in a tree-like format
• Each record has a single parent and maps to one or more child records
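The tree-like structure can be illustrated with a small Python class. This is a sketch, not how any particular hierarchical database stores data, and the `Record` class and record names are hypothetical.

```python
class Record:
    """A record with exactly one parent (None for the root) and any number of children."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def path(self):
        """Walk up the single-parent chain to build the path from the root."""
        node, parts = self, []
        while node is not None:
            parts.append(node.name)
            node = node.parent
        return "/".join(reversed(parts))


root = Record("household")
expenses = Record("expenses", parent=root)
groceries = Record("groceries", parent=expenses)
rent = Record("rent", parent=expenses)

print(groceries.path())
```

Because every record has exactly one parent, there is exactly one path from the root to any record.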
• Relational Data Models
• Data is organized into tables that are explicitly joined through shared keys,
reducing database complexity.
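A join between two tables can be shown with Python's built-in `sqlite3` module. The `category` and `expense` tables and their rows are hypothetical sample data, and the same SQL idea carries over to PostgreSQL or MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE category (
    category_id INTEGER PRIMARY KEY,
    label       TEXT NOT NULL
);
CREATE TABLE expense (
    expense_id  INTEGER PRIMARY KEY,
    category_id INTEGER NOT NULL REFERENCES category(category_id),
    amount      REAL NOT NULL
);
INSERT INTO category VALUES (1, 'groceries'), (2, 'rent');
INSERT INTO expense VALUES (1, 1, 40.0), (2, 1, 60.0), (3, 2, 900.0);
""")

# Join the two tables on the shared key and total the amounts per category.
totals = conn.execute("""
    SELECT c.label, SUM(e.amount)
    FROM expense AS e
    JOIN category AS c ON c.category_id = e.category_id
    GROUP BY c.label
    ORDER BY c.label
""").fetchall()

print(totals)
```

Each category label is stored once and linked to expenses by `category_id`, so the label is not repeated on every expense row.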
• Graph Data Models
• Based on Graph Theory
• Nodes and Edges in a graph are used to represent data
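Nodes and edges can be represented in plain Python with an adjacency list. This is a sketch for intuition only. The node names and edge labels are hypothetical, and a graph database would provide its own storage and query language.

```python
from collections import deque

# Each edge is (source node, label, target node).
edges = [
    ("alice", "FOLLOWS", "bob"),
    ("bob", "FOLLOWS", "carol"),
    ("alice", "LIKES", "post1"),
    ("dave", "FOLLOWS", "alice"),
]

# Build an adjacency list: node -> list of (label, neighbor).
adjacency = {}
for source, label, target in edges:
    adjacency.setdefault(source, []).append((label, target))
    adjacency.setdefault(target, [])


def reachable(start):
    """Breadth-first traversal: nodes reachable from start, in visit order."""
    seen = {start}
    queue = deque([start])
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for _label, neighbor in adjacency[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return order


visited = reachable("alice")
print(visited)
```

Edges are directed here, so `dave` follows `alice` but is not reachable from her.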
Why is data modeling important?
• ACID properties (Atomicity, Consistency, Isolation,
Durability)
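Atomicity and consistency can be illustrated with a small transaction using Python's built-in `sqlite3` module. This is a sketch under assumptions: the `account` table, the `transfer` helper, and the balances are hypothetical, and isolation and durability are not exercised by an in-memory database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE account (
        name    TEXT PRIMARY KEY,
        balance INTEGER NOT NULL CHECK (balance >= 0)  -- consistency rule
    )
""")
conn.execute("INSERT INTO account VALUES ('checking', 100), ('savings', 0)")
conn.commit()


def transfer(amount):
    """Move money from checking to savings; both updates succeed or neither does."""
    try:
        # Used as a context manager, the connection commits on success
        # and rolls back if an exception is raised inside the block.
        with conn:
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = 'savings'",
                (amount,),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = 'checking'",
                (amount,),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(30)       # fits within the checking balance
failed = transfer(500)  # would make checking negative, so it is rolled back
balances = dict(conn.execute("SELECT name, balance FROM account"))
print(ok, failed, balances)
```

The failed transfer leaves no trace: the credit to savings is undone together with the rejected debit.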
PostgreSQL Pros and Cons
Pros
• This database management engine is scalable and can handle terabytes
of data.
• It supports JSON.
• There are a variety of predefined functions.
• A number of interfaces are available.
Cons
• Documentation can be spotty, so you may find yourself searching online
in an effort to figure out how to do something.
• Configuration can be confusing.
• Speed may suffer during large bulk operations or read queries.
Reference: https://www.keycdn.com/blog/popular-databases
Comparison of Postgres with SQLite and MySQL
| Name | SQLite | MySQL | PostgreSQL |
| --- | --- | --- | --- |
| Architecture | File Based | Client Server | Client Server |
| Transactional consistency | ACID | ACID | ACID |
| Replication | None | Master-Slave Replication, Master-Master Replication | Master-Slave Replication |
| Programming Language (Base Code) | C, C++ | C, C++ | C |
| Popular Use-Cases | Low-Medium Traffic Websites, IoT and Embedded Devices, Testing and Development | Web Sites, Web Applications, LAMP stack, OLTP-based applications | Analytics, Data Mining, Data Warehousing, Business Intelligence, Hadoop |
| Key Customers | Adobe, Facebook, and Apple | GitHub, Facebook, and YouTube | Cloudera, Instagram, and ViaSat |
Reference: https://logz.io/blog/relational-database-comparison/
Famous Companies using PostgreSQL
Demo