11/30/2024
Column-Stores vs. Row-Stores
Contents
Column-store introduction
Column-store data model
Emulation of a column store in a row store
Column-store optimizations
Experiment and Results
Conclusion
Row Store and Column Store
Figure taken from [2]
In a row store, data are stored on disk tuple by tuple, whereas in a
column store, data are stored on disk column by column.
Row Store and Column Store
Most queries do not process all the attributes of a particular relation.
For example, the query

SELECT c.name, c.address
FROM CUSTOMER AS c
WHERE c.region = 'abc';

processes only three attributes of the CUSTOMER relation, even though the
relation can have many more attributes.
Column stores are more I/O efficient for read-only queries, as they read
only those attributes that are accessed by a query.
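A minimal sketch (Python, with made-up data) of this point: with a column-wise layout, a query touching only name, address, and region never reads the other attributes, whereas a row-wise layout pulls entire tuples.

# Toy illustration of row-wise vs. column-wise layouts (hypothetical data).
rows = [
    {"id": 1, "name": "Ann",  "address": "1 Elm St", "region": "abc", "phone": "555-0101", "balance": 10.0},
    {"id": 2, "name": "Bob",  "address": "9 Oak Av", "region": "xyz", "phone": "555-0102", "balance": 25.5},
    {"id": 3, "name": "Cara", "address": "5 Ash Rd", "region": "abc", "phone": "555-0103", "balance": 7.25},
]

# Column-wise layout: one list ("file") per attribute.
columns = {attr: [r[attr] for r in rows] for attr in rows[0]}

# Row store: every tuple is read in full, even though only 3 of 6 attributes are needed.
row_result = [(r["name"], r["address"]) for r in rows if r["region"] == "abc"]

# Column store: only the name, address, and region columns are touched.
col_result = [
    (n, a)
    for n, a, reg in zip(columns["name"], columns["address"], columns["region"])
    if reg == "abc"
]

assert row_result == col_result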
Row Store and Column Store
Row Store:
(+) Easy to add/modify a record
(-) Might read in unnecessary data

Column Store:
(+) Only need to read in relevant data
(-) Tuple writes require multiple accesses
So column stores are suitable for read-mostly, read-
intensive, large data repositories
Why Column Stores?
Can be significantly faster than row stores for some
applications
Fetch only required columns for a query
Better cache effects
Better compression (similar attribute values within a column)
But can be slower for other applications
OLTP with many row inserts, ..
Column Stores - Data Model
Standard relational logical data model
EMP(name, age, salary, dept)
DEPT(dname, floor)
Table – collection of projections
Projection – set of columns
Horizontally partitioned into segments, each with a segment identifier
Column Stores - Data Model
To answer queries, projections are joined using storage keys and join indexes.
Storage Keys:
Within a segment, every data value of every column is associated with a
unique storage key (SKey)
Values from different columns with a matching SKey belong to the same
logical row
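A minimal sketch of the storage-key idea, assuming a segment is just a set of parallel column arrays; the array position plays the role of the SKey, so values at the same position in different columns form one logical row. Names and data are illustrative.

# One segment of an EMP projection: parallel column arrays.
# The array index plays the role of the storage key (SKey).
segment = {
    "name":   ["Ann", "Bob", "Cara"],
    "salary": [50000, 62000, 58000],
    "dept":   ["HR", "ENG", "ENG"],
}

def logical_row(segment, skey):
    """Reassemble the logical row whose values all share the same SKey."""
    return {col: values[skey] for col, values in segment.items()}

print(logical_row(segment, 1))   # {'name': 'Bob', 'salary': 62000, 'dept': 'ENG'}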
Column Stores – Data Model
Join Indexes
T1 and T2 are projections on T
M segments in T1 and N segments in T2
Join Index from T1 to T2 is a table of the form:
(s: Segment ID in T2, k: Storage key in Segment s)
Each row in the join index corresponds to a row in T1
Join indexes are built such that T can be efficiently reconstructed from
T1 and T2
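A minimal sketch of a join index, assuming each projection is a list of segments and each segment is a dict of column arrays indexed by SKey; the join index stores, for every row of T1 in storage order, the (segment id, SKey) of the matching row in T2. All names and data are illustrative.

# T1: projection (name) sorted one way; T2: projection (salary) sorted another way.
t1_segments = [{"name": ["Ann", "Bob"]}, {"name": ["Cara"]}]
t2_segments = [{"salary": [62000, 58000]}, {"salary": [50000]}]

# Join index from T1 to T2: one (segment_id, skey) entry per row of T1,
# listed in T1's storage order (segment by segment, SKey by SKey).
join_index = [(1, 0),   # T1 row "Ann"  -> T2 segment 1, SKey 0 (salary 50000)
              (0, 0),   # T1 row "Bob"  -> T2 segment 0, SKey 0 (salary 62000)
              (0, 1)]   # T1 row "Cara" -> T2 segment 0, SKey 1 (salary 58000)

# Reconstruct T (name, salary) by walking T1 and following the join index.
reconstructed = []
i = 0
for seg in t1_segments:
    for name in seg["name"]:
        s, k = join_index[i]
        reconstructed.append((name, t2_segments[s]["salary"][k]))
        i += 1

print(reconstructed)  # [('Ann', 50000), ('Bob', 62000), ('Cara', 58000)]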
Compression
Trades I/O for CPU
Increased column-store opportunities:
Higher data value locality in column stores
Techniques such as run-length encoding are far more useful
Schemes (two of these are sketched below):
Null Suppression
Dictionary encoding
Run Length encoding
Bit-Vector encoding
Heavyweight schemes
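A minimal sketch of two of the lightweight schemes above, run-length encoding and dictionary encoding, applied to a sorted column; the data is made up.

from itertools import groupby

region = ["abc", "abc", "abc", "xyz", "xyz", "pqr"]   # sorted column

# Run-length encoding: (value, run length) pairs.
rle = [(v, len(list(g))) for v, g in groupby(region)]
print(rle)                    # [('abc', 3), ('xyz', 2), ('pqr', 1)]

# Dictionary encoding: replace each string by a small integer code.
dictionary = {v: i for i, v in enumerate(sorted(set(region)))}
encoded = [dictionary[v] for v in region]
print(dictionary, encoded)    # {'abc': 0, 'pqr': 1, 'xyz': 2} [0, 0, 0, 2, 2, 1]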
Query Execution - Operators
Select: Same as relational algebra, but produces a bit
string
Project: Same as relational algebra
Join: Joins projections according to predicates
Aggregation: SQL like aggregates
Sort: Sort all columns of a projection
Query Execution - Operators
Decompress: Converts compressed column to
uncompressed representation
Mask(Bitstring B, Projection Cs) => emit only those
values whose corresponding bits are 1
Concat: Combines one or more projections sorted in
the same order into a single projection
Permute: Permutes a projection according to the
ordering defined by a join index
Bitstring operators: Band – Bitwise AND, Bor – Bitwise
OR, Bnot – complement
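A minimal sketch of the Mask and bitstring operators above, using Python lists of 0/1 as bit strings; in a real engine these would be packed bit vectors, and all data here is illustrative.

def band(b1, b2):
    """Bitwise AND of two bit strings."""
    return [x & y for x, y in zip(b1, b2)]

def bor(b1, b2):
    """Bitwise OR of two bit strings."""
    return [x | y for x, y in zip(b1, b2)]

def bnot(b):
    """Complement of a bit string."""
    return [1 - x for x in b]

def mask(bitstring, column):
    """Emit only those column values whose corresponding bit is 1."""
    return [v for v, bit in zip(column, bitstring) if bit]

ages = [25, 40, 31, 52]
b1 = [1, 0, 1, 1]            # e.g. output of one Select
b2 = [1, 1, 0, 1]            # e.g. output of another Select
print(mask(band(b1, b2), ages))   # [25, 52]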
Row Store vs. Column Store
The simplistic view of the difference in storage layout suggests that one
can obtain the performance benefits of a column store from a row store by
making some changes to the row store's physical structure. These changes
can be:
Vertical partitioning
Using index-only plans
Using materialized views
Vertical Partitioning
Process:
Full vertical partitioning of each relation
Each column = 1 physical table
This is achieved by adding an integer position column to every table
Adding an integer position is better than adding the primary key
Join on position for multi-column fetches (see the sketch below)
Problems:
The "position" column costs space and disk bandwidth
A header for every tuple causes further space wastage
e.g. 24-byte overhead in PostgreSQL
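A minimal sketch of this emulation, assuming each column becomes its own two-column table (position, value) and multi-column fetches join on position; table and value names are illustrative.

# Each column of CUSTOMER becomes its own "table" of (position, value) pairs.
name_tbl   = [(1, "Ann"), (2, "Bob"), (3, "Cara")]
region_tbl = [(1, "abc"), (2, "xyz"), (3, "abc")]

# Multi-column fetch = join on position.
names_by_pos = dict(name_tbl)
result = [names_by_pos[pos] for pos, region in region_tbl if region == "abc"]
print(result)   # ['Ann', 'Cara']

# Cost: every per-column table repeats the position value (and, in a real
# row store, carries a per-tuple header, e.g. ~24 bytes in PostgreSQL).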
Index-only plans: Example
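A minimal sketch of the index-only idea, assuming an unclustered index on every column maps values to tuple positions, so a query can be answered from the indexes alone without fetching base tuples; data and names are illustrative.

from collections import defaultdict

# Base table (never touched at query time in an index-only plan).
facts = [("c1", 15), ("c2", 30), ("c3", 42), ("c4", 8)]   # (custID, price)

# Build an unclustered index on each column: value -> set of positions.
def build_index(column):
    idx = defaultdict(set)
    for pos, value in enumerate(column):
        idx[value].add(pos)
    return idx

custid_idx = build_index([c for c, _ in facts])
price_idx  = build_index([p for _, p in facts])

# SELECT custID FROM Facts WHERE price > 20, answered from the indexes alone:
positions = set().union(*(ps for price, ps in price_idx.items() if price > 20))
result = [cid for cid, ps in custid_idx.items() if ps & positions]
print(sorted(result))   # ['c2', 'c3']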
Materialized Views
Process:
Create an "optimal" set of MVs for a given query workload
Objective:
Provide just the required data
Avoid overheads
Expected to perform better than the other two approaches
Problems:
Practical only in limited situations
Requires knowledge of query workloads in advance
Materialized Views: Example
SELECT F.custID
FROM Facts AS F
WHERE F.price > 20
Optimizing Column oriented Execution
Different optimizations for column-oriented databases:
Compression
Late Materialization
Block Iteration
Invisible Join
Compression
Low information entropy (high data value locality) leads to a high
compression ratio
Advantages:
Disk space is saved
Less I/O
CPU cost decreases if we can perform operations without decompressing
(see the sketch below)
Lightweight compression schemes do better
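A minimal sketch of that last point, operating directly on a run-length-encoded column without decompressing it; the column and counts are hypothetical.

# Run-length-encoded, sorted region column: (value, run_length) pairs.
rle_region = [("abc", 1000), ("pqr", 250), ("xyz", 4000)]

# "How many customers are in region 'abc'?" -- answered on the compressed
# representation, one pair at a time, without expanding 5250 values.
count = sum(run for value, run in rle_region if value == "abc")
print(count)   # 1000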
Compression
If data is sorted on one column, that column will be super-compressible
in a row store
e.g. run-length encoding
Figure taken from [2]
Late Materialization
Most query results are entity-at-a-time, not column-at-a-time
So at some point multiple columns must be combined
One simple approach is to join the columns relevant for a particular query
But performance can be further improved using late materialization
Late Materialization
Delay tuple construction
Might avoid constructing tuples altogether
Intermediate position lists might need to be constructed
E.g.: SELECT R.a FROM R WHERE R.c = 5 AND R.b = 10
The output of each predicate is a bit string
Perform a bitwise AND
Use the final position list to extract R.a (see the sketch below)
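A minimal sketch of this example, assuming R is stored as three column arrays: each predicate produces a bit string, the bit strings are ANDed, the result becomes a position list, and only then is R.a touched. The data is illustrative.

# Columns of R (illustrative data).
a = [10, 11, 12, 13, 14]
b = [10, 99, 10, 10, 10]
c = [ 5,  5,  7,  5,  9]

# SELECT R.a FROM R WHERE R.c = 5 AND R.b = 10, with late materialization:
bits_c = [1 if v == 5  else 0 for v in c]        # predicate on R.c -> bit string
bits_b = [1 if v == 10 else 0 for v in b]        # predicate on R.b -> bit string
bits   = [x & y for x, y in zip(bits_c, bits_b)] # bitwise AND

positions = [i for i, bit in enumerate(bits) if bit]   # final position list
result = [a[i] for i in positions]                     # only now is R.a read
print(result)   # [10, 13]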
Late Materialization
Advantages
Unnecessary construction of tuples is avoided
Direct operation on compressed data
Cache performance is improved (PAX)
Thank You!