
Distributed Systems CH7-2022

This document discusses data storage in the cloud and cloud-based databases. It begins by explaining cloud-based storage solutions and databases. It then discusses the benefits of cloud storage such as scalability, reliability, and reduced administration compared to on-premise solutions. The document also notes some limitations like potential performance issues and security concerns regarding data in the cloud. Case studies of specific cloud storage providers like Dropbox and storage architectures like Google File Systems are provided.

Uploaded by Anis MOHAMMEDI

Data Storage in DS

Prof. Tahar Kechadi


School of Computer Science

Learning Objectives

Explain cloud-based storage solutions

Explain cloud-based databases

Benefits and limitations

Case studies

Evolution of Network Storage
File server
  - Server with large disk capacity
  - Sharing, replication and storage of large files
Storage-area networks (SAN)
  - Storage devices connected directly to the network
Network-attached storage (NAS)
Cloud-based Data Storage

NAS
[Figure: NAS architecture with Win2k, Linux, Unix and generic application servers connected over a LAN to NAS appliances]

SAN Architecture
Interconnection
  - Fibre Channel
  - iSCSI protocol
      Internet Small Computer System Interface
      Network standard for linking data storage facilities
      Enables the transfer of SCSI packets over a TCP/IP (Ethernet) network
Hard Drives
  - Logical Block Addressing (LBA)
  - File Systems

Hard Drive File Systems

FAT (File Allocation Table)

Cluster   Free / Next / Final   LBA
0         1                     0
1         5                     4
10        14                    40
11        free                  44
12        free                  48
13        free                  52
14        final                 56
62        free                  248
63        free                  252

(figure label: items.txt)
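
The table above can be read as a linked list: each entry names the next cluster of a file until a "final" marker is reached. The following is a minimal sketch of that traversal, not a real FAT driver. It assumes that items.txt starts at cluster 10 (the slide does not state this explicitly) and that each cluster spans 4 sectors, which is inferred from the LBA column. The names `fat`, `cluster_chain` and `cluster_to_lba` are hypothetical.

```python
# Toy FAT taken from the slide's table; "free" and "final" are the slide's markers.
SECTORS_PER_CLUSTER = 4  # assumption inferred from the LBA column (LBA = cluster * 4)

fat = {0: 1, 1: 5, 10: 14, 11: "free", 12: "free", 13: "free",
       14: "final", 62: "free", 63: "free"}


def cluster_chain(fat, start):
    """Follow next-cluster links from `start` until the 'final' marker."""
    chain = []
    cluster = start
    while True:
        entry = fat[cluster]
        if entry == "free":
            raise ValueError(f"cluster {cluster} is free, chain is broken")
        chain.append(cluster)
        if entry == "final":
            return chain
        cluster = entry


def cluster_to_lba(cluster):
    """Map a cluster number to its first logical block address."""
    return cluster * SECTORS_PER_CLUSTER


chain = cluster_chain(fat, 10)                 # assumed first cluster of items.txt
lbas = [cluster_to_lba(c) for c in chain]      # where the file's data lives on disk
free_clusters = sorted(c for c, e in fat.items() if e == "free")
```

Under these assumptions the file occupies clusters 10 and 14, and the free clusters are the candidates for the next allocation.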

RAIDs
RAID
  - Redundant Array of Inexpensive Disks
RAID Access
  - Reading/writing information from a set of disks at the same time
Reliability
  - Add parity and/or mirroring information on multiple disks of the array
Performance
  - Improving performance and/or reliability of the storage device
Configuration
  - RAID 0, RAID 1, RAID 2, RAID 3, etc.
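
To make the striping and parity ideas concrete, here is a small sketch in Python. It is an illustration only: real RAID controllers work on disk sectors, and RAID 5 rotates parity across disks, which this toy does not model. The function names `stripe` and `xor_blocks` are hypothetical.

```python
from functools import reduce


def stripe(data, n_disks, block_size):
    """RAID 0 style striping: deal fixed-size blocks round-robin across disks."""
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    disks = [[] for _ in range(n_disks)]
    for i, block in enumerate(blocks):
        disks[i % n_disks].append(block)
    return disks


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks (the parity used by RAID 3/4/5)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks, one per disk, plus a parity block on a fourth disk.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Simulate losing the second disk and rebuilding it from the survivors and parity.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
```

Because XOR is its own inverse, XOR-ing the surviving blocks with the parity block yields the missing block. Mirroring (RAID 1) needs no such computation, since it simply keeps a second full copy.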

Examples of RAIDs
RAID 0

RAID 5

RAID 1

Advantages of SANs
Reliability
  - Data striping across multiple volumes
  - Reconstruction of the file content
Performance
  - Less system overhead
Compatibility
  - Support common file systems
Backup
  - Ease of performing backups

Cloud-Based Data Storage


Data storage resides in the cloud
Data Access
  - Web browser interface
  - Mounted disk drive: appears as a local drive
  - Set of API calls
Examples
  - Dropbox, Google Drive, OneDrive, HomePipe, etc.

JustCloud
Unlimited Cloud Storage
Access Files Anywhere
Sync Multiple Computers
Share files
Sync Folder, Backup file
Data Security: 256-bit encryption
Mobile Apps, Tracking …
Free Account: 15 MB storage, 50 files, 14 days
Personal, Business accounts


Carbonite
Unlimited Cloud Storage
Access Files Anywhere, Sync Multiple Computers
Share files, Sync Folder, Backup file
Data Security: 128-bit encryption
Free Account: 15 days
Personal, Pro, Server

Cloud-Based Data Storage Advantages
Scalability
  - Scale storage capacity (up or down)
Pay as you use
Reliability
  - Transparent data replication
Ease of access
  - Support web-based access
Ease of use
  - Remote file storage area -> logical drive


Cloud-Based Data Storage Disadvantages

Performance
  - Data accessed over the Internet
Security
  - Data in the cloud?
  - Encrypt the files (e.g. BoxCryptor)
Data orphans
  - Data abandoned in cloud storage facilities -> confidential data at risk

Cloud-based Backup Systems
Data backup
  - Encrypted format
Scheduling
  - When backup operations are to occur
Retrieving
  - Retrieving backup files easily
Support for multiple platforms


Industry-Specific: Example
Different data storage and access requirements
Healthcare Industry
  - Secure electronic medical records

Example: MS HealthVault
  - Store medical records, prescriptions, measurements
  - Share with GP, healthcare personnel, family members
  - Set an expiration date

Understanding File Systems
OS File systems (FS)
  - Handling storage and retrieval of files to/from a local disk
  - File operations: copy, delete, create, move, …
Network File Systems (NFS)
  - Handling files residing on devices across the network
Cloud File Systems (CFS)
  - Handling files residing on the cloud


NFS
Network File System


Google File System (GFS)

A scalable distributed file system for large, distributed, data-intensive applications

Large, distributed, highly fault-tolerant file system

Multiple GFS clusters currently have:
  - 1000+ storage nodes
  - 300+ terabytes of disk storage
  - Heavy access by hundreds of clients on distinct machines

GFS Architecture
A cluster consists of a single master and multiple chunk-servers, and is accessed by multiple clients


GFS Master
Maintains all file system metadata
  - Namespace, access control info, file-to-chunk mappings, chunk (including replica) locations, etc.
Periodically communicates with chunk-servers through HeartBeat messages to give instructions and check state
Read/write: client contacts the master to get chunk locations, then deals directly with chunk-servers

GFS Chunk-server
Files are broken into chunks. Each chunk has an immutable, globally unique 64-bit chunk-handle
  - Handle is assigned by the master at chunk creation
Chunk size is 64 MB
Each chunk is replicated on 3 (default) servers
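
The fixed chunk size makes it easy to translate a byte offset within a file into a chunk index and an offset inside that chunk. The sketch below shows this arithmetic using the 64 MB chunk size and 3 replicas from the slide. The function names are hypothetical and it treats 64 MB as 64 * 1024 * 1024 bytes.

```python
CHUNK_SIZE = 64 * 1024 * 1024   # 64 MB, as stated on the slide
REPLICAS = 3                    # default replication factor from the slide
MB = 1024 * 1024


def locate(offset):
    """Return (chunk index, offset within that chunk) for a byte offset in a file."""
    return divmod(offset, CHUNK_SIZE)


def chunks_needed(file_size):
    """Number of chunks required to hold file_size bytes (ceiling division)."""
    return -(-file_size // CHUNK_SIZE)


index, inner = locate(200 * MB)            # byte 200 MB falls in the fourth chunk
n_chunks = chunks_needed(200 * MB)         # a 200 MB file spans 4 chunks
chunk_copies = n_chunks * REPLICAS         # chunk replicas stored across the cluster
```

With these numbers, offset 200 MB lands in chunk index 3 at an inner offset of 8 MB, and the file is stored as 12 chunk replicas in total.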


GFS Client

Linked to apps using the file system API
Communicates with master and chunk-servers for reading and writing
  - Master interactions only for metadata
  - Chunk-server interactions for data
Only caches metadata information
  - Data is too large to cache
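
The division of labour described above (metadata from the master, data from chunk-servers, metadata-only caching in the client) can be sketched with three toy classes. This is a simplified in-memory model under stated assumptions, not the GFS protocol: all class and method names are hypothetical, reads never span chunk boundaries, and the client always picks the first replica.

```python
CHUNK_SIZE = 64 * 1024 * 1024


class ChunkServer:
    def __init__(self):
        self.chunks = {}                    # chunk handle -> bytes

    def read(self, handle, start, length):
        return self.chunks[handle][start:start + length]


class Master:
    def __init__(self):
        self.file_chunks = {}               # file name -> list of chunk handles
        self.locations = {}                 # chunk handle -> list of ChunkServer

    def lookup(self, name, chunk_index):
        """Metadata only: which chunk, and which servers hold it."""
        handle = self.file_chunks[name][chunk_index]
        return handle, self.locations[handle]


class Client:
    def __init__(self, master):
        self.master = master
        self.cache = {}                     # metadata cache, never file data
        self.master_calls = 0

    def read(self, name, offset, length):
        chunk_index, start = divmod(offset, CHUNK_SIZE)
        key = (name, chunk_index)
        if key not in self.cache:           # ask the master only on a cache miss
            self.cache[key] = self.master.lookup(name, chunk_index)
            self.master_calls += 1
        handle, servers = self.cache[key]
        return servers[0].read(handle, start, length)   # data comes from a chunk-server


server = ChunkServer()
server.chunks[0xABC] = b"hello gfs"
master = Master()
master.file_chunks["/logs/a"] = [0xABC]
master.locations[0xABC] = [server]

client = Client(master)
first = client.read("/logs/a", 0, 5)
second = client.read("/logs/a", 6, 3)       # served using cached metadata
```

The second read does not contact the master at all, which is the point of caching chunk locations on the client: the master stays off the data path.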

GFS Chunk Location
Master
  - Does not keep a persistent record of the locations of chunks and replicas
Chunk-Servers
  - Master polls chunk-servers at startup and when chunk-servers join or leave
HeartBeat Messages
  - Master stays up to date by controlling placement of new chunks and through HeartBeat messages (when monitoring chunk-servers)
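
Since the master keeps no persistent record of chunk locations, it has to rebuild that map from what chunk-servers report. The following is a rough sketch of such an in-memory table, assuming each HeartBeat carries the full list of chunk handles a server holds. The class `LocationTable` and its methods are hypothetical and do not reflect the actual GFS message format.

```python
from collections import defaultdict


class LocationTable:
    """In-memory map of chunk handle -> set of chunk-server ids (never persisted)."""

    def __init__(self):
        self.locations = defaultdict(set)

    def heartbeat(self, server_id, handles):
        """Replace what is known about this server with its latest report."""
        self.forget(server_id)
        for handle in handles:
            self.locations[handle].add(server_id)

    def forget(self, server_id):
        """Drop a server that has left the cluster or stopped responding."""
        for servers in self.locations.values():
            servers.discard(server_id)

    def replicas(self, handle):
        return sorted(self.locations.get(handle, ()))


table = LocationTable()
table.heartbeat("cs1", [1, 2])      # reports gathered at startup or on join
table.heartbeat("cs2", [2, 3])
before = table.replicas(2)
table.forget("cs1")                 # cs1 leaves the cluster
after = table.replicas(2)
```

A restarted master would start with an empty table and refill it as chunk-servers report in, so the location map never needs to be written to disk.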


Cloud-based Databases
Databases
  - Used by applications residing in the cloud
  - Used by applications residing within the customer’s data centre

Cloud-Based Databases Advantages
Cost-effective database scalability
  - Scale dynamically
  - Pay-as-you-go
High availability
  - Reside on redundant hardware
High data redundancy
  - DB is replicated
Reduced administration
  - Provider maintains database updates and patches

Cloud-Based Databases Disadvantages

Data security concerns
  - …
Performance
  - Data queries travel through the Internet

Cloud-Based Block Storage

Block of data storage
  - Fixed-size sequence of bits
  - Size of block corresponds to an underlying unit of storage
  - Applications with very large blocks of data
Cloud-based block storage device
  - Amazon EBS (Elastic Block Store)
      Block size up to a terabyte
      Reliable, scalable
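
As an illustration of the fixed-size block idea, the sketch below models a block device as an array of equally sized blocks addressed by index. It is a toy in-memory model, not the Amazon EBS API. The names `BlockDevice` and `to_blocks` and the tiny block size used in the example are assumptions chosen for readability.

```python
class BlockDevice:
    """Toy block device: a fixed number of fixed-size blocks, addressed by index."""

    def __init__(self, n_blocks, block_size):
        self.block_size = block_size
        self.blocks = [bytes(block_size) for _ in range(n_blocks)]

    def write_block(self, index, data):
        if len(data) != self.block_size:
            raise ValueError("data must be exactly one block long")
        self.blocks[index] = bytes(data)

    def read_block(self, index):
        return self.blocks[index]


def to_blocks(payload, block_size):
    """Split a payload into fixed-size blocks, zero-padding the last one."""
    return [payload[i:i + block_size].ljust(block_size, b"\0")
            for i in range(0, len(payload), block_size)]


device = BlockDevice(n_blocks=8, block_size=4)      # 4-byte blocks, for illustration only
for i, block in enumerate(to_blocks(b"hello world", 4)):
    device.write_block(i, block)

last = device.read_block(2)
```

The device itself knows nothing about files. Any file system or database layered on top decides which block indexes belong to which object.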


Go raibh maith agat (Thank you)
