Oracle DBAs
Deploying Highly
Available
SQL Server Systems
Joe Yong
Chief Architect
Scalability Experts Inc.
jyong@scalabilityexperts.com
About This Session
Goals
Overview of SQL Server 2005 High Availability
features
Drilldown on HA implementation strategies
Non-goals
Deep dive into SQL Server
Chest thumping
Make you a HA expert
Pre-requisites
Experience as an Oracle DBA, Architect or
Developer DBA
Basic experience in designing, deployment and
managing database systems that require
medium to high levels of availability
Agenda
What is High Availability
SQL Server 2005 HA overview
Solutions to common scenarios
Case study
Summary
What is High Availability
Uninterrupted usability
A running server is not necessarily available
Is a factor of technology, people and processes
Often measured as a percentage in uptime over
1 year
Eg. 99.999% uptime = 5.25 minutes downtime a year
Should includes both planned and unplanned downtime
but many only measure unplanned
You may not own every part of the equation but if
you have to specify your SLA
Example
Online ordering system requires orders to be confirmed in
30 seconds
Availability is impacted by application scalability, network
and database
Dont forget security impact on HA
Agenda
What is High Availability
SQL Server 2005 HA technologies
Hot Standby
Failover Clustering
Database Mirroring
Warm Standby
Database Mirroring High Protection / Performance mode
Replication
Log Shipping
Cold Standby
Backup/restore
Online operations
Solutions to common scenarios
Case study
Summary
Failover Cluster
Microsoft Failover Clustering
Overview
* Inst1
Hot Standby Automatic failover
Protects against local, limited disasters
Built on Microsoft Server Clusters (MSCS)
Multiple nodes provide availability, transparent to client
Supports 2, 4, or 8 nodes depending on OS edition
Automatic detection and failover
Requires cluster certified hardware; see Windows Catalog:
Clustered
Supports many scenarios: Multiple Active Instances, N+1, N+I
Up to 25 SQL Server instances per cluster
NOT a load balancing
solution
N+1: N Active, 1 Inactive
Multiple Active Instances
Instances
* Inst1
Inst3 *
Inst2 *
* Inst1
Inst2 *
N+I: N Active, I Inactive
Instances
Microsoft Failover Clustering
Detail
Geographically Dispersed
Clusters
Same functionality and behavior as standard failover
cluster
Protects against local, total and extended disasters
Requires specially certified cluster hardware from qualified
vendors
Requires guaranteed 500ms maximum round trip latency
between nodes
Site 1
Site 2
Network
SQL Server does not differentiate
between standard and
geo-cluster
N1
N2
N3
Storage Controller S2
Storage Controller S1
D1
D3
N4
Mirror
Mirror
D2
D4
High Availability Toolbox
Disaster Recovery
Features
Local limited
disasters
Local disasters
Extended disasters
Failover
Clusterin
g
With geoclusters
Limited
With geoclusters
Rapid failover
Failure detection
Automatic
switchover
Specialized
hardware
Secondary workload
on standby
(reporting)
Meta data support
Performance impact
Protect against
storage failure
None
Agenda
What is High Availability
SQL Server 2005 HA technologies
Hot Standby
Failover Clustering
Database Mirroring High Availability mode
Warm Standby
Database Mirroring High Protection / Performance mode
Replication
Log Shipping
Cold Standby
Backup/restore
Online operations
Solutions to common scenarios
Case study
Summary
Database Mirroring
Database Mirroring
Overview
Hot Standby
Provides a fault-tolerant database
Building block for complex topologies
Database Failover
Very fast failover
Less than five seconds in most cases
Zero data loss
Automatic or manual failover
Automatic re-sync after failover
Automatic, transparent client redirect
Works with standard certified servers, storage and
networks
No location limitations
No shared components; two separate copies of
data
Database Mirroring
How does it work?
Application
Mirror is always
redoing it
remains current
Witness
(optional)
Commit
Principal
Mirror
SQL Server
2
Log
>2
Data
SQL Server
3
Log
>3
Data
Database Mirroring
Witness
Only required for automatic failover;
Just another instance of SQL Server 2005
Can serve multiple sessions
Prevents split brain scenario
If partners do not see each other, is it due to
network failure or server failure?
To become Principal automatically, a server
must talk to at least one other server
Witness ONLY answers the question Who do
you see?, does not promote a server to be
Principal
Database Mirroring
High Availability mode
Safety Full; synchronous operation
Commit when logged on Mirror
Allows automatic failover
No data loss
Database available whenever quorum
exists
Formed by any two servers from the three;
Principal, Mirror, Witness
Witness is present automatic Failover
Database Mirroring
Transparent Client Redirect
SQLConnection object that targets a mirrored
database
No application code change required
Client automatically redirected if session is
dropped
Client library is aware of Principal and Mirror servers
Upon initial connect to Principal, library caches Mirror
name
When client attempts to reconnect
If Principal is available, connects
If not, client library automatically redirects connection to
Mirror
If Principal is down upon first connect attempt, connection
fails
Workaround via explicit coding or NLB type solution
High Availability Toolbox
Disaster Recovery
Features
Local limited
disasters
Local disasters
Extended disasters
Failover
Clusterin
g
Database
Mirroring
HA
mode
With geoclusters
Limited
With geoclusters
Rapid failover
Failure detection
Automatic
switchover
Specialized
hardware
Secondary workload
on standby
(reporting)
Meta data support
User DB only
None
Minimal ~
low
Performance impact
Protect against
With DB
snapshot
Agenda
What is High Availability
SQL Server 2005 HA technologies
Hot Standby
Failover Clustering
Database Mirroring High Availability mode
Warm Standby
Database Mirroring High Protection / Performance
mode
Replication
Log Shipping
Cold Standby
Backup/restore
Online operations
Solutions to common scenarios
Case study
Summary
Database Mirroring
High Protection mode
Safety Full; synchronous operation
Commit when logged on Mirror
No automatic failover; manual failover only
Database quorum formed by Principal and
Mirror
If Principal loses quorum, it stops servicing
the database
Ensures high protection; database is never in
exposed state
No Witness present no automatic failover
Database Mirroring
High Performance mode
Safety Off; asynchronous operation
Commit when logged on Principal
No automatic failover; manual failover only
Possible data loss
If Mirror becomes unavailable; Principal
continues working
If Principal becomes unavailable; Mirror can
assume workload
Manual failover to Mirror is required
No Witness present; no automatic failover
High Availability Toolbox
Disaster Recovery
Features
Local limited
disasters
Local disasters
Extended disasters
Failover
Clusterin
g
Database
Mirroring
HA
mode
With geoclusters
Limited
With geoclusters
Rapid failover
Database
Mirroring
H P/P
mode
Failure detection
Automatic
switchover
Specialized
hardware
Secondary workload
on standby
(reporting)
Meta data support
User DB only
User DB only
None
Minimal ~
low
Minimal
Performance impact
Protect against
With DB
snapshot
With DB
snapshot
Agenda
What is High Availability
SQL Server 2005 HA technologies
Hot Standby
Failover Clustering
Database Mirroring High Availability mode
Warm Standby
Database Mirroring High Protection / Performance mode
Replication
Log Shipping
Cold Standby
Backup/restore
Online operations
Solutions to common scenarios
Case study
Summary
Transactional Replication
Requires consideration at design time; cannot just
flip the switch
High performance latency measured in seconds
Some (minimal) load on the server
Can be implemented at database or table level
Failover possible; custom designed solution
Two types
Standard transactional replication
Easy to design, setup & manage
Subscriber (standby) can be used for reporting
Peer-to-peer transactional replication
Multi-master model; schema is identical on all sites
Supports distributed applications with data partitioning;
enables load balancing
Does not handle conflicts; design to avoid/prevent conflicts
Peer-To-Peer Transactional
Replication
How does it work?
West
East
Logreader
Agent
Dist
DB
Distribution
Agent
Logreader
Agent
South
Logreader
Agent
Dist
DB
Distribution
Agent
Dist
DB
Distribution
Agent
High Availability Toolbox
Disaster Recovery
Features
Local limited
disasters
Local disasters
Extended disasters
Failover
Clusterin
g
Database
Mirroring
HA
mode
With geoclusters
Limited
With geoclusters
Rapid failover
Database
Mirroring
H P/P
mode
Transaction
al
Replication
Failure detection
Automatic
switchover
Specialized
hardware
Secondary workload
on standby
(reporting)
Meta data support
User DB only
User DB only
Some
None
Minimal ~
low
Minimal
Low
Performance impact
With DB
snapshot
With DB
snapshot
Can automate
in application
Can automate
in application
Agenda
What is High Availability
SQL Server 2005 HA technologies
Hot Standby
Failover Clustering
Database Mirroring High Availability mode
Warm Standby
Database Mirroring High Protection / Performance mode
Replication
Log Shipping
Cold Standby
Backup/restore
Online operations
Solutions to common scenarios
Case study
Summary
Log Shipping
Backup transaction log, copy to secondary
server, restore transaction log backup
Failover is manual
Meta data management may be necessary
Read operations on secondary is permitted
Users are disconnected when log restore occurs
Can maintain multiple secondary servers
Optional Monitor server
Records history and status of backup/restore
jobs
May be setup to raise alerts when jobs fail
High Availability Toolbox
Disaster Recovery
Features
Local limited
disasters
Local disasters
Extended disasters
Failover
Clusterin
g
Database
Mirroring
HA
mode
Database
Mirroring
H P/P
mode
With geoclusters
Limited
With geoclusters
Rapid failover
Transaction
Log
al
Shipping
Replication
Failure detection
Automatic
switchover
Specialized
hardware
Secondary workload
on standby
(reporting)
Meta data support
User DB only
User DB only
Some
User DB
only
None
Minimal ~
low
Minimal
Low
Low
Performance impact
With DB
snapshot
With DB
snapshot
Can automate
in application
Can automate
in application
Can automate
in application
With
limitations
Agenda
What is High Availability
SQL Server 2005 HA technologies
Hot Standby
Failover Clustering
Database Mirroring High Availability mode
Warm Standby
Database Mirroring High Protection / Performance mode
Replication
Log Shipping
Cold Standby
Backup/restore
Online operations
Solutions to common scenarios
Case study
Summary
Backup and Restore
Slowest recovery (but also simplest)
Recommended as secondary or tertiary
protection layer
Manual failure detection and switchover
Data loss possible
Recommend maintaining active backups on
disk; duplicate, archive and offsite backups
on tape
Various levels
Database full, differential, partial, differential
partial, copy-only
File & filegroups full, differential
Backup and Restore
RESTORE VERIFY ONLY
Backup media mirroring
Backup and database page checksums
Fine grained online repair
Online restore
Piecemeal restore
Page-level restore
Database backup does not block Log
backup
Backup/restore includes FullText data
Copy-only via T-SQL only
High Availability Toolbox
Disaster Recovery
Features
Local limited
disasters
Local disasters
Extended disasters
Failover
Clusterin
g
Database
Mirroring
HA
mode
Database
Mirroring
H P/P
mode
With geoclusters
Limited
With geoclusters
Rapid failover
Failure detection
Automatic
switchover
Specialized
hardware
Secondary workload
on standby
(reporting)
Meta data support
User DB only
User DB only
None
Minimal ~
low
Minimal
Performance impact
With DB
snapshot
With DB
snapshot
Transaction
Log
al
Shipping
Replication
Can automate
in application
Can automate
in application
Backup /
Restore
With
limitations
With
limitations
Some
User DB
only
User DB
only
Low
Low
Low
Agenda
What is High Availability
SQL Server 2005 HA technologies
Hot Standby
Failover Clustering
Database Mirroring High Availability mode
Warm Standby
Database Mirroring High Protection / Performance mode
Replication
Log Shipping
Database Snapshot
Cold Standby
Backup/restore
Online operations
Solutions to common scenarios
Case study
Summary
Online Operations
Backup/restore
Full online backup
Online piecemeal restore; undamaged data
remains available
Indexing
Allows create, drop and alter while users
continue to access data
LOB datatype indexes not supported for online
Memory allocations
CPU affinity settings
Database snapshots
Database Snapshot
Not originally designed as a specific HA
solution but works great in some situations
Turning a Database Mirroring mirror into a
reporting server
Isolated historical data for report
generation
Protection in case of administrative,
developer or user error; classic Oops!
scenario
Uses copy-on-write technique to reduce
disk space consumption
Database Snapshot
How does it work?
CREATE DATABASE mydbSnap AS SNAPSHOT OF mydb
USE mydb
UPDATE (pages 4, 9, 10)
mydb Database
Page
1
10
11
12
13
14
15
mydbSnap Read-Only Database Snapshot
USE mydbSnap
SELECT (pages 4, 6, 9, 10, 14)
16
Agenda
What is High Availability
SQL Server 2005 HA technologies
Solutions to common scenarios
Case study
Summary
Rolling Upgrades
In three steps
1. Perform upgrades on the mirror,
secondary,
or subscriber
2. Switch roles
1.
Backup principal log with no-recovery
3.
Re-direct clients to secondary
2. Recover secondary
Database Mirroring
Failover to the mirror
Log Shipping
Replication
Redirect clients to subscriber
3. Perform upgrades on the original database
server
Optional: Switch roles again
Site Disaster Protection
Example Scenarios
Earthquake, fire, or flood causes datacenter
outage
Solutions
Database Mirroring to a secondary site
Optimized solution - Allows very fast failover times to
the secondary site
Optionally add log shipping for additional site
protection
Log Shipping to one or more secondary sites
Basic solution requires additional effort for failover
Third-party geo-clustering solutions for data
center storage level redundancy
Find SQL Server Always On reviewed solutions at the
Database Mirroring
Configuration
In
three steps
Step 1: Restore
database copy to
mirror site with
no-recovery option
Step 2:
Configure
communication
endpoints
Step 3: Set the
data protection
level and Start
Mirroring
Database Query Workload Scale
Out With Redundancy
Scenario
Need for near real time reporting on a second
server that can also be used for disaster
recovery
Need for a tier of identical databases for scaling
out application queries with ability to use any
one of the database copies for disaster recovery
Solutions
Transactional Replication
Peer-to-Peer Replication
Putting It All Together
Failover Clustering
Replication
Local server
redundancy
Full server/instance
protection
Database
Scale Out
For Queries
Database Mirroring
Primary disaster site
for databases
Reporting with
Snapshot
Failover
Log Shipping
Additional disaster
sites for databases
Logical recovery
(with delay)
Replication
Database reporting
and read scale out
with redundancy
Clustering
Production
Database
Database Mirroring
Hot
Standby
Log Shipping
Warm
Standby
Log Shipping
With Restore Delay
Logical
Recovery
Standby
Agenda
What is High Availability
SQL Server 2005 HA technologies
Solutions to common scenarios
Summary
Summary
<<WiP>>
Resources
www.microsoft.com/sql/
msdn.microsoft.com/sqlserver/
www.microsoft.com/technet/
www.scalabilityexperts.com
www.sqldev.net
www.sqlservercentral.com/
High Availability Toolbox
Disaster Recovery
Features
Local limited
disasters
Local disasters
Extended disasters
Failover
Clusterin
g
Database
Mirroring
HA
mode
Database
Mirroring
H P/P
mode
With geoclusters
Limited
With geoclusters
Rapid failover
Failure detection
Automatic
switchover
Specialized
hardware
Secondary workload
on standby
(reporting)
Meta data support
User DB only
User DB only
None
Minimal ~
low
Minimal
Performance impact
With DB
snapshot
With DB
snapshot
Transaction
Log
al
Shipping
Replication
Can automate
in application
Can automate
in application
Backup /
Restore
With
limitations
With
limitations
Some
User DB
only
User DB
only
Low
Low
Low