Understanding LGWR, Log File Sync Waits and Commit Performance
Tanel Pder http://blog.tanelpoder.com http://tech.e2sn.com
Enterprise 2.0 Service Networks
2009-2010 www.e2sn.com
WWW.E2SN.COM
Intro: About me
Tanel Pder
http://tech.e2sn.com - My company and technical Oracle stuff http://blog.tanelpoder.com - Personal Blog (more tech stuff) tanel@tanelpoder.com - Questions & enquiries
Consulting, Training, Seminars tanel@tanelpoder.com http://tech.e2sn.com/oracle-training-seminars
Online seminars coming soon!
Enterprise 2.0 Service Networks
2009-2010 www.e2sn.com
WWW.E2SN.COM
Topics
How does commit & log file sync work
Overview
Reasons for too long log file sync waits
How to measure wheres the problem?
Enterprise 2.0 Service Networks
2009-2010 www.e2sn.com
WWW.E2SN.COM
Reasons for log file sync waits
Commits wait for log file sync by default
User commits
Theres an user commits statistic in v$sesstat
DDL
Resulting recursive transactions commit
Recursive data dictionary DML
Rollbacks wait too!
User rollbacks
User/application issued a rollback command
Transaction rollbacks
We had an internal rollback (because of some failure) Space allocation/ASSM problems, cancelled queries, killed sessions
Enterprise 2.0 Service Networks
2009-2010 www.e2sn.com
WWW.E2SN.COM
Commit & log file sync flow idealistic overview
Time
1) User issues a COMMIT 2) Foreground proc posts LGWR 3) LGWR issues the physical write syscall 5) LGWR posts the FG proc.
6) Commit complete
FG proc.
LGWR IO
4) The physcical write syscall completes
log file parallel write log file sync
Enterprise 2.0 Service Networks 2009-2010 www.e2sn.com
WWW.E2SN.COM
Log file sync performance -> disk IO speed
Time
FG proc.
LGWR IO
The physical write IO (log file parallel write) takes most of the time
Most of the log file sync time was spent waiting on log file parallel write The other components, scheduling latency, IPC were small
WWW.E2SN.COM
Enterprise 2.0 Service Networks
2009-2010 www.e2sn.com
Log file sync performance -> scheduling latency
Time
1) User issues a COMMIT
FG proc.
2) LGWR waits in CPU runqueue
6) Foreground proc gets posted, gets onto CPU runqueue
LGWR IO
3) LGWR submits the IO, goes to sleep
4) IO completes, OS puts LGWR to CPU runqueue
WWW.E2SN.COM
5) LGWR gets onto CPU, posts foreground proc.
Enterprise 2.0 Service Networks
2009-2010 www.e2sn.com
Log file sync flow
1. Foreground process (FG) posts LGWR and goes to sleep
The log file sync wait starts Posting is done via a semaphore operation on Unix/Linux
2. LGWR wakes up, gets onto CPU
Issues the IO request(s) LGWR goes to sleep, waiting for log file parallel write wait
3. Hardware completes the IO and OS wakes up LGWR
LGWR gets onto CPU Marks log file parallel write event complete and posts the FG
4. Foreground process is woken up by LGWR post
Foreground process gets onto CPU and completes the log file sync wait
Enterprise 2.0 Service Networks 2009-2010 www.e2sn.com
WWW.E2SN.COM
Measuring LGWR "speed"
SQL> @snapper out 1 10 1096 -- Session Snapper v2.01 by Tanel Poder ( http://www.tanelpoder.com ) --------------------------------------------------------------------------------SID, USERNAME , TYPE, STATISTIC , DELTA, HDELTA/SEC, %TIME, GRAPH --------------------------------------------------------------------------------1096, (LGWR) , STAT, messages sent , 12 , 12, 1096, (LGWR) , STAT, messages received , 10 , 10, 1096, (LGWR) , STAT, background timeouts , 1 , 1, 1096, (LGWR) , STAT, physical write total IO requests , 40, 40, 1096, (LGWR) , STAT, physical write total multi block request, 38, 38, 1096, (LGWR) , STAT, physical write total bytes, 2884608 , 2.88M, 1096, (LGWR) , STAT, calls to kcmgcs , 20 , 20, 1096, (LGWR) , STAT, redo wastage , 4548 , 4.55k, 1096, (LGWR) , STAT, redo writes , 10 , 10, 1096, (LGWR) , STAT, redo blocks written , 2817 , 2.82k, 1096, (LGWR) , STAT, redo write time , 25 , 25, 1096, (LGWR) , WAIT, LGWR wait on LNS , 1040575 , 1.04s, 104.1%, |@@@@@@@@@@| 1096, (LGWR) , WAIT, log file parallel write , 273837 , 273.84ms, 27.4%,|@@@ | 1096, (LGWR) , WAIT, events in waitclass Other , 1035172 , 1.04s , 103.5%,|@@@@@@@@@@| -- End of snap 1, end=2010-03-23 12:46:04, seconds=1
Enterprise 2.0 Service Networks
2009-2010 www.e2sn.com
WWW.E2SN.COM
LGWR and Asynch IO
oracle@linux01:~$ strace -cp `pgrep -f lgwr` Process 12457 attached - interrupt to quit ^CProcess 12457 detached % time seconds usecs/call calls errors This is what the log ------ ----------- ----------- --------- --------file263 parallel write wait 100.00 0.010000 38 3 0.00 0.000000 213 event0measures AIO 0.00 0.000000 0 8 reaping duration 0.00 0.000000 0 701 0.00 0.000000 0 41 0.00 0.000000 0 41 This system call is not 0.00 0.000000 0 2 0.00 0.000000 0 37 instrumented by wait ------ ----------- ----------- --------- --------Interface! 100.00 0.010000 1081 3
syscall -------------semtimedop times getrusage gettimeofday io_getevents io_submit semop semctl -------------total
Enterprise 2.0 Service Networks
2009-2010 www.e2sn.com
WWW.E2SN.COM
10
Warning
The (background processes) IO instrumentation has quite a few bugs
Different IO modes, sync, asynd, direct, buffered etc
On some versions, the log file parallel write (and db file parallel write) arent properly instrumented
Version dependent, for example 11.2.0.1 on Linux When filesystemio_options = NONE, IO syscall waits are instrumented ok (but you dont wan to use this option) When filesystem_io_options = ASYNC, IO reaping waits are all very short
However theres unaccounted time in LGWRs wait profile
When filesystem_io_options = SETALL, IO reaping waits are instrumented properly
Enterprise 2.0 Service Networks 2009-2010 www.e2sn.com
WWW.E2SN.COM
11
Redo, commit related latches and tuning
Redo related latches
redo allocation latches
Protect allocating space in log buffer / RBA ranges in redolog stream
redo copy latches
Used only for keeping track of whether anyones copying data into redo log buffer so that LGWR would know to wait for these memory copies to complete before it tries to write buffers to disk LGWR will wait for LGWR wait for redo copy wait event in such cases Used to be tuned by _log_simultaneous_copies
Should we tune any of these?
No, we should fix only problems which exist In other words, if wait interface doesnt show anyone waiting for them, then dont bother tuning them!
WWW.E2SN.COM
Enterprise 2.0 Service Networks
2009-2010 www.e2sn.com
12
Instrumentation
Wait Events:
log file sync log file parallel write log file single write
Performance Counters (V$SESSTAT, V$SYSSTAT)
redo size redo writing time user commits user rollbacks transaction rollbacks
Enterprise 2.0 Service Networks
2009-2010 www.e2sn.com
WWW.E2SN.COM
13
Wait event: log buffer space
(Not a commit problem) LGWR is too slow flushing redo log buffer contents to disk
Either because too slow IO subsystem Or LGWR not getting enough (quality) CPU time Sometimes pops up due large (unplanned) transactions
Of course, it can also be because of a too small log buffer
Which is not the case anymore in modern days Log buffer is usually multiple MB due how it is allocated from SGA You shouldnt even set the log_buffer parameter in 10g+
Enterprise 2.0 Service Networks
2009-2010 www.e2sn.com
WWW.E2SN.COM
14
Wait event: log file single write
Single block redo IO is used mostly for logfile header block reading/writing
Log switch is the main cause Archiving as well as it updates log header Who wait: LGWR & ARCH
Example of what LGWR does during a log switch:
WAIT #0: nam='log file sequential read' ela= 12607 log#=0 block#=1 WAIT #0: nam='log file sequential read' ela= 21225 log#=1 block#=1 WAIT #0: nam='control file sequential read' ela= 358 file#=0 WAIT #0: nam='log file single write' ela= 470 log#=0 block#=1 WAIT #0: nam='log file single write' ela= 227 log#=1 block#=1
Enterprise 2.0 Service Networks
2009-2010 www.e2sn.com
WWW.E2SN.COM
15
LGWR trace warnings
Starting from 10.2.0.3 (or was it 10.2.0.4) LGWR is trying to be helpful and dump warnings when the actual log write IO takes too long:
New parameter: _side_channel_batch_timeout_ms timeout before shipping out the batched side channelmessages in milliseconds
LGWR trace file:
*** 2010-03-10 11:36:06.759 Warning: log write time 690ms, size 19KB *** 2010-03-10 11:37:23.778 Warning: log write time 52710ms, size 0KB *** 2010-03-10 11:37:27.302 Warning: log write time 3520ms, size 144KB
Enterprise 2.0 Service Networks 2009-2010 www.e2sn.com
WWW.E2SN.COM
16
Log file sync in Statspack / AWR
How much of the end-to-end response time goes to log file sync?
How big it is compared to the full response time of the end user? Log file sync may take 20% of your DB Time but the DB Time itself may take only 10% of the total end user response time!
Enterprise 2.0 Service Networks
2009-2010 www.e2sn.com
WWW.E2SN.COM
17
Log file sync in Statspack / AWR
If log file sync waits take a significant part of the response time, look into the Avg wait (ms) column:
Top 5 Timed Events Avg %Total ~~~~~~~~~~~~~~~~~~ wait Call Event Waits Time (s) (ms) Time ------------------------------------- ------------ ----------- ------ -----PL/SQL lock timer 57,159 8,754 153 68.8 db file sequential read 61,258 1,562 25 12.3 log file sync 5,873 1,522 259 12.0 CPU time 463 3.6 direct path write 235,606 155 1 1.2 ............ %Tim Total Wait out Time (s) ---- ---------0 108 0 39 0 4 Avg %Total wait Waits Call (ms) /txn Time ------ -------- -----8 0.9 .8 11 0.2 .3 16 0.0 .0
Event Waits ------------------------ -----------log file parallel write 13,292 db file parallel write 3,470 db file sequential read 257
Enterprise 2.0 Service Networks
2009-2010 www.e2sn.com
WWW.E2SN.COM
18
Better breakdown of wait times V$EVENT_HISTOGRAM
Instead of a single wait time average, breaks wait times into buckets
SQL> select event, wait_time_milli,wait_count 2 from v$event_histogram 3 where event = 'log file parallel write'; EVENT WAIT_TIME_MILLI WAIT_COUNT ------------------------- --------------- ---------log file parallel write 1 22677 log file parallel write 2 424 log file parallel write 4 141 log file parallel write 8 340 log file parallel write 16 1401 log file parallel write 32 812 log file parallel write 64 391 log file parallel write 128 21 log file parallel write 256 6
WWW.E2SN.COM
Enterprise 2.0 Service Networks
2009-2010 www.e2sn.com
19
Identifying scheduling latency
Easy on Solaris prstat m
Works since Solaris 8
Timed_os_statistics
OS Wait-cpu (latency) time in v$sesstat
On other OSes, it cant be directly measured with standard tools Indirectly, you can look into system wide average CPU runqueue length and assume that LGWR was also queueing
Not a too systematic approach, huh?
Enterprise 2.0 Service Networks
2009-2010 www.e2sn.com
WWW.E2SN.COM
20
Measure scheduling latency (Solaris) Reading thread-level microstate accounting data with prstat:
USR SYS TRP TFL/DFL LCK SLP LAT % % % % % % % Time spent on CPU in user mode Time spent on CPU in kernel mode Time spent processing system traps (CPU traps) Time spent processing text/data page faults Time spent waiting for user locks Time spent sleeping (other than user locks) CPU scheduling latency
# prstat -mLp 5124 PID USERNAME USR 5124 oracle 4.2 5124 oracle 0.0 5124 oracle 0.0 5124 oracle 0.0 5124 oracle 0.0 5124 oracle 0.0 5124 oracle 0.0 5124 oracle 0.0 5124 oracle 0.0
SYS 1.6 0.6 0.6 0.6 0.6 0.0 0.0 0.0 0.0
TRP 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
TFL 0.6 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
DFL 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
LCK 0.0 99 99 99 99 0.0 100 100 100
SLP 94 0.0 0.0 0.7 0.0 100 0.0 0.0 0.0
LAT VCX ICX SCL SIG PROCESS/LWPID 0.1 412 20 3K 0 oracle/1 0.0 44 3 90 0 oracle/11 0.0 44 0 90 0 oracle/7 0.0 46 0 90 0 oracle/9 0.0 44 1 90 0 oracle/5 0.0 24 0 7 0 oracle/2 0.0 1 0 5 0 oracle/3 0.0 1 0 5 0 oracle/10 0.0 1 0 5 0 oracle/8
Enterprise 2.0 Service Networks
2009-2010 www.e2sn.com
WWW.E2SN.COM
21
Bugs, problems
No instrumentation
On some version/platform/IO configuration the wait interface doesnt record log file parallel write waits at all
The same goes for db file parallel writes (Ive noticed it on 9.2-10.2.0.x on Solaris for example)
For LGWR you can use V$SESSTAT redo write time statistic instead
Its in centiseconds
1-second log file sync bug
Most log file syncs took ~1 second to complete The posts sent back by LGWR were missed by foreground process Thus the FG always waited until the 1 second log file sync wait timeout happened
Enterprise 2.0 Service Networks
2009-2010 www.e2sn.com
WWW.E2SN.COM
22
Tuning
No need for tuning!
Log buffer is quite large by default
All memory remaining in a granule after the allocation for fixed SGA is given to log buffer
Oracle used to have a single redo log buffer until v9.0
Redo allocation latch could become the ultimate contention point
Since 9.2, Oracle can have the log buffer split into multiple buffers
Each protected by a separate redo allocation latch
From 10g, Oracle can keep lots of small private redo strands in shared pool
Each protected by a separate redo allocation latch @rs.sql Show redo strands available
WWW.E2SN.COM
Enterprise 2.0 Service Networks
2009-2010 www.e2sn.com
23
Evil tuning
If you dont care about the D in ACID (and want to occasionally lose data for fun), then:
10gR1:
commit_logging transaction commit log write behavior
10gR2:
commit_write commit_wait transaction commit log write behavior transaction commit log wait behavior
Old undocumented stuff
_wait_for_sync wait for sync on commit MUST BE ALWAYS TRUE
Old: Put redologs to /tmp (on Solaris) or in-memory disks (/dev/shm) for duration of a migration/upgrade
If your OS / server crashes, youll need to restore from a backup!
Enterprise 2.0 Service Networks 2009-2010 www.e2sn.com
WWW.E2SN.COM
24
Optimizations for working around bad applications
Commit optimization
In PL/SQL since Oracle 9i The log file sync is deferred until the end of the PL/SQL call!
SQL> exec while true loop update t set a=a+1 ; commit ; end loop;
No log file sync waits log buffer space / log file switch completion waits more likely!
Enterprise 2.0 Service Networks
2009-2010 www.e2sn.com
WWW.E2SN.COM
25
LGWR configuration CPU
Prevent priority decay
Put LGWR into fixed priority scheduling class (FX60 on Solaris) LGWR should get onto CPU faster when waking up LGWR isnt thrown off CPU as likely
If LGWR is still experiencing significant scheduling latency
You can put LGWR into a higher priority class
You should not put LGWR into the highest real-time class
Real time is tricky your process can monopolize a CPU for itself You dont want to make LGWR pre-empt the OS kernel! Note that Oracle sets some processes into higher priority by default:
_high_priority_processes
Enterprise 2.0 Service Networks 2009-2010 www.e2sn.com
LMS*|VKTM 26
WWW.E2SN.COM
LGWR configuration - IO
Reduce the amount of work and waiting a log file parallel write has to do
Unbuffered concurrent IO
Verify with truss/strace whether proper flags are used (O_DIRECT, O_DIO, O_CIO etc)
Or use raw devices
ASM is essentially a raw device
Or ODM for some cases
And optimize the whole IO hardware stack, of course!
Note shat mid-large size storage arrays do have write cache built in So, moving redo log files to SSD may not give any advantage! Verify whats your current log file parallel write latency using v$event_histogram
Enterprise 2.0 Service Networks 2009-2010 www.e2sn.com
WWW.E2SN.COM
27
log file sync magic tuning super-secret!!!
COMMIT LESS!!!
Commit when your business transaction ends, not after every single update!
Enterprise 2.0 Service Networks 2009-2010 www.e2sn.com
WWW.E2SN.COM
28
Summary
Application: Commit less!
Ideally only when your logical business transaction ends
Troubleshooting: Measure log file sync at session level detail CPU: If waits for log file sync are significant - see whether LGWR gets:
Enough (quality) CPU time Onto the CPU fast enough
IO: See how much LGWR waits for log file parallel write event
Whats the log file parallel write completion time V$EVENT_HISTOGRAM for better detail
Enterprise 2.0 Service Networks 2009-2010 www.e2sn.com
WWW.E2SN.COM
29
Thanks!
Download slides from:
http://tech.e2sn.com/oracle-slides-and-whitepapers
Download Snapper from:
http://tech.e2sn.com/oracle-scripts-and-tools/session-snapper
Blog:
http://blog.tanelpoder.com
Email:
tanel@tanelpoder.com
Enterprise 2.0 Service Networks
2009-2010 www.e2sn.com
WWW.E2SN.COM
30