Thursday, November 27, 2014

Troubleshooting Log File Sync Waits

If you notice a very large number of waits with a short average wait time per wait, that’s an indication that too many commit statements are being issued by the database. You must change the commit behavior by batching the commits.
Instead of committing after each row, for example, you can specify that the commits occur after every 500 rows.

If you notice that the large amount of wait time accumulated due to the redo log file sync event was caused by long waits for writing to the redo log file (high average time waited for this event), it’s more a matter of how fast your I/O subsystem is. You can alternate the redo log files on various disks to reduce contention. You can also see if you can dedicate disks entirely for the redo logs instead of allowing other files on those disks—this will reduce I/O contention when the LGWR is writing the buffers to disk. Finally, as a long-term solution, you can look into placing redo logs on faster devices, say, by moving them from a RAID 5 to a RAID 1 device.

How It Works

Oracle (actually the LGWR background process) automatically flushes a session’s redo information to the redo log file whenever a session issues a COMMIT statement. The database writes commit records to disk before it returns control to the client. The server process thus waits for the completion of the write to the redo log. 


The log file sync wait event includes the wait during the writing of the log buffer to disk by LGWR and the posting of that information to the session. The server process will have to wait until it gets confirmation that the LGWR process has completed writing the log buffer contents out to the redo log file.

The log file sync events are caused by contention during the writing of the log buffer contents to the redo log files. Check the V$SESSION_WAIT view to ascertain whether Oracle is incrementing the SEQ# column. If Oracle is incrementing this column, it means that the LGWR process is the culprit, as it may be stuck.

P1 = buffer#
All changes up to this buffer number (in the log buffer) must be flushed to disk and the writes confirmed to ensure that the transaction is committed and will be kept on an instance crash. The wait is for LGWR to flush up to this 

As the log file sync wait event is caused by contention caused by the LGWR process, see if you can use the NOLOGGING option to get rid of these waits. Of course, in a production system, you can’t use the NOLOGGING option when the database is processing user requests, so this option is of limited use in most cases.

No comments:

Post a Comment