Log File Sync:
To find out whether storage is the problem:
-> Run an AWR report covering the problematic period.
-> Check "Wait Avg(ms)" for the wait event "log file sync". If it is less than 1s, storage is fine;
if it is more than that, there is an issue with storage, i.e. LGWR is slow to write to disk.
-> Check "Wait Avg(ms)" for the wait event "log file parallel write". If it is less than 20 ms, storage is fine;
if it is more than 20 ms, there is an issue with the storage subsystem, i.e. LGWR is slow to write to disk.
-> Check the LGWR trace files.
If a redo write was slow, you will see a warning like the one below.
*** 2011-10-26 10:14:41.718
Warning: log write elapsed time 21130ms, size 1KB
(set event 10468 level 4 to disable this warning)
-> Check the number of log switches. If the number is high, consider increasing the redo log file size.
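
The trace and log switch checks above can be scripted. The following is a minimal sketch in Python, not an Oracle-supplied tool: the function names are hypothetical, the regular expression is written only against the warning format shown above, and the switch timestamps are assumed to have been fetched already (for example from V$LOG_HISTORY.FIRST_TIME) as datetime values.

```python
import re
from datetime import datetime

# Pattern written against the LGWR warning line quoted in these notes.
WARNING_RE = re.compile(r"Warning: log write elapsed time (\d+)ms, size (\d+)KB")


def slow_log_writes(trace_text):
    """Return (elapsed_ms, size_kb) for every slow-write warning in the trace text."""
    return [(int(ms), int(kb)) for ms, kb in WARNING_RE.findall(trace_text)]


def switches_per_hour(switch_times):
    """Count log switches per clock hour from a list of datetime values."""
    counts = {}
    for ts in switch_times:
        # Truncate each timestamp to the start of its hour.
        hour = ts.replace(minute=0, second=0, microsecond=0)
        counts[hour] = counts.get(hour, 0) + 1
    return counts


if __name__ == "__main__":
    trace = (
        "*** 2011-10-26 10:14:41.718\n"
        "Warning: log write elapsed time 21130ms, size 1KB\n"
        "(set event 10468 level 4 to disable this warning)\n"
    )
    print(slow_log_writes(trace))
    print(switches_per_hour([datetime(2011, 10, 26, 10, 5),
                             datetime(2011, 10, 26, 10, 40)]))
```

What counts as a "high" number of switches per hour is a judgment call for the system in question; the sketch only produces the counts.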
To find issues from the application side:
-> Excessive Application Commits
Excessive commits can cause performance issues, because each commit flushes redo from the redo log buffer to the redo log files. You can confirm this by looking at the AWR report.
In the AWR or Statspack report,
if the average number of user calls per commit/rollback, calculated as "user calls/(user commits+user rollbacks)", is less than 30, then commits are happening too frequently.
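
As a worked illustration of that ratio, here is a small Python sketch. The function names are hypothetical; the three inputs are the "user calls", "user commits" and "user rollbacks" statistics read off the report, and the threshold of 30 is the one quoted above.

```python
def user_calls_per_commit(user_calls, user_commits, user_rollbacks):
    """Return user calls / (user commits + user rollbacks), or None if there were no transactions."""
    transactions = user_commits + user_rollbacks
    if transactions == 0:
        return None
    return user_calls / transactions


def commits_too_frequent(user_calls, user_commits, user_rollbacks, threshold=30):
    """True when the average user calls per commit/rollback falls below the threshold."""
    ratio = user_calls_per_commit(user_calls, user_commits, user_rollbacks)
    return ratio is not None and ratio < threshold


if __name__ == "__main__":
    # Made-up numbers: 1000 calls over 100 transactions gives a ratio of 10.
    print(user_calls_per_commit(1000, 90, 10))
    print(commits_too_frequent(1000, 90, 10))
```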
Adaptive Log File Sync
Adaptive log file sync was introduced in 11.2. The parameter controlling this feature, _use_adaptive_log_file_sync, is set to false by default in 11.2.0.1 and 11.2.0.2.
In 11.2.0.3 the default is now true. When enabled, Oracle can switch between the two methods:
Post/wait: the traditional method, where LGWR posts the foreground process when the write to the redo log completes.
Polling: a new method, where the foreground process checks whether LGWR has completed the write.
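
The difference between the two methods can be pictured with a toy analogy in Python threads. This is only an illustration of the general post/wait and polling patterns, not Oracle's implementation; every name here is hypothetical, and the sleep stands in for the redo write.

```python
import threading
import time


def simulated_lgwr(write_done, write_time=0.01):
    """Pretend to write redo, then signal completion."""
    time.sleep(write_time)  # stands in for the redo write
    write_done.set()        # "post" completion


def wait_post_wait(write_done, timeout=1.0):
    """Post/wait: the foreground sleeps until the writer posts completion."""
    return write_done.wait(timeout)


def wait_polling(write_done, interval=0.001, timeout=1.0):
    """Polling: the foreground repeatedly checks whether the write has finished."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if write_done.is_set():
            return True
        time.sleep(interval)  # back off briefly between checks
    return write_done.is_set()


def run(wait_fn):
    """Start the simulated writer and wait for it using the given method."""
    done = threading.Event()
    writer = threading.Thread(target=simulated_lgwr, args=(done,))
    writer.start()
    result = wait_fn(done)
    writer.join()
    return result


if __name__ == "__main__":
    print(run(wait_post_wait))
    print(run(wait_polling))
```

In the post/wait version the writer has to wake the waiter; in the polling version the waiter finds out on its own schedule, which trades wake-up work on the writer side for extra checks on the waiter side.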
Recommendations for storage
1. Work with the system administrator to examine the filesystems where the redo logs are located, with a view to improving I/O performance.
2. Do not place redo logfiles on a RAID configuration which requires the calculation of parity, such as RAID-5 or RAID-6.
3. Do not put redo logs on Solid State Disk (SSD).
4. Look for other processes that may be writing to the same location, and ensure that the disks have sufficient bandwidth to cope with the required load. If they don't, move either the other activity or the redo logs.