Note 391 - Archiver stuck

Summary
Symptom
All database accesses hang. On the SAP side, only the hourglass is displayed throughout the system. An analysis of the Oracle alert log shows at least one of the following messages:

ORACLE Instance <sid> - Can not allocate log, archival required
Thread 1 cannot allocate new log, sequence <seq_no>
All online logs needed archiving
ARCH: Archival stopped, error occurred. Will continue retrying
ARC0: Archiving not possible: No primary destinations
ARC0: Failed to archive log <log> thread <thread> sequence <seq_nr>
ARC0: Error 19504 Creating archive log file to <log>
ORA-00257: archiver error. Connect internal only, until freed.
ORA-00255: error archiving log <log> of thread <thread>,
          sequence # <seq_nr>
ORA-00270: error creating archive log <log>
ORA-00272: error writing archive log <log>
ORA-16014: log sequence# not archived, no available destination
ORA-16038: log <log> sequence# <seq_nr> cannot be archived
ORA-19502: write error on file "<log>"
          blockno <block> (blocksize=512)
ORA-19504: failed to create file "<log>"
ORA-19510: failed to set size of <blocks> blocks for file "<log>" (blocksize=<size>)
ORA-27044: unable to write the header block of file
ORA-27063: skgfospo: number of bytes read/written is incorrect
ORA-27072: skgfdisp: I/O error
OSD-04008: WriteFile() failure, unable to write to file

An operating system error may also be appended to these entries, for example:

<unix> Error: 28: No space left on device
O/S-Error: (OS 112) There is not enough space on the disk.

Informational messages such as the following may also appear:

ORA-00312: online log thread 1: '<log>'
ORA-00334: archive log: <offline_log>

When you connect to the database, the following error may be displayed:

ORA-03113: end-of-file on communication channel

The following wait events occur:

log file switch (archiving needed)
latch free (on "archive control" latch)
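The alert log messages listed above can be searched for mechanically. The following is a minimal sketch, not an SAP or Oracle tool: the function name scan_alert_log and the two lookup collections are hypothetical, and the codes and phrases are simply copied from the symptom list in this note.

```python
import re
from collections import Counter

# Error codes copied from the symptom list in this note (assumption: extend as needed).
ARCHIVER_CODES = {
    "ORA-00255", "ORA-00257", "ORA-00270", "ORA-00272",
    "ORA-16014", "ORA-16038", "ORA-19502", "ORA-19504",
    "ORA-19510", "ORA-27044", "ORA-27063", "ORA-27072",
    "OSD-04008",
}

# Free-text messages from the same list that carry no error code.
ARCHIVER_PHRASES = (
    "Can not allocate log, archival required",
    "cannot allocate new log",
    "All online logs needed archiving",
    "Archival stopped, error occurred",
    "Archiving not possible",
    "Failed to archive log",
)

# Matches codes such as ORA-00257 or OSD-04008.
CODE_PATTERN = re.compile(r"\b(?:ORA|OSD)-\d{5}\b")


def scan_alert_log(lines):
    """Count archiver-related codes and phrases in an iterable of log lines."""
    hits = Counter()
    for line in lines:
        # Count only codes that belong to the archiver stuck symptom list.
        for code in CODE_PATTERN.findall(line):
            if code in ARCHIVER_CODES:
                hits[code] += 1
        # Phrase matching is case-insensitive.
        lowered = line.lower()
        for phrase in ARCHIVER_PHRASES:
            if phrase.lower() in lowered:
                hits[phrase] += 1
    return hits
```

Because an open file object is an iterable of lines, the function can be called as scan_alert_log(open(path)), where path points to the alert log of your instance. An empty result does not rule out an archiver stuck situation; it only means that none of the listed messages were found.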
Other terms
Performance hangs, hourglass, update
Reason and Prerequisites
The Oracle archiver saves the online redo logs of the database as offline redo logs in the SAPARCH or ORAARCH directory. From there, BRARCHIVE can back up the offline redo logs to tape. If the file system of the archive directory is full (or if other problems occur), the online redo logs can no longer be archived properly. To avoid data loss, Oracle does not allow an online redo log that has not yet been archived to be overwritten. Instead, the database halts until archiving succeeds again.
The following causes may be responsible for an archiver stuck situation:
    1. The SAPARCH or ORAARCH directory or file system is full.
    2. The user ora<sid> has no authorization to write to the SAPARCH or ORAARCH directory.
    3. Archiving was not started correctly.
    4. Unusually high number of redo logs.
    5. Problems accessing secondary archive destinations.
    6. Oracle bug.
    7. Corruption in a redo log.
    8. Other causes.
Solution
    1. Free space in the directory of the offline redo logs, for example by starting BRARCHIVE or by deleting large dummy files that are no longer required. The directory is defined in the Oracle parameter LOG_ARCHIVE_DEST or LOG_ARCHIVE_DEST_1. It is usually /oracle/<sid>/saparch or /oracle/<sid>/oraarch.
              If enough space is available in another directory, you can temporarily change the archive destination using the following commands (depending on which parameter you use):

ALTER SYSTEM ARCHIVE LOG STOP;
If you use LOG_ARCHIVE_DEST_1:
ALTER SYSTEM SET LOG_ARCHIVE_DEST_1 = 'LOCATION=<temp_path>';
If you use LOG_ARCHIVE_DEST:
ALTER SYSTEM SET LOG_ARCHIVE_DEST = '<temp_path>';

ALTER SYSTEM ARCHIVE LOG START;
              After freeing space in the standard directory, you can copy the offline redo logs that were written to the temporary directory back to the standard directory.
              We recommend that you use BRARCHIVE 7.00 or higher so that BRARCHIVE archives all offline redo logs without gaps if the archive destination is changed temporarily. As of this release, BRARCHIVE can find offline redo logs in different directories (based on V$ARCHIVED_LOG). Older versions only save the redo logs from the currently specified archive destination. As a result, gaps may occur that can only be closed manually. Make sure that the redo logs in the temporary archive destination have been saved correctly before you delete this directory.
    2. Check the write authorization of the user ora<sid> for the directory with the offline redo logs (SAPARCH or ORAARCH). On UNIX, for example, ora<sid> must be able to create a dummy file abc in the SAPARCH directory using "touch abc". If this command fails, ora<sid> does not have write authorization. In this case, change the access authorization for this directory.
    3. In the process list, search for processes named ora_arch_<sid> or ora_arc_<sid>. If no such process exists, you can start it in SQLPLUS using

    ALTER SYSTEM ARCHIVE LOG START;

              In addition, with Oracle <= 9i, check whether the parameter LOG_ARCHIVE_START is set to TRUE in the Oracle profile. This parameter activates both the ora_arch_<sid> process and automatic archiving.
    4. If the "archiver stuck" situation was caused by an unusually high number of redo logs, refer to Note 584548.
    5. If, in addition to the standard archive destination, additional destinations are defined with MANDATORY (LOG_ARCHIVE_DUPLEX_DEST, LOG_ARCHIVE_DEST_2, LOG_ARCHIVE_DEST_3, ...), problems that occur when you access these destinations may also cause the archiver to stick. The underlying problems can be entirely unrelated to disk space (for example, network problems that prevent a connection to a remote destination). In this case, check the alert log for error messages.
    6. If the "archiver stuck" situation does not resolve itself even though space is available again in the archive directory, this may be due to the Oracle bug described in Note 932251.
    7. If a redo log is corrupt, the archiver cannot save it. In this case, Oracle deactivates the archiving, for example:

    ORA-16038: log 3 sequence# 27168 cannot be archived
    ORA-00354: corrupt redo log block header
    ORA-00312: online log 3 thread 1: '/oracle/<sid>/origlogA/log_g13m1.dbf'
    ORA-00312: online log 3 thread 1: '/oracle/<sid>/mirrlogA/log_g13m2.dbf'
    Mon Oct 24 11:27:40 2011
    ARCH: Archival stopped, error occurred. Will continue retrying
    Mon Oct 24 11:27:40 2011
    ORACLE Instance <sid> - Archival Error
              In this case, refer to SAP Note 694155 and, if necessary, deactivate the archiving of the corrupt redo log by using CLEAR UNARCHIVED LOGFILE. Since the missing redo log makes a recovery impossible, you must create a good backup immediately afterwards.
    8. In individual cases, other problems that occur when the system writes offline redo logs are responsible for the "archiver stuck" situation. In the worst case, a failed control file enhancement within the archive history may cause the archiver to stick (Note 904490). You should therefore always check the alert log for other error messages that give you more detailed clues as to the cause of the "archiver stuck" situation.
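The checks from the first two solution steps (free space in the archive directory and write authorization for it) can be sketched as a small script. This is an illustration under assumptions, not part of BR*Tools: the function name check_archive_dir and the min_free_bytes threshold are hypothetical, and the script must run as the user ora<sid> for the write test to be meaningful.

```python
import os
import shutil
import tempfile


def check_archive_dir(path, min_free_bytes):
    """Report free space and writability of an archive directory.

    path is the archive directory (for example the saparch or oraarch
    directory); min_free_bytes is a hypothetical threshold chosen by you.
    """
    # Free space of the file system that contains the directory.
    usage = shutil.disk_usage(path)
    try:
        # Equivalent of "touch abc": create a probe file and remove it again.
        fd, probe = tempfile.mkstemp(prefix="arch_probe_", dir=path)
        os.close(fd)
        os.remove(probe)
        writable = True
    except OSError:
        # Covers missing write authorization as well as a completely full disk.
        writable = False
    return {
        "free_bytes": usage.free,
        "enough_space": usage.free >= min_free_bytes,
        "writable": writable,
    }
```

A result with "writable" set to False points to the authorization problem described in step 2 or to a completely full file system; "enough_space" set to False points to step 1. Neither result replaces a check of the alert log.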
Header Data
Release Status: Released for Customer
Released on: 04.01.2012 08:02:58
Master Language: German
Priority: Recommendations/additional info
Category: Consulting
Primary Component: BC-DB-ORA Oracle
Secondary Components: BC-DB-ORA-DBA Database Administration with Oracle
Affected Releases
Release-Independent
Related Notes
932251 - ARCH no longer archives after full file system
904490 - Automatic enhancement of control files
863417 - FAQ: Database Archive modes and redo logs
767414 - FAQ: Oracle latches
694155 - Error due to corrupt redo logs
619188 - FAQ: Oracle wait events
584548 - Unusually high number of redo logs
546006 - Problems with Oracle due to operating system errors
541538 - FAQ: Reorganization
521264 - Hang situations
