SPOF Checklist

Purpose

There are a number of single points of failure (SPOFs) in most systems and you should be aware of these before you start to build high availability into your system. However, what constitutes a SPOF depends on your particular system configuration. For example, a disk drive might be a SPOF in a given system configuration but, when mirrored, no longer be a SPOF. The major SPOFs are listed below, grouped into main system areas.

See also the table in “What is System Failure” in SAP System Failures, which lists the components of an SAP system by system level.

Prerequisites

SAP suggests that, for each component of a planned or installed SAP system listed in this process, you assess the following:

        Is the component a SPOF in your particular system configuration?

        Can you afford the risk of failure for a particular SPOF?

Note

Risk analysis using checklists

Before you start to build high availability into the systems at your site, SAP strongly advises you to quantify which of the items listed in this SPOF checklist and in the general checklist are relevant for you. Once you have ranked the vulnerable aspects of your system according to the costs of failure (for example, whether you need to call in an external engineer, order a replacement part, and so on), you are in the best position to start improving availability.
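The ranking described in this note can be sketched in a few lines of Python. This is only an illustration, assuming that you can estimate a failure rate, a downtime, and a repair cost for each item. The class `Spof`, its fields, and all figures are hypothetical and do not come from SAP.

```python
from dataclasses import dataclass


@dataclass
class Spof:
    """One checklist item with rough, site-specific estimates (all assumed)."""
    name: str
    failures_per_year: float       # estimated failure frequency
    hours_down_per_failure: float  # estimated downtime per failure
    cost_per_hour: float           # business cost of one hour of downtime
    repair_cost: float             # e.g. external engineer, replacement part

    def expected_annual_cost(self) -> float:
        # Cost of one failure = downtime cost + repair cost.
        per_failure = self.hours_down_per_failure * self.cost_per_hour + self.repair_cost
        return self.failures_per_year * per_failure


def rank_spofs(items):
    """Return the items ordered from most to least expensive per year."""
    return sorted(items, key=lambda s: s.expected_annual_cost(), reverse=True)


# Hypothetical figures, for illustration only.
checklist = [
    Spof("Unmirrored data disk", 0.5, 8.0, 1000.0, 500.0),
    Spof("Message service host", 0.2, 2.0, 1000.0, 0.0),
    Spof("Network switch", 0.1, 4.0, 1000.0, 300.0),
]

for spof in rank_spofs(checklist):
    print(f"{spof.name}: {spof.expected_annual_cost():.0f} per year")
```

The items at the top of the resulting list are the candidates to protect first.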

Process Flow

       1.      You consider redundant configuration of the SAP services dialog, update, batch, gateway, and spool (that is, running each service on multiple host machines) to improve availability. When configured in this way, these services are no longer single points of failure.

You can improve the availability of the message service by using switchover software.

For more information, see SAP Web AS ABAP: High Availability and SAP Web AS Java: High Availability.
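As a rough aid for this step, the following sketch lists the services that run on fewer than two hosts in a planned layout. The function `find_spof_services`, the host names, and the layout are hypothetical. In a real landscape you would take the placement from your own system documentation.

```python
def find_spof_services(placement):
    """Return the services that run on fewer than two distinct hosts."""
    return sorted(
        service
        for service, hosts in placement.items()
        if len(set(hosts)) < 2
    )


# Hypothetical layout: service name -> hosts on which it is configured.
placement = {
    "dialog": ["hostA", "hostB"],
    "update": ["hostA", "hostB"],
    "batch": ["hostB", "hostC"],
    "gateway": ["hostA", "hostB"],
    "spool": ["hostC"],
    "message": ["hostA"],
}

print(find_spof_services(placement))  # prints ['message', 'spool']
```

In this example, spool can be fixed by configuring it on a second host. The message service is the case that the text above addresses with switchover software rather than with redundant configuration.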

       2.      You consider configuration of the database service to overcome its single points of failure:

        Loss of connection between application service and database service. Use DB Reconnect to overcome this problem.

        Loss of database data. For more information about this problem, see Replicated Databases. This is also discussed for each database manufacturer below.
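The general idea behind reconnecting after a lost database connection can be illustrated with a simple retry loop. This is a generic sketch and not the implementation of DB Reconnect. The function `connect_with_retry`, its parameters, and the `flaky_connect` stand-in are assumptions made for the example.

```python
import time


def connect_with_retry(connect, max_trials=3, sleep_seconds=1.0, sleep=time.sleep):
    """Call connect() until it succeeds or max_trials attempts have failed."""
    last_error = None
    for trial in range(1, max_trials + 1):
        try:
            return connect()
        except ConnectionError as exc:
            last_error = exc
            if trial < max_trials:
                sleep(sleep_seconds)  # wait before the next attempt
    raise ConnectionError(
        f"database unreachable after {max_trials} trials"
    ) from last_error


# Hypothetical stand-in for a database connection that fails twice.
attempts = {"count": 0}


def flaky_connect():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("connection lost")
    return "connected"


# The sleep function is replaced so that the example runs without delay.
result = connect_with_retry(flaky_connect, max_trials=5, sleep=lambda seconds: None)
print(result, attempts["count"])  # prints: connected 3
```

A retry of this kind only bridges short interruptions. It does not protect against loss of database data, which is the subject of the second point above.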

       3.      You consider the database-specific recommendations in the following table:

Database

Single Points of Failure (SPOFs)

Oracle

        Database instance

        Database background processes (DBWR, LGWR, SMON, PMON...)

        Memory structures (SGA, semaphores)

You can protect the database instance using Switchover Software or Replicated Database Servers (only available for certain databases).

        Database files

        Control file

        Current online redo log file

        Data files

You can protect the control file and the current online redo log file by using Oracle or proprietary disk mirroring. You can protect the data files by using disk mirroring. You should also protect all files by performing backups.

You can use Oracle Standby Databases for a more comprehensive high availability solution that can withstand a disaster at one site.

Informix

        Database instance

You can protect the database instance using Switchover Software.

        Database data

You can protect all relevant files by using Informix or proprietary disk mirroring. SAP strongly recommends some form of mirroring (preferably Informix) for, at the very least, the “critical” dbspaces (logdbs, physdbs and rootdbs). In any case, you should also perform regular archives and backups.

See also Informix High-Availability Data Replication (HDR).

MySQL MaxDB

        Database instance

You can protect the database instance using Switchover Software.

        Database data

You can protect all relevant devices by using proprietary disk mirroring (RAID 1 preferred) for all data volumes and log volumes. If you want to use log volumes without RAID mirroring, consider mirroring them using log mode. In any case, you should also perform regular backups.

You can use the MaxDB Standby Database for a more comprehensive high availability solution that can withstand a disaster at one site.

IBM DB2 Universal Database for UNIX and Windows

        Database instance

You can protect the database instance using Switchover Software.

        Database data

You should always perform regular backups.

See also Replicated Standby Database for DB2 UDB for UNIX and Windows.

IBM DB2 Universal Database for z/OS

 

        Database instance

You can protect the database instance using Data Sharing for DB2 UDB for z/OS.

        Database data

You can protect the data by performing regular backups and using disk mirroring. You can also use a standby database to protect the data against disaster.

See also Replicated Standby Database for DB2 UDB for z/OS and Data Sharing for DB2 UDB for z/OS.

IBM DB2 Universal Database for iSeries

        Database instance

You can protect the database instance using Switchover Software.

        Database data

You should always perform regular backups.

MS SQL Server

See the following high availability solutions:

        Microsoft Cluster Server on Windows

        Microsoft SQL Server Standby Database

        Comprehensive Microsoft SQL Server High Availability Solution

       4.      You consider network-specific recommendations for:

        Cabling

        Active components (hubs, switches, routers)

        Network Interface Card (NIC)

        SAProuter

        Network File System (NFS) – see “Single Points of Failure” in SAP System Failures

       5.      You consider hardware and system software. For more information, see the table in section “What is System Failure” of SAP System Failures, which lists the components of an SAP system by system level.

       6.      You consider disk technology.

Possible single points of failure in the hardware of a disk system include the following:

        Power supply

        Fan and cooling

        Internal/external cabling

        SCSI path from host machine to device

        Internal system bus

        Write-cache:

         Non-volatile SIMMs or battery backup serve to address power failure

         Mirrored SIMMs to address SIMM failure

        Read-cache: non-volatile SIMMs optional

        Battery power for the device to store cache to disk in case of power failure

        Controller

        Microcode

        Disk-internal storage processors

        RAID internal storage maps

        Disk spindles

        Spindle mechanism

Possible single points of failure in the disk-based data are the following:

        SAP user data

        SAP system data

        Software components:

         The SAP system

         DBMS and log files

         Operating system and swap space

        Root file system
