SAP System Failures

Definition

An SAP system failure occurs when a component or service fails to perform its specified task at the appropriate time. This section discusses what constitutes failure of the SAP system in general terms:

        Standard failures

        Basic failure classification

        Single points of failure (SPOFs)

Standard Failures

The following factors leading to failure are common to all services:

        Hardware

Hardware includes the central processing unit (CPU), memory, network interface card (NIC), and so on. Different kinds of service might reside on physically different hardware, so the failure of a single machine can affect one or more SAP services. Hardware failure is a common cause of service failure.

        Operating system services

SAP services depend in turn on operating system services. If an operating system service fails, then so does any SAP service that depends on it. An example is the socket layer service, the failure of which affects the SAP message service.

        Software

As with any software, programming errors in software applications can lead to failure of an SAP service.
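The effect of a single machine failing, as described under Hardware above, can be illustrated with a minimal sketch. The host names and the assignment of services to hosts are hypothetical and chosen only for illustration; they do not come from any SAP configuration or API.

```python
# Hypothetical layout: which SAP services run on which host machine.
# Host names and assignments are illustrative assumptions only.
LAYOUT = {
    "host_a": ["database"],
    "host_b": ["enqueue", "message", "dialog"],
    "host_c": ["dialog", "update", "spool"],
}


def services_on(host, layout=LAYOUT):
    """Return the services that stop running on `host` if that machine fails."""
    return sorted(layout.get(host, []))


def services_lost(host, layout=LAYOUT):
    """Return the services that are unavailable everywhere after `host` fails."""
    # Services still running on at least one other host survive the failure.
    surviving = {s for h, svcs in layout.items() if h != host for s in svcs}
    return sorted(set(layout.get(host, [])) - surviving)
```

In this sketch, the failure of host_b stops three services on that machine, but the dialog service remains available because it also runs on host_c.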

Basic Failure Classification

The following graphic shows the categories in the SAP system for classifying failures:

[Graphic: Basic Failure Classification]

When thinking about fault-tolerance, you can look at the SAP system in the following ways:

        SAP system component failure

Here we divide the system into layers with their associated components, using the categories shown in the above graphic.

        SAP system service failure

This section discusses failure of the SAP system services in detail:

        How to detect failure

        The effects of failure

        How to recover from failure

To help you better understand SAP system service failure, see SAP System Service Communications.

Single Points of Failure

In a standard SAP system, the database, enqueue, and message services cannot be made redundant by configuring multiple instances of them on different host machines. They are therefore single points of failure (SPOFs). The remaining services (dialog, update, background, gateway, and spool) can all be configured redundantly, that is, on multiple host machines, to provide improved availability.
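The classification above can be expressed as a minimal sketch. The two service sets follow the text for a standard SAP system; the function names, the layout format, and the label "not yet redundant" are assumptions made for illustration.

```python
# Service classification as stated in the text for a standard SAP system.
SPOF_SERVICES = {"database", "enqueue", "message"}
REDUNDANT_CAPABLE = {"dialog", "update", "background", "gateway", "spool"}


def classify(service, host_count):
    """Classify one service given the number of hosts it is configured on."""
    if service in SPOF_SERVICES:
        return "SPOF"
    if service in REDUNDANT_CAPABLE:
        # Redundancy is possible, but only takes effect with two or more hosts.
        return "redundant" if host_count >= 2 else "not yet redundant"
    raise ValueError(f"unknown service: {service}")


def report(layout):
    """Map each service in a hypothetical host -> services layout to its class."""
    counts = {}
    for services in layout.values():
        for s in set(services):
            counts[s] = counts.get(s, 0) + 1
    return {s: classify(s, n) for s, n in counts.items()}
```

For example, a layout in which the spool service runs on only one host reports it as "not yet redundant", which indicates that availability could be improved by configuring the service on a second host.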

In a high availability SAP system, you can protect vulnerable services, such as the enqueue, message, and database services, by using, for example, cluster environments with switchover solutions. For more information, see:

        Cluster Technology

        Microsoft Cluster Server on Windows

        Switchover Software for High Availability

        Replicated Enqueue Server

Note

In an SAP installation, Network File System (NFS) (for UNIX-based application hosts) and shares (for Microsoft Windows NT-based application hosts) are SPOFs. Some installations use an Internet Domain Name Service (DNS). DNS is also a single point of failure.

Failure Recovery

Finally, see SAP System Failure Recovery for details of how SAP systems recover following failure:

        Automatic recovery of SAP processes

        Logon load balancing (prevents users logging on to a dialog host that has failed)

        HTTP load balancing with the SAP Web Dispatcher
