An SAP system failure occurs when a component or service fails to perform its specified task at the appropriate time. Here we look at SAP system failure in general terms, at the failure of individual SAP system services, and at recovery following failure.
This section discusses what constitutes failure of the SAP system in general terms:
· Standard failures
· Basic failure classification
· Single points of failure (SPOFs)
The following factors leading to failure are common to all services:
· Hardware
Hardware includes the central processing unit (CPU), memory, network interface card (NIC), and so on. The different kinds of service might reside on physically different hardware, so the failure of a single machine can affect one or more SAP services. Hardware failure is a common cause of SAP system failure.
· Operating system services
SAP services depend in turn on operating system services. If an operating system service fails, the SAP services that depend on it fail too. One example is the socket layer service, whose failure affects the SAP message service.
· Software applications
As with any software, programming errors in software applications can lead to failure of an SAP service.
The following graphic shows the categories in the SAP system for classifying failures:
Basic Failure Classification
When thinking about fault-tolerance, you can look at the SAP system in the following ways:
Here we divide the system into layers with their associated components, using the categories shown in the above graphic.
This section discusses failure of the SAP system services in detail:
· How to detect failure
· The effects of failure
· How to recover from failure
To help you better understand SAP system service failure, see SAP System Service Communications.
In a standard SAP system, the database, enqueue, and message services cannot be made redundant by configuring multiple instances of them on different host machines. They are therefore single points of failure (SPOFs). The remaining services (that is, dialog, update, background, gateway, and spool) can all be configured redundantly, in other words on multiple host machines, to provide improved availability.
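The distinction above can be illustrated with a minimal Python sketch. The `layout` input format, the host names, and the function name are hypothetical and chosen only for this illustration. The sketch also assumes that a redundancy-capable service configured on just one host counts as a single point of failure in practice, which is an inference from the text rather than something it states.

```python
# Service categories as listed in the text above.
NON_REDUNDANT = {"database", "enqueue", "message"}
REDUNDANT_CAPABLE = {"dialog", "update", "background", "gateway", "spool"}


def single_points_of_failure(layout):
    """Return the sorted names of services that are SPOFs in a given layout.

    layout maps a service name to the list of hosts running it
    (a hypothetical input format used only for this sketch).
    """
    spofs = set()
    for service, hosts in layout.items():
        if service in NON_REDUNDANT:
            # Cannot be made redundant in a standard system: always a SPOF.
            spofs.add(service)
        elif len(set(hosts)) < 2:
            # Assumption: a redundancy-capable service on fewer than two
            # distinct hosts is treated as a SPOF here.
            spofs.add(service)
    return sorted(spofs)


# Hypothetical example layout with three hosts.
example_layout = {
    "database": ["hostA"],
    "enqueue": ["hostA"],
    "message": ["hostA"],
    "dialog": ["hostB", "hostC"],
    "update": ["hostB", "hostC"],
    "background": ["hostB"],
    "gateway": ["hostB", "hostC"],
    "spool": ["hostC", "hostC"],  # two instances, but on the same host
}

print(single_points_of_failure(example_layout))
# ['background', 'database', 'enqueue', 'message', 'spool']
```

In this example the background and spool services are flagged only because they run on a single host. Adding a second host for each would leave the database, enqueue, and message services as the remaining SPOFs.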
In a high availability SAP system, you can protect vulnerable services, such as the enqueue, message, and database services, by using, for example, cluster environments with switchover solutions.
In an SAP installation, the Network File System (NFS) (for UNIX-based application hosts) and file shares (for Microsoft Windows NT-based application hosts) are SPOFs. Some installations use an Internet Domain Name Service (DNS), which is also a single point of failure.
Finally, see SAP System Failure Recovery for details of how SAP systems recover following failure:
· Automatic recovery of SAP processes
· Logon load balancing (prevents users logging on to a dialog host that has failed)
· HTTP load balancing with the SAP Web Dispatcher
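To illustrate how load balancing can keep users away from a failed dialog host, here is a minimal Python sketch. It is not the algorithm SAP uses. The host records, the `alive` and `users` fields, and the "fewest users" selection rule are all assumptions made for this example.

```python
def pick_logon_host(hosts):
    """Pick a dialog host for a new logon, skipping failed hosts.

    hosts is a list of dicts with the hypothetical keys
    'name', 'alive' (bool), and 'users' (current user count).
    """
    # Never offer a host that has failed.
    candidates = [h for h in hosts if h["alive"]]
    if not candidates:
        raise RuntimeError("no dialog host is available")
    # Assumed balancing rule: choose the host with the fewest users.
    return min(candidates, key=lambda h: h["users"])["name"]


# Hypothetical dialog hosts; app1 has failed.
dialog_hosts = [
    {"name": "app1", "alive": False, "users": 0},
    {"name": "app2", "alive": True, "users": 12},
    {"name": "app3", "alive": True, "users": 7},
]

print(pick_logon_host(dialog_hosts))  # app3
```

Although app1 has the fewest users, it is excluded because it is marked as failed, so the logon goes to app3.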