
General Checklist

 

This checklist covers the questions from Important Questions About Your Setup, with cross-references to other sections in this documentation. Because there are so many complex and interrelated dependencies to consider, it is difficult to define reasonable thresholds that trigger specific recommendations. Many of the formulations are therefore general, and you need to adapt them to your particular installation.

Note

Risk analysis with checklists

Before you start to implement high availability in your systems, we strongly advise you to try to quantify which of the items listed in this general checklist and in the potential single point of failure (SPOF) checklist are relevant for you. First, rank the vulnerable aspects of your system according to the costs of failure (for example, whether you need to call in an external engineer, order a replacement part, and so on). Then you can start improving availability.

End of the note.
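The ranking step described in the note can be sketched as a small calculation. This is only an illustration: the component names, failure rates, and cost figures below are hypothetical placeholders, and ranking by expected annual cost (failures per year multiplied by cost per failure) is one possible criterion, not a method prescribed by this documentation.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    """One vulnerable aspect of the system (all figures are assumed estimates)."""
    name: str
    failures_per_year: float
    hours_down_per_failure: float
    cost_per_hour: float        # business cost of one hour of downtime
    fixed_cost: float = 0.0     # e.g. external engineer, replacement part

    @property
    def expected_annual_cost(self) -> float:
        per_failure = self.hours_down_per_failure * self.cost_per_hour + self.fixed_cost
        return self.failures_per_year * per_failure


def rank_risks(risks):
    """Return the risks sorted from most to least costly per year."""
    return sorted(risks, key=lambda r: r.expected_annual_cost, reverse=True)


# Hypothetical sample data; replace with estimates for your own installation.
risks = [
    Risk("power", failures_per_year=2, hours_down_per_failure=1, cost_per_hour=1000),
    Risk("disk", failures_per_year=0.5, hours_down_per_failure=8, cost_per_hour=1000, fixed_cost=500),
    Risk("network switch", failures_per_year=0.2, hours_down_per_failure=4, cost_per_hour=1000, fixed_cost=300),
]

ranked = rank_risks(risks)
for risk in ranked:
    print(f"{risk.name}: {risk.expected_annual_cost:.0f} per year")
```

With these sample figures the disk subsystem ranks first, which matches the advice later in this checklist to start with disk technology. Your own estimates might produce a different order.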

Process

  1. You consider how much system uptime your business requires:

  2. You consider how much system downtime your business can tolerate until there is only a minor business effect:

    • Several hours or one business day

      • Most likely no special measures necessary for protection against hardware or operating system failure

      • Standard recovery procedures for databases are most likely sufficient

    • Less than above

  3. You consider how much system downtime your business can tolerate until there is a major business effect (for example, loss of business):

    If your business can only tolerate a few hours of downtime:

    • You must make sure that a restore or recovery can be completed in the time available. For more information, see (listed in alphabetical order):

      Recovery with DB2 for i

      Backup, Recovery, and Upgrade with DB2 for LUW

      Recovery with DB2 for z/OS

      Recovery with MS SQL Server

      Recovery with Oracle

      Recovery with SAP MaxDB

    • If this cannot be guaranteed, you need to review your disk technology (see the next point).

    • Redundancy for hardware components becomes important, so evaluate the use of:

      • Special disk technology (for example, consider disk mirroring, RAID, or LVM). Disks are among the most vulnerable of all hardware components so it makes sense to start with them.

      • Redundant network components.

      • Cluster CPUs with switchover solutions to protect the database server and/or the central application server. See Switchover Software for High Availability.

      • An Uninterruptible Power Supply (UPS). This is inexpensive and worth considering.

    • If your database is Oracle or DB2 for z/OS, an alternative to switchover software is to use Replicated Database Servers together with the DB reconnect feature. For more information on DB reconnect, see DB Reconnect (AS ABAP) and DB Reconnect (AS Java).

    • More than one node should be available as an application server. You also need to prepare at least two nodes to act as the central application server after a manual reconfiguration. This means that, if the node where the central application server is running becomes unavailable, another node is ready to start the central application server.

  4. You consider how much system downtime your business can tolerate until it faces collapse:

    • Evaluate your system for potential single points of failure. Even if the system is 90% equipped for high availability (for example, mirrored disks, redundant network components, and so on), this is worthless if one of the components in the unprotected remaining 10% fails.

    • If replacing critical hardware components or reconfiguring critical software components is likely to take so much time that your entire business is at risk, you need to seriously consider Disaster Recovery using a backup site.

  5. You consider whether your business has periods with special availability requirements.

    If there are critical periods of an application that cannot be interrupted at all or where an interruption can be tolerated for less than a couple of minutes only, you might need to take extra precautions. Consider the following:

  6. You consider factors concerning your installation:

    • Age of system

      The approach you need to take depends on whether a new hardware system is being installed with the SAP system or whether it is to be installed with existing hardware:

      • If a new system is being set up, you should evaluate the high availability requirements at an early stage, and design the new system accordingly.

      • If the SAP system is being installed on an existing system, you need to investigate the system for weak points.

    • Evaluate installation options

      You evaluate the installation options for application servers and the database server. Depending on the desired installation and high availability requirements, you need to carefully consider the mapping of SAP system services since this might not be a straightforward task.

    • Expected Data Volume

      The database setup needs to be carefully planned:

      For more information about backup and recovery of databases, see Database High Availability.

    • Expected Transaction Load

      The transaction load influences the installation options for application servers and the database servers.

  7. You consider your internal resources.

    • Budget available to finance improvements.

      Most companies need to make improvements in the following order, according to financial constraints:

      • Disk technology

      • Uninterruptible Power Supply (UPS)

      • Switchover Software for database and central application hosts

    • Availability of qualified personnel

      The level of qualified personnel available to monitor the system during “normal operation” hours might influence the level of redundancy you choose:

      • Qualified personnel not always available

        Certain technologies, such as disk technology, switchover solutions, and redundant network components, reduce the need to have personnel available to handle errors.

      • Qualified personnel always available

        You can rely on your personnel to handle errors.

    Be sure to have the appropriate processes in place to get your staff involved in the event of an error (for example, hotline, on-call, standby).

  8. You consider your external resources.

    • Support contracts

      The level of support contracts in place might influence your approach to high availability:

      • “Special” maintenance contracts in place

        Do you have such contracts with hardware and software vendors, for example, a guaranteed replacement of faulty hardware components within 24 hours? If so, you might choose not to have a disaster recovery site or to implement less comprehensive redundant hardware, such as disk technology, networks, and so on.

      • “Standard” maintenance contracts in place

        If your maintenance contracts are only standard, you might choose to have a higher level of availability to cover gaps in maintenance. Then you might choose to set up a disaster recovery site and to implement more comprehensive redundant hardware, such as disk technology, networks, and so on.

    • Access to your system for remote support and maintenance

      Before implementation, consider using the GoingLive service. For proactive and highly qualified SAP system administration, consider using the EarlyWatch service to prevent problems from arising. SAP provides both services.

      For more information about GoingLive and EarlyWatch, see SAP Safeguarding.

  9. You consider environmental or other factors.

    Examples of these factors are:

    • Unstable power supply in your area

      Consider using Uninterruptible Power Supply (UPS) or redundant power suppliers.

    • Likelihood of disaster such as earthquake

      Consider using a disaster recovery site.

    • High temperature

      Consider using a reliable air-conditioning facility.

    • Switch from summer to winter time

      Consider using the DST safe kernel.
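Several steps in this checklist come down to simple arithmetic: translating an uptime requirement into a downtime budget, checking whether a restore fits into the tolerated downtime, and seeing how a single unprotected component limits the whole system. The following is a minimal sketch under stated assumptions: a 365-day year with round-the-clock operation, a steady restore throughput, and independent components that must all work for the system to be available. The function names and sample figures are hypothetical and do not come from any SAP tool.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: non-leap year, 24x7 operation


def allowed_downtime_hours(availability_percent: float) -> float:
    """Downtime budget per year for a given uptime requirement."""
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


def restore_fits(data_gb: float, throughput_gb_per_hour: float,
                 tolerated_hours: float) -> bool:
    """True if a restore at the assumed steady throughput fits the tolerated downtime."""
    return data_gb / throughput_gb_per_hour <= tolerated_hours


def serial_availability(availabilities) -> float:
    """Availability of a chain in which every component is required.

    Assumes failures are independent, so the availabilities multiply.
    """
    total = 1.0
    for a in availabilities:
        total *= a
    return total


# Hypothetical sample figures.
budget = allowed_downtime_hours(99.0)
print(f"99% uptime allows about {budget:.1f} hours of downtime per year")

print("2000 GB at 400 GB/h within 4 hours:", restore_fits(2000, 400, 4))

# Two well-protected components and one unprotected component.
chain = serial_availability([0.9999, 0.9999, 0.95])
print(f"Overall availability of the chain: {chain:.4f}")
```

The last calculation illustrates the point made about single points of failure: the overall figure can never exceed that of the weakest required component, however well the rest of the system is protected.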