Managing BPM Clusters 
The BPM cluster is responsible for distributing and transporting the process instances between the BPM systems in the SAP NetWeaver Java system. The cluster ensures that all BPM systems know about each other and have the same consistent view of the BPM cluster.
The BPM cluster is self-managed and automatically recovers from inconsistencies when a BPM system joins or leaves the cluster. If the BPM cluster cannot recover from the inconsistencies on its own, you have to recover manually by restarting the BPM systems.
Inconsistencies can be caused by the following:
Restart of a BPM system
Restart of a server node (manual restart, out-of-memory)
Database problem
Communication issues between BPM systems
Network outage
Full garbage collection phase
Inconsistencies in the BPM cluster can lead to the following problems:
Cannot start new process instances
The BPM systems cannot distribute new process instances in the BPM cluster. Consequently, new process instances fail to start.
Cannot process tasks
The probability that the server node processing the task also hosts the process instance is very low, especially in a large cluster. Processing the task therefore requires a transport of the process instance, which fails.
Cannot administer processes and tasks
Administering tasks and processes requires loading the process. If the process is not hosted by the BPM system that handles the administrative request, a transport of the process instance is requested and fails.
You can detect inconsistencies in the BPM cluster using the following monitors and logs:
Consistency status monitor in the SAP Solution Manager
When a BPM system has an inconsistent view of the BPM cluster for more than 5 minutes, the monitor turns red and reports an inconsistency.
Default trace
When an inconsistency occurs, error messages containing ClusterIntegrityException appear in the default trace.
Process server monitors
The Cluster Entry Points monitor displays the view of the BPM cluster for the local BPM system. By comparing the monitors of the BPM systems, you can easily detect inconsistencies.
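The comparison of cluster views can be sketched as follows. This is only an illustration: it assumes that you have copied the list of cluster members shown by each Cluster Entry Points monitor into a dictionary by hand. The function name, the system names, and the choice of the most common view as the reference are all assumptions; the view shared by most systems is not necessarily the correct one.

```python
from collections import Counter


def find_divergent_views(views):
    """Return the BPM systems whose cluster view differs from the most common one.

    `views` maps a (hypothetical) BPM system name to the set of cluster
    members that its Cluster Entry Points monitor displays.
    """
    if not views:
        return {}
    # Assumption: use the view reported by the most systems as the reference.
    counts = Counter(frozenset(view) for view in views.values())
    reference, _ = counts.most_common(1)[0]

    divergent = {}
    for system, view in views.items():
        seen = frozenset(view)
        if seen != reference:
            divergent[system] = {
                "missing": sorted(reference - seen),
                "unexpected": sorted(seen - reference),
            }
    return divergent


# Hypothetical views collected from three BPM systems.
views = {
    "node_A": {"node_A", "node_B", "node_C"},
    "node_B": {"node_A", "node_B", "node_C"},
    "node_C": {"node_C"},
}
print(find_divergent_views(views))
# {'node_C': {'missing': ['node_A', 'node_B'], 'unexpected': []}}
```

An empty result means that all collected views agree.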
Recommendation
We recommend that you use the consistency status monitor in the SAP Solution Manager to detect inconsistencies.
Note
Not all detected inconsistencies in the BPM cluster are critical. Usually, the BPM cluster recovers from them within a few minutes. You need to intervene only if it does not.
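The rule of waiting before intervening can be sketched as a small tracker that reports an inconsistency only after it has persisted. This is a minimal illustration, not the logic of the SAP Solution Manager monitor: the class name, the status strings, and the way observations are fed in are hypothetical. Only the 5-minute threshold is taken from the monitor description above.

```python
from datetime import datetime, timedelta

# Threshold after which the consistency status monitor reports an inconsistency.
THRESHOLD = timedelta(minutes=5)


class InconsistencyTracker:
    """Hypothetical helper that tracks how long an inconsistency has lasted."""

    def __init__(self, threshold=THRESHOLD):
        self.threshold = threshold
        self.first_seen = None  # time at which the current inconsistency began

    def observe(self, consistent, now):
        """Record one observation and return 'consistent', 'wait' or 'intervene'."""
        if consistent:
            # The cluster recovered on its own; forget the earlier inconsistency.
            self.first_seen = None
            return "consistent"
        if self.first_seen is None:
            self.first_seen = now
        if now - self.first_seen > self.threshold:
            return "intervene"
        return "wait"


tracker = InconsistencyTracker()
start = datetime(2024, 1, 1, 12, 0)
print(tracker.observe(False, start))                         # wait
print(tracker.observe(False, start + timedelta(minutes=4)))  # wait
print(tracker.observe(False, start + timedelta(minutes=6)))  # intervene
print(tracker.observe(True, start + timedelta(minutes=7)))   # consistent
```

Resetting the timer on every consistent observation means that short, self-healing inconsistencies never reach the "intervene" state.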
The following procedure lets you recover from the inconsistencies manually without disruption, because at least one server node continues to run:
Stop all server nodes where the service com.sap.glx.core.svc is stopped.
Ensure that only one master BPM system exists:
If two or more master BPM systems exist, stop all but one of the corresponding server nodes.
If no master BPM system exists, shut down one server node and wait one to two minutes. If there is still no master BPM system, safely restart the entire SAP NetWeaver Java system.
Shut down all server nodes hosting BPM systems that are not visible to the master BPM system.
Ensure that the BPM cluster is consistent again.
Start the stopped server nodes one by one and check whether the BPM cluster remains consistent.
If the inconsistencies remain, safely restart the whole SAP NetWeaver Java system. For more information on how to start the SAP NetWeaver Java system, see Starting the BPM System.
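The order of the recovery steps above can be sketched as a decision function that returns the next step for a given cluster state. This is only an illustration under stated assumptions: the dictionary keys, the action names, and the helper `node` are hypothetical, and the state has to be gathered by hand from the monitors. The sketch does not call any SAP API and does not stop or start anything itself.

```python
def next_recovery_step(nodes):
    """Return (action, node_names) for the first step of the procedure that applies.

    Each node is a dict with these hypothetical keys:
    name, running, service_started, is_master, visible_to_master.
    """
    running = [n for n in nodes if n["running"]]

    # Stop server nodes on which the service com.sap.glx.core.svc is stopped.
    broken = [n["name"] for n in running if not n["service_started"]]
    if broken:
        return ("stop", broken)

    # Ensure that only one master BPM system exists.
    masters = [n["name"] for n in running if n["is_master"]]
    if len(masters) > 1:
        return ("stop", masters[1:])  # which master to keep is arbitrary here
    if not masters:
        # Shut down one node, wait one to two minutes, then check again by hand.
        return ("stop_one_and_wait", [n["name"] for n in running[:1]])

    # Shut down nodes whose BPM system is not visible to the master.
    hidden = [n["name"] for n in running
              if not n["is_master"] and not n["visible_to_master"]]
    if hidden:
        return ("stop", hidden)

    # The cluster looks consistent: start the stopped nodes one by one.
    stopped = [n["name"] for n in nodes if not n["running"]]
    if stopped:
        return ("start", stopped[:1])
    return ("done", [])


def node(name, running=True, service_started=True,
         is_master=False, visible_to_master=True):
    """Build one node record (hypothetical helper for this sketch)."""
    return {"name": name, "running": running,
            "service_started": service_started,
            "is_master": is_master, "visible_to_master": visible_to_master}


cluster = [node("n1", is_master=True),
           node("n2", is_master=True),
           node("n3", service_started=False)]
print(next_recovery_step(cluster))  # ('stop', ['n3'])
```

Call the function again after each manual step with the updated state. If it keeps returning "stop_one_and_wait", or the inconsistencies remain after all nodes are started, the procedure above calls for a safe restart of the whole SAP NetWeaver Java system.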