Running Heartbeat clusters in CRM configuration mode is the recommended approach as of Heartbeat release 2 (per the Linux-HA development team).
The Cluster Resource Manager automatically distributes the cluster configuration to all nodes; it need not be propagated manually.
CRM mode supports both node-level and resource-level monitoring, and configurable responses to both node and resource failure. It is still advisable to also monitor cluster resources using an external monitoring system.
CRM clusters support any number of resource groups, as opposed to Heartbeat R1-style clusters which only support two.
CRM clusters support a powerful (if complex) constraints framework. This enables you to enforce correct resource startup and shutdown order, to co-locate resources (forcing them to always run on the same physical node), and to set preferred nodes for particular resources.
Another advantage, namely the fact that CRM clusters support up to 255 nodes in a single cluster, is somewhat irrelevant for setups involving DRBD (DRBD itself being limited to two nodes).
Heartbeat CRM clusters are comparatively complex to configure and administer.
Extending Heartbeat's functionality with custom OCF resource agents is non-trivial.
This disadvantage is somewhat mitigated by the fact that you do have the option of using custom (or legacy) R1-style resource agents in CRM clusters.
In CRM clusters, Heartbeat keeps part of its configuration in the following configuration files:
/etc/ha.d/ha.cf, as described in the section called “The ha.cf file”. You must include the following line in this configuration file to enable CRM mode:
/etc/ha.d/authkeys. The contents of this file are the same as for R1 style clusters. See the section called “The authkeys file” for details.
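As a quick sanity check before starting Heartbeat, the following sketch tests whether a given ha.cf enables CRM mode. It assumes the enabling directive is spelled `crm yes`; the function name `crm_mode_enabled` is hypothetical, and the pattern should be adjusted if your installation uses a different spelling.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the ha.cf-style file given as $1
# contains an uncommented "crm yes" directive.
# Assumption: CRM mode is enabled by a line of the form "crm yes".
crm_mode_enabled() {
    grep -Eq '^[[:space:]]*crm[[:space:]]+yes([[:space:]]|$)' "$1"
}
```

For example, `crm_mode_enabled /etc/ha.d/ha.cf && echo "CRM mode enabled"` reports whether the directive is present on the local node.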
The remainder of the cluster configuration is maintained in the Cluster Information Base (CIB), covered in detail in the following section. Unlike the two configuration files listed above, the CIB need not be manually distributed among cluster nodes; the Heartbeat services take care of that automatically.
The Cluster Information Base (CIB) is kept in one XML file, /var/lib/heartbeat/crm/cib.xml. It is, however, not recommended to edit the contents of this file directly, except in the case of creating a new cluster configuration from scratch. Instead, Heartbeat comes with both command-line applications and a GUI to modify the CIB.
The CIB actually contains both the cluster configuration (which is persistent and is kept in the cib.xml file), and information about the current cluster status (which is volatile). Status information, too, may be queried using either the Heartbeat command-line tools or the Heartbeat GUI.
After creating a new Heartbeat CRM cluster (that is, creating the ha.cf and authkeys files, distributing them among cluster nodes, starting Heartbeat services, and waiting for nodes to establish intra-cluster communications), a new, empty CIB is created automatically. Its contents will be similar to this:
<cib>
  <configuration>
    <crm_config>
      <cluster_property_set id="cib-bootstrap-options">
        <attributes/>
      </cluster_property_set>
    </crm_config>
    <nodes>
      <node uname="alice" type="normal" id="f11899c3-ed6e-4e63-abae-b9af90c62283"/>
      <node uname="bob" type="normal" id="663bae4d-44a0-407f-ac14-389150407159"/>
    </nodes>
    <resources/>
    <constraints/>
  </configuration>
</cib>
The exact format and contents of this file are documented at length on the Linux-HA web site, but for practical purposes it is important to understand that this cluster has two nodes named alice and bob, and that neither resources nor resource constraints have been configured at this point.
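To confirm which nodes a CIB knows about without reading the XML by hand, a small text filter is often enough. The sketch below is a minimal illustration, not a full XML parser: it assumes node entries look like the `<node uname="..." .../>` elements shown above, and the function name `list_cib_nodes` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the uname of every <node .../> element
# in the CIB file given as $1, one name per line.
# The trailing space in '<node ' keeps the enclosing <nodes> tag from matching.
list_cib_nodes() {
    grep -o '<node [^>]*>' "$1" |
        sed -n 's/.*uname="\([^"]*\)".*/\1/p'
}
```

Run against a file containing the empty CIB shown above, `list_cib_nodes` prints `alice` and `bob` on separate lines.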
This section explains how to enable a DRBD-backed service in a Heartbeat CRM cluster. The examples used in this section mimic, in functionality, those described in the section called “Heartbeat resources”, dealing with R1-style Heartbeat clusters.
The complexity of the configuration steps described in this section may seem overwhelming to some, particularly those having previously dealt only with R1-style Heartbeat configurations. While the configuration of Heartbeat CRM clusters is indeed complex (and sometimes not very user-friendly), the CRM's advantages may outweigh those of R1-style clusters. Which approach to follow is entirely up to the administrator's discretion.
Even though you are using Heartbeat in CRM mode, you may still utilize R1-compatible resource agents such as drbddisk. This resource agent provides no secondary node monitoring, and ensures only resource promotion and demotion.
<group ordered="true" collocated="true" id="rg_mysql">
  <primitive class="heartbeat" type="drbddisk" provider="heartbeat" id="drbddisk_mysql">
    <meta_attributes>
      <attributes>
        <nvpair name="target_role" value="started"/>
      </attributes>
    </meta_attributes>
    <instance_attributes>
      <attributes>
        <nvpair name="1" value="mysql"/>
      </attributes>
    </instance_attributes>
  </primitive>
  <primitive class="ocf" type="Filesystem" provider="heartbeat" id="fs_mysql">
    <instance_attributes>
      <attributes>
        <nvpair name="device" value="/dev/drbd0"/>
        <nvpair name="directory" value="/var/lib/mysql"/>
        <nvpair name="type" value="ext3"/>
      </attributes>
    </instance_attributes>
  </primitive>
  <primitive class="ocf" type="IPaddr2" provider="heartbeat" id="ip_mysql">
    <instance_attributes>
      <attributes>
        <nvpair name="ip" value="192.168.42.1"/>
        <nvpair name="cidr_netmask" value="24"/>
        <nvpair name="nic" value="eth0"/>
      </attributes>
    </instance_attributes>
  </primitive>
  <primitive class="lsb" type="mysqld" provider="heartbeat" id="mysqld"/>
</group>
cibadmin -o resources -C -x
After this, Heartbeat will automatically propagate the newly-configured resource group to all cluster nodes.
The drbd resource agent is a “pure-bred” OCF RA which provides Master/Slave capability, allowing Heartbeat to start and monitor the DRBD resource on multiple nodes, promoting and demoting it as needed. You must, however, understand that the drbd RA disconnects and detaches all DRBD resources it manages on Heartbeat shutdown, and also upon enabling standby mode for a node.
In order to enable a DRBD-backed configuration for a MySQL database in a Heartbeat CRM cluster with the drbd OCF resource agent, you must create both the necessary resources, and Heartbeat constraints to ensure your service only starts on a previously promoted DRBD resource. It is recommended that you start with the constraints, such as shown in this example:
<constraints>
  <rsc_order id="mysql_after_drbd" from="rg_mysql" action="start" to="ms_drbd_mysql" to_action="promote" type="after"/>
  <rsc_colocation id="mysql_on_drbd" to="ms_drbd_mysql" to_role="master" from="rg_mysql" score="INFINITY"/>
</constraints>
Assuming you put these settings in a file named /tmp/constraints.xml, here is how you would enable them:
cibadmin -U -x /tmp/constraints.xml
<resources>
  <master_slave id="ms_drbd_mysql">
    <meta_attributes id="ms_drbd_mysql-meta_attributes">
      <attributes>
        <nvpair name="notify" value="yes"/>
        <nvpair name="globally_unique" value="false"/>
      </attributes>
    </meta_attributes>
    <primitive id="drbd_mysql" class="ocf" provider="heartbeat" type="drbd">
      <instance_attributes id="ms_drbd_mysql-instance_attributes">
        <attributes>
          <nvpair name="drbd_resource" value="mysql"/>
        </attributes>
      </instance_attributes>
      <operations id="ms_drbd_mysql-operations">
        <op id="ms_drbd_mysql-monitor-master" name="monitor" interval="29s" timeout="10s" role="Master"/>
        <op id="ms_drbd_mysql-monitor-slave" name="monitor" interval="30s" timeout="10s" role="Slave"/>
      </operations>
    </primitive>
  </master_slave>
  <group id="rg_mysql">
    <primitive class="ocf" type="Filesystem" provider="heartbeat" id="fs_mysql">
      <instance_attributes id="fs_mysql-instance_attributes">
        <attributes>
          <nvpair name="device" value="/dev/drbd0"/>
          <nvpair name="directory" value="/var/lib/mysql"/>
          <nvpair name="type" value="ext3"/>
        </attributes>
      </instance_attributes>
    </primitive>
    <primitive class="ocf" type="IPaddr2" provider="heartbeat" id="ip_mysql">
      <instance_attributes id="ip_mysql-instance_attributes">
        <attributes>
          <nvpair name="ip" value="10.9.42.1"/>
          <nvpair name="nic" value="eth0"/>
        </attributes>
      </instance_attributes>
    </primitive>
    <primitive class="lsb" type="mysqld" provider="heartbeat" id="mysqld"/>
  </group>
</resources>
Assuming you put these settings in a file named /tmp/resources.xml, here is how you would enable them:
cibadmin -U -x /tmp/resources.xml
After this, your configuration should be enabled. Heartbeat now selects a node on which it promotes the DRBD resource, and then starts the DRBD-backed resource group on that same node.
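Because the constraints refer to resources purely by ID (rg_mysql and ms_drbd_mysql in the examples above), a typo in either file can leave a constraint pointing at nothing. The following sketch is an offline consistency check under simple assumptions: it treats the files as text rather than parsing XML, looks only at `from="..."` and `to="..."` attributes, and uses a hypothetical function name, `missing_constraint_targets`.

```shell
#!/bin/sh
# Hypothetical helper: print every resource ID that the constraints file ($1)
# references via from="..." or to="..." but that has no matching id="..."
# in the resources file ($2). Prints nothing if all references resolve.
missing_constraint_targets() {
    constraints_file=$1
    resources_file=$2
    # The leading space keeps attributes such as to_action= or to_role=
    # from being mistaken for to=.
    grep -oE ' (from|to)="[^"]*"' "$constraints_file" |
        sed -e 's/^[^"]*"//' -e 's/"$//' |
        sort -u |
        while read -r rsc_id; do
            grep -qF " id=\"$rsc_id\"" "$resources_file" ||
                printf '%s\n' "$rsc_id"
        done
}
```

With the two example files from this section, `missing_constraint_targets /tmp/constraints.xml /tmp/resources.xml` should print nothing, since both referenced IDs are defined.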
A Heartbeat CRM cluster node may assume control of cluster resources in the following ways:
Manual takeover of a single cluster resource. This is the approach normally taken if one simply wishes to test resource migration, or move a resource to the local node as a means of manual load balancing. This operation is performed using the following command:
crm_resource -r resource -M -H `uname -n`
It is also important to understand that the migration is permanent: unless told otherwise, Heartbeat will not move the resource back to a node it was previously migrated away from, even if that node happens to be the only surviving node in a near-cluster-wide system failure. This is undesirable under most circumstances, so it is prudent to “un-migrate” resources immediately after a successful migration, using the following command:
Finally, it is important to know that during resource migration, Heartbeat may simultaneously migrate resources other than the one explicitly specified (as required by existing resource groups or colocation and order constraints).
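The takeover and “un-migrate” steps can be sketched as a pair of shell helpers. This is an illustration under stated assumptions, not a definitive procedure: it assumes the `crm_resource` options `-r` (resource), `-M` (migrate), `-H` (target host), and `-U` (un-migrate), uses the resource group ID rg_mysql from the earlier examples, and introduces the hypothetical names `run`, `take_over`, and `un_migrate`. By default the helpers only print the command they would execute; set RUN=1 to execute it.

```shell
#!/bin/sh
# Dry-run wrapper (hypothetical): print the command unless RUN=1 is set.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

# Migrate resource $1 to the local node.
take_over() {
    run crm_resource -r "$1" -M -H "$(uname -n)"
}

# Remove the migration again so Heartbeat may place resource $1 freely.
# In a live run, wait until the migration has completed before calling this.
un_migrate() {
    run crm_resource -r "$1" -U
}

take_over rg_mysql
un_migrate rg_mysql
```

Reviewing the printed commands before running with RUN=1 is a cheap way to avoid migrating the wrong resource.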
A Heartbeat CRM cluster node may be forced to give up one or all of its resources in several ways.
Giving up a single cluster resource. A node gives up control of a single resource when issued the following command (note that the considerations outlined in the previous section apply here, too):
If you want to migrate to a specific host, use this variant:
However, the latter syntax is usually of little relevance to CRM clusters using DRBD, DRBD being limited to two nodes (so the two variants are, essentially, identical in meaning).
Switching a cluster node to standby mode. This is the approach normally taken if one simply wishes to test resource migration, or perform some other activity that does not require the node to leave the cluster. This operation is performed using the following command:
crm_standby -U `uname -n` -v on
Shutting down the local cluster manager instance. This approach is suited for local maintenance operations such as software updates which require that the node be temporarily removed from the cluster, but which do not necessitate a system reboot. The procedure is the same as for Heartbeat R1 style clusters.
Shutting down the local node. For hardware maintenance or other interventions that require a system shutdown or reboot, use a simple graceful shutdown command, just as previously outlined for Heartbeat R1 style clusters.
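The first two approaches above, giving up a single resource and toggling standby mode, can be sketched in the same dry-run style. The sketch assumes that `crm_resource -r ... -M` without `-H` migrates a resource away from the local node, that adding `-H` names a target node, and that `crm_standby ... -v off` reverses `-v on`; the function names `run`, `give_up`, and `set_standby` are hypothetical.

```shell
#!/bin/sh
# Dry-run wrapper (hypothetical): print the command unless RUN=1 is set.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

# Give up resource $1; if $2 is set, migrate to that specific node.
give_up() {
    if [ -n "${2:-}" ]; then
        run crm_resource -r "$1" -M -H "$2"
    else
        run crm_resource -r "$1" -M
    fi
}

# Switch the local node into ("on") or out of ("off") standby mode.
set_standby() {
    run crm_standby -U "$(uname -n)" -v "$1"
}

give_up rg_mysql
give_up rg_mysql bob
set_standby on
set_standby off
```

In a two-node DRBD cluster, the two `give_up` calls have the same effect when run on alice, since bob is the only other candidate node.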