Pacemaker beginner “tips”

This post summarizes answers to questions I asked myself as a linux-ha beginner.

Two guides help with the concepts and syntax: “Clusters from Scratch” [1] and “Cluster configuration explained” [2]. Still, some subtleties were hard for me to find or understand, so I decided to share my limited experience. The #linux-ha channel on Freenode IRC is also a good place to ask for help.

  1. About the ocf:pacemaker:ping resource: to monitor the score the ping resource assigns to each node, you can use:
    • cibadmin -Q | grep pingd | grep value
    • crm_mon -Arf1 | grep ping
  2. To avoid moving resources when a ping node shared by all cluster nodes becomes unreachable, you might want to have
    dampen >= 2 * the monitor interval of the ping resource. Read [2] for an explanation of dampen.
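
As a rough illustration of the rule in tip 2, the sketch below derives a dampen value from a monitor interval and prints a matching ping primitive in crm shell syntax. The resource id ping-gw, the address 192.0.2.1 and the multiplier are made-up placeholders, not values from a real cluster.

```shell
#!/bin/sh
# Smallest dampen (in seconds) that satisfies dampen >= 2 * monitor interval.
min_dampen() {
    echo $(( 2 * $1 ))
}

# Print a ping primitive for the given monitor interval (in seconds).
# "ping-gw", 192.0.2.1 and multiplier=100 are hypothetical placeholders.
ping_primitive() {
    interval=$1
    dampen=$(min_dampen "$interval")
    cat <<EOF
primitive ping-gw ocf:pacemaker:ping \\
    params host_list="192.0.2.1" multiplier="100" dampen="${dampen}s" \\
    op monitor interval="${interval}s"
EOF
}

# A 10s monitor interval gives a dampen of 20s.
ping_primitive 10
```

The printed text is only a starting point: review it, adapt the host list, then paste it into crm configure.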

  3. Location constraints based on connectivity must reference the value of the ocf:pacemaker:ping resource’s name parameter, not the primitive id. Most howtos that create a ping resource set only the primitive’s id and leave the name parameter out (reminder: primitive id class:provider:type params name=foo host_list=...). When name is not set, you have to use the default name for an ocf:pacemaker:ping resource, which is pingd:
    location IPHA-on-connected-node IPHA \
        rule $id="IPHA-on-connected-node-rule" pingd: defined pingd

    This constraint (with a score of pingd: instead of +/-INF:) is explained in a good blog entry [3] that summarizes ping scoring behavior, syntax and formula. Read it to understand ping scoring.
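
To make the name-versus-id distinction in tip 3 concrete, here is a small sketch that prints the location constraint for a given attribute name and falls back to pingd when none is given. The function name connectivity_rule and the attribute name connectivity are invented for the example, and the rule's $id is left out on the assumption that the crm shell generates one.

```shell
#!/bin/sh
# Print a connectivity-based location constraint in crm shell syntax.
#   $1: resource to place (for example IPHA)
#   $2: value of the ping resource's "name" parameter; defaults to pingd
connectivity_rule() {
    rsc=$1
    attr=${2:-pingd}
    cat <<EOF
location ${rsc}-on-connected-node ${rsc} \\
    rule ${attr}: defined ${attr}
EOF
}

# Ping resource created without a name parameter: the attribute is "pingd".
connectivity_rule IPHA
# Ping resource created with params name="connectivity" (hypothetical name).
connectivity_rule IPHA connectivity
```

Either way, the primitive id of the ping resource never appears in the rule.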

  4. If you want to receive SNMP traps whenever a resource changes state, you should create an ocf:heartbeat:ClusterMon resource which runs crm_mon in the background:
    primitive SNMPMonitor ocf:heartbeat:ClusterMon \
        params pidfile="/var/run/" extra_options="-S -C public" \
        op monitor on-fail="restart" interval="10s"

    Beware: since pacemaker-cli-1.1.6-1, SNMP support is no longer built into crm_mon by default. See sudo rpm -q --changelog pacemaker-cli | less and read my solution here

  5. The <op> tag defines parameters for operations performed by the cluster, such as starting or stopping a resource. E.g., you can tell pacemaker that one of your resources takes a long time to start using <op start timeout="3min" ...> (the same goes for stop, of course). If you don’t, pacemaker will decide that your resource has failed as soon as the default built-in timeout for the start operation expires (see tip 6 below for a concrete example). Finally, the interval parameter is only meaningful for recurring operations, the only one right now being monitor:
    primitive firewall lsb:my-complex-firewall-initscript \
        op monitor on-fail="restart" interval="10s" \
        op start interval="0" timeout="3min" \
        op stop interval="0" timeout="1min" \
        meta target-role="Started"
  6. Prior to CentOS 6.2 (I haven’t been able to find the BZ#id in the release notes…), there is an unneeded and buggy check in the shell code of the ip_stop() function in ocf:heartbeat:IPaddr (/usr/lib/ocf/resource.d/heartbeat/IPaddr).
    When stopping such a resource, before deleting the alias, the command if route | grep $IP ; then ... will break your cluster in two cases: your node has a really big local routing table (BGP?), or no DNS resolver is reachable.
    The failure happens because route then takes more than 20 seconds, which is the default timeout for a stop action. The resource gets an INFINITY failcount and becomes unmanaged. If the stop is part of a bigger shutdown process, the shutdown breaks there and the other node(s) won’t be able to pick up the resources: fencing occurs.
    While fixing this issue, route was first replaced with route -n, which is much faster because it skips DNS lookups but can still need more than 20 seconds on a huge table (for example, a BGP router can have up to 350,000 routes). Finally, the check was removed entirely: problem solved. So you can either update to 6.2, patch your IPaddr shell script, take the patch from GitHub, or move to IPaddr2, god damn it! =)
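
To see whether a node is exposed to the problem in tip 6, you can time the command against the 20 second stop timeout. The sketch below is only an assumption about how to check this, not part of the IPaddr agent; the helper name finishes_within is made up, and it relies on timeout(1) from GNU coreutils.

```shell
#!/bin/sh
# Return 0 if the command finishes within the limit (in seconds), 1 otherwise.
# timeout(1) exits with status 124 when it has to kill the command.
finishes_within() {
    limit=$1
    shift
    status=0
    timeout "$limit" "$@" >/dev/null 2>&1 || status=$?
    [ "$status" -ne 124 ]
}

# 20 seconds is the default stop timeout mentioned in tip 6.
if ! command -v route >/dev/null 2>&1; then
    echo "route is not installed on this node"
elif finishes_within 20 route -n; then
    echo "route -n finished within 20s"
else
    echo "route -n needs more than 20s on this node"
fi
```

Replacing route -n with plain route in the last test shows the effect of an unreachable DNS resolver.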

  8. Virtual IPs created with an ocf:heartbeat:IPaddr2 resource aren’t visible in ifconfig. That’s intended; you’ll have to start using /sbin/ip. E.g. ip addr ls eth0 shows all eth0 aliases.
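
If you only want the addresses, the one-line output of ip is easy to filter. This is a minimal sketch; the helper name list_ipv4 is invented, and it assumes the usual ip -o layout where the third field is the address family and the fourth the address.

```shell
#!/bin/sh
# Print every IPv4 address found in `ip -o addr show` output read from stdin.
# In the one-line (-o) format, field 3 is the family and field 4 the address.
list_ipv4() {
    awk '$3 == "inet" { print $4 }'
}

# Only query the live system when ip(8) is available.
# Add "dev eth0" after "show" to restrict the listing to one interface.
if command -v ip >/dev/null 2>&1; then
    ip -o addr show | list_ipv4
fi
```

Addresses added by IPaddr2 show up in this list next to the primary address of the interface.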

  10. By default, when a failed node comes back online, it claims back its old resources, meaning they are moved again. You can avoid this by setting a non-zero resource-stickiness.
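
One way to set a cluster-wide default stickiness is through rsc_defaults in the crm shell. The wrapper below is a sketch: the function name set_default_stickiness and the value 100 are arbitrary, and it only accepts positive integers even though the cluster understands other values such as INFINITY.

```shell
#!/bin/sh
# Set a cluster-wide default resource-stickiness through the crm shell.
# Rejects anything that is not a positive integer, since a stickiness of 0
# would not keep resources where they are.
set_default_stickiness() {
    case $1 in
        ''|*[!0-9]*) value_ok=no ;;
        *) value_ok=yes ;;
    esac
    if [ "$value_ok" = no ] || [ "$1" -eq 0 ]; then
        echo "stickiness must be a positive integer" >&2
        return 1
    fi
    crm configure rsc_defaults resource-stickiness="$1"
}

# Example (100 is an arbitrary value; run it on a cluster node):
#   set_default_stickiness 100
```

The higher the value compared with your location scores, the less likely a resource is to move back after a node recovers.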


3 comments so far.

  1. “Two mutually exclusive nodes cluster architectures” — er, what? How are Pacemaker and Corosync mutually exclusive? It’s rather quite the contrary, they complement each other.

    • I meant active-passive architecture, wrong words I guess ;)
      Anyway, the old version of this post has been replaced, since it was no longer of any use to me and had too many flaws in the configuration.

  2. Hi!

    There’s an option in ocf:heartbeat:IPaddr2 that makes the IP show up in ifconfig: iflabel.

    I wrote a little about it here:

    Thanks for your tips!!
