How to Add Fencing and Failover to a Cluster in Linux

Adding Fencing and Failover to a Cluster

In this article, we will discuss how to add fencing and failover to a cluster.

Fencing

Fencing isolates a malfunctioning server from the cluster in order to secure and protect the shared (synced) resources.

Failover

Failover transfers services and data to another server in the cluster when the active server fails, so that they remain available.


Adding a Fence Device to the Cluster Server

Execute the following two commands to enable fencing on the cluster server.

[root@server ~]# ccs -h 192.168.5.111 --setfencedaemon post_fail_delay=0
[root@server ~]# ccs -h 192.168.5.111 --setfencedaemon post_join_delay=10

Command Description:
-h → denotes the cluster host IP address.
--setfencedaemon → applies the changes to the fencing daemon.
post_fail_delay → time in seconds the daemon waits before fencing a victim server after a node has failed.
post_join_delay → time in seconds the daemon waits before fencing a victim server after a node has joined the cluster.
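
To confirm that the fence daemon settings were written to the configuration, you can print the current cluster.conf from the host. This is only a quick check, assuming the --getconf option is available in your version of ccs.

[root@server ~]# ccs -h 192.168.5.111 --getconf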

Execute the following command to add a fence device to the cluster.

[root@server ~]# ccs -h 192.168.5.111 --addfencedev linuxhelp agent=fence_virt
[root@server ~]# cd /etc/cluster/
[root@server cluster]# nano cluster.conf

After adding the fence device, the cluster.conf file appears as follows.

<?xml version="1.0"?>
<cluster config_version="14" name="cluster">
<fence_daemon post_join_delay="10"/>
<clusternodes>
<clusternode name="192.168.5.112" nodeid="1"/>
<clusternode name="192.168.5.113" nodeid="2"/>
</clusternodes>
<cman/>
<fencedevices>
<fencedevice agent="fence_virt" name="linuxhelp"/>
</fencedevices>
<rm>
<failoverdomains/>
<resources/>
</rm>
</cluster>
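
As an optional check, the fence devices currently defined in the configuration can be listed from the host; this assumes the --lsfencedev option is available in your version of ccs.

[root@server cluster]# ccs -h 192.168.5.111 --lsfencedev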

Execute the following command to list the fence agents available for creating a fence device.

[root@server cluster]# ccs -h 192.168.5.111 --lsfenceopts
fence_apc - Fence agent for APC over telnet/ssh
fence_apc_snmp - Fence agent for APC, Tripplite PDU over SNMP
fence_bladecenter - Fence agent for IBM BladeCenter
fence_bladecenter_snmp - Fence agent for IBM BladeCenter over SNMP
.
.
.
fence_vmware - Fence agent for VMWare
fence_vmware_soap - Fence agent for VMWare over SOAP API
fence_wti - Fence agent for WTI
fence_xvm - Fence agent for virtual machines
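
To see the parameters supported by a particular agent, for example the fence_virt agent used here, the same option can be given an agent name. This assumes your version of ccs supports querying a specific fence type.

[root@server cluster]# ccs -h 192.168.5.111 --lsfenceopts fence_virt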


Adding Two Nodes to the Fence Device

Execute the following commands to add a fence method to each node.

[root@server cluster]# ccs -h 192.168.5.111 --addmethod Method01 192.168.5.112
Method Method01 added to 192.168.5.112.
[root@server cluster]# ccs -h 192.168.5.111 --addmethod Method01 192.168.5.113
Method Method01 added to 192.168.5.113.
[root@server cluster]# nano cluster.conf

After adding the nodes to the fence device, the cluster.conf file appears as follows.

<?xml version="1.0"?>
<cluster config_version="14" name="cluster">
<fence_daemon post_join_delay="10"/>
<clusternodes>
<clusternode name="192.168.5.112" nodeid="1">
<fence>
<method name="Method01"/>
</fence>
</clusternode>
<clusternode name="192.168.5.113" nodeid="2">
<fence>
<method name="Method01"/>
</fence>
</clusternode>
</clusternodes>
<cman/>
<fencedevices>
<fencedevice agent="fence_virt" name="linuxhelp"/>
</fencedevices>
<rm>
<failoverdomains/>
<resources/>
</rm>
</cluster>

Next, add a fence instance for each of the two nodes, associating the fence device linuxhelp with the method created above.

[root@server cluster]# ccs -h 192.168.5.111 --addfenceinst linuxhelp 192.168.5.112 Method01
[root@server cluster]# ccs -h 192.168.5.111 --addfenceinst linuxhelp 192.168.5.113 Method01

The cluster.conf file now appears as shown below.

[root@server cluster]# nano cluster.conf
<?xml version="1.0"?>
<cluster config_version="14" name="cluster">
<fence_daemon post_join_delay="10"/>
<clusternodes>
<clusternode name="192.168.5.112" nodeid="1">
<fence>
<method name="Method01">
<device name="linuxhelp"/>
</method>
</fence>
</clusternode>
<clusternode name="192.168.5.113" nodeid="2">
<fence>
<method name="Method01">
<device name="linuxhelp"/>
</method>
</fence>
</clusternode>
</clusternodes>
<cman/>
<fencedevices>
<fencedevice agent="fence_virt" name="linuxhelp"/>
</fencedevices>
<rm>
<failoverdomains/>
<resources/>
</rm>
</cluster>

The fence device is now configured successfully.
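
Before moving on, the fencing configuration can optionally be tested by manually fencing one of the nodes from the cluster host. This is only a sketch, assuming the fence_node utility from the cman/fence packages is installed and that the victim node can safely be rebooted.

[root@server cluster]# fence_node 192.168.5.112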


Adding Failover to the Cluster Server

Adding two nodes to the failover domain

This step adds the two cluster nodes to a failover domain named linuxfail, with priorities 1 and 2. The domain itself must already exist; a sketch for creating it is shown below, followed by the node-addition commands.
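This is a minimal sketch for creating the failover domain, assuming the --addfailoverdomain option of ccs; the ordered flag corresponds to the ordered="1" attribute seen later in cluster.conf.

[root@server cluster]# ccs -h 192.168.5.111 --addfailoverdomain linuxfail ordered

With the domain in place, add the two nodes and their priorities: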

[root@server cluster]# ccs -h 192.168.5.111 --addfailoverdomainnode linuxfail 192.168.5.112 1
[root@server cluster]# ccs -h 192.168.5.111 --addfailoverdomainnode linuxfail 192.168.5.113 2

After adding the nodes, the cluster.conf file appears as shown below.

[root@server cluster]# nano cluster.conf
<?xml version="1.0"?>
<cluster config_version="14" name="cluster">
<fence_daemon post_join_delay="10"/>
<clusternodes>
<clusternode name="192.168.5.112" nodeid="1">
<fence>
<method name="Method01">
<device name="linuxhelp"/>
</method>
</fence>
</clusternode>
<clusternode name="192.168.5.113" nodeid="2">
<fence>
<method name="Method01">
<device name="linuxhelp"/>
</method>
</fence>
</clusternode>
</clusternodes>
<cman/>
<fencedevices>
<fencedevice agent="fence_virt" name="linuxhelp"/>
</fencedevices>
<rm>
<failoverdomains>
<failoverdomain name="linuxfail" nofailback="0" ordered="1" restricted="0">
<failoverdomainnode name="192.168.5.112" priority="1"/>
<failoverdomainnode name="192.168.5.113" priority="2"/>
</failoverdomain>
</failoverdomains>
<resources/>
</rm>
</cluster>
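
Finally, the updated configuration can be propagated to and activated on all cluster nodes, and the failover domain can be verified from the host. This is a sketch assuming the --sync, --activate and --lsfailoverdomain options are available in your version of ccs.

[root@server cluster]# ccs -h 192.168.5.111 --sync --activate
[root@server cluster]# ccs -h 192.168.5.111 --lsfailoverdomain
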
FAQ
Q
Can the shared filesystem exist on remote file server ?
A
Yes. The shared file system that stores the queue manager data and log should be accessible by the queue manager process.
Q
How can I rename my cluster?
A
If you have GFS partitions in your cluster, issue the following command to change their superblock to use the new cluster name: # gfs_tool sb /dev/vg_name/gfs1 table new_cluster_name:gfs1
Q
Do I need shared storage?
A
No. We can help manage it if you have some, but Pacemaker itself has no need for shared storage.
Q
What is Quorum Disk?
A
A quorum disk is a shared storage device used in cluster configurations. It acts like a small database that holds data about the clustered environment, and its duty is to inform the cluster about the status of its member nodes.
Q
Difference between Fencing and Failover?
A
Fencing isolates a malfunctioning node from the cluster so that it cannot corrupt shared resources, while failover transfers the failed node's services to a standby node. The transition from the active node to the standby is typically managed by a failover controller.