Problems mounting an OCFS2 filesystem under Linux – Transport endpoint is not connected

I had a problem mounting an OCFS2 filesystem on one node of a two-node cluster running Oracle Cluster Manager.

root@oracm1:~# mount -a
mount.ocfs2: Transport endpoint is not connected while mounting /dev/mapper/oravg01-lvu03 on /u03. Check 'dmesg' for more information on this error.

The error message said to check dmesg for more information, and there it was: the node was refusing the o2net handshake with the other node because the two nodes were configured with different network idle timeouts.

root@oracm1:~# dmesg
Buffer I/O error on device sdab, logical block 262143
(8290,0):o2net_check_handshake:1180 node oracm2 (num 1) at 192.168.2.101:7777 uses a network idle timeout of 10000 ms, but we use 30000 ms locally.  disconnecting
(8211,0):dlm_request_join:901 ERROR: status = -107
(8211,0):dlm_try_to_join_domain:1049 ERROR: status = -107
(8211,0):dlm_join_domain:1321 ERROR: status = -107
(8211,0):dlm_register_domain:1514 ERROR: status = -107
(8211,0):ocfs2_dlm_init:2024 ERROR: status = -107
(8211,0):ocfs2_mount_volume:1133 ERROR: status = -107
ocfs2: Unmounting device (253,7) on (node 0)

All of those status = -107 lines are errno 107 (ENOTCONN), the same "Transport endpoint is not connected" error that mount.ocfs2 reported. I logged into the other cluster node, oracm2, to compare the cluster status on both hosts: on oracm2 the heartbeat was active and the network idle timeout was 10000 ms.

root@oracm2:~# service o2cb status
Module "configfs": Loaded
Filesystem "configfs": Mounted
Module "ocfs2_nodemanager": Loaded
Module "ocfs2_dlm": Loaded
Module "ocfs2_dlmfs": Loaded
Filesystem "ocfs2_dlmfs": Mounted
Checking O2CB cluster ocfs2: Online
Heartbeat dead threshold: 61
Network idle timeout: 10000
Network keepalive delay: 5000
Network reconnect delay: 2000
Checking O2CB heartbeat: Active

root@oracm1:~# service o2cb status
Module "configfs": Loaded
Filesystem "configfs": Mounted
Module "ocfs2_nodemanager": Loaded
Module "ocfs2_dlm": Loaded
Module "ocfs2_dlmfs": Loaded
Filesystem "ocfs2_dlmfs": Mounted
Checking O2CB cluster ocfs2: Online
Heartbeat dead threshold: 61
Network idle timeout: 30000
Network keepalive delay: 5000
Network reconnect delay: 2000
Checking O2CB heartbeat: Not active
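
The two status outputs differ only in the network idle timeout; the heartbeat on oracm1 shows as "Not active" simply because no OCFS2 volume is mounted there yet. A quick way to spot this kind of mismatch is to compare the settings from one place; the loop below is just a convenience sketch and assumes password-less root ssh to both nodes:

for h in oracm1 oracm2; do
    echo "== $h =="
    ssh root@"$h" 'service o2cb status | grep -E "threshold|timeout|delay"'
done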

The host oracm1 needs to have exactly the same configuration as oracm2, so I invoked the o2cb script with the configure parameter to reconfigure the timeout.

root@oracm1:~# service o2cb configure
Configuring the O2CB driver.

This will configure the on-boot properties of the O2CB driver.
The following questions will determine whether the driver is loaded on
boot.  The current values will be shown in brackets ('[]').  Hitting
<ENTER> without typing an answer will keep that current value.  Ctrl-C
will abort.

Load O2CB driver on boot (y/n) [y]: y
Cluster to start on boot (Enter "none" to clear) [ocfs2]: ocfs2
Specify heartbeat dead threshold (>=7) [61]: 61
Specify network idle timeout in ms (>=5000) [30000]: 10000
Specify network keepalive delay in ms (>=1000) [5000]: 5000
Specify network reconnect delay in ms (>=2000) [2000]: 2000
Writing O2CB configuration: OK
O2CB cluster ocfs2 already online
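
The answers are written to the o2cb configuration file so that they persist across reboots. On this kind of install that file is usually /etc/sysconfig/o2cb (Debian-based systems use /etc/default/o2cb); the path and variable names below are assumptions based on the stock o2cb init script, so verify them on your own system.

# Expected contents of /etc/sysconfig/o2cb after the reconfiguration
# (assumed variable names, taken from the stock o2cb init script):
O2CB_ENABLED=true
O2CB_BOOTCLUSTER=ocfs2
O2CB_HEARTBEAT_THRESHOLD=61
O2CB_IDLE_TIMEOUT_MS=10000
O2CB_KEEPALIVE_DELAY_MS=5000
O2CB_RECONNECT_DELAY_MS=2000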

Since the configure step reported the cluster as already online, I restarted the o2cb service so that the new timeout would take effect:

root@oracm1:~# service o2cb stop
root@oracm1:~# service o2cb start
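
On kernels where o2cb exposes its cluster settings through configfs (as recent ocfs2 modules do), the value the running stack is now using can be double-checked directly; the path below assumes the cluster is named ocfs2, as it is here:

# Live idle timeout used by the running cluster stack; it should now report 10000.
cat /sys/kernel/config/cluster/ocfs2/idle_timeout_ms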

Then I checked that the node had rejoined the cluster and that /u03 was finally mounted.

root@oracm1:$ORACLE_HOME/oracm/log # grep "Successful reconfiguration" cm.log
Successful reconfiguration,  2 active node(s) node 1 is the master, my node num is 0 (reconfig 22) {Thu Jun 11 13:17:32 2009 }

root@oracm1:~# df -h /u03
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/oravg01-lvu03
                       30G   29G  1.1G  97% /u03
root@oracm1:~# mount | grep u03
/dev/mapper/oravg01-lvu03 on /u03 type ocfs2 (rw,_netdev,datavolume,nointr,heartbeat=local)
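
For completeness, the failing mount -a at the start was driven by an /etc/fstab entry along these lines. This is a hypothetical reconstruction based on the options reported by mount above (heartbeat=local is the mount.ocfs2 default and does not need to be listed), not a copy of the real file:

# Hypothetical /etc/fstab line matching the mount options shown above:
/dev/mapper/oravg01-lvu03  /u03  ocfs2  _netdev,datavolume,nointr  0  0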
