Category: Linux

Fencing agent fence_ilo incompatible with HP iLO3 in Red Hat Cluster

I have a two-node Red Hat Cluster running on HP ProLiant servers.

These servers have an HP iLO3 for out-of-band management.

The fence_ilo agent works only with iLO and iLO2:
Fence Device and Agent Information for Red Hat Enterprise Linux

For iLO3 you will need to use the fence_ipmilan agent with the settings recommended in this Red Hat document:
How do you configure the fence device agent information option for the HP ILO 3?

Checking WWPN for a Linux host

Here is a dual port HBA

root@linux:~ # lspci | grep -i fibre
06:00.0 Fibre Channel: Emulex Corporation Zephyr-X LightPulse Fibre Channel Host Adapter (rev 02)
06:00.1 Fibre Channel: Emulex Corporation Zephyr-X LightPulse Fibre Channel Host Adapter (rev 02)

Verifying the WWPN from the HBA

root@linux:~ # cat /sys/class/scsi_host/host0/device/fc_host\:host0/port_name
0x10000000c99f46b4
root@linux:~ # cat /sys/class/scsi_host/host1/device/fc_host\:host1/port_name
0x10000000c99f46b5

The WWPNs shown above are:
10:00:00:00:c9:9f:46:b4
10:00:00:00:c9:9f:46:b5
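
The conversion from the sysfs value to the colon-separated form can be scripted. This is a minimal sketch, not a definitive tool: the function names format_wwpn and list_wwpns are hypothetical, and the optional directory argument exists only so the loop can be pointed at a test directory instead of /sys/class/fc_host.

```shell
# Convert a sysfs port_name value (e.g. 0x10000000c99f46b4)
# into colon-separated WWPN notation.
format_wwpn() {
    # Drop the leading 0x, put a colon after every two hex digits,
    # then remove the trailing colon.
    printf '%s\n' "${1#0x}" | sed 's/../&:/g; s/:$//'
}

# Print the WWPN of every FC host found under an fc_host class directory.
# The directory argument is an assumption; it defaults to /sys/class/fc_host.
list_wwpns() {
    dir="${1:-/sys/class/fc_host}"
    for f in "$dir"/host*/port_name; do
        [ -r "$f" ] || continue
        format_wwpn "$(cat "$f")"
    done
}
```

Calling list_wwpns with no argument on the host above should print the two WWPNs in the same form as listed.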

Or you can use systool instead of reading the files under /sys by hand

root@linux:~ # systool -av -c fc_host
Class = "fc_host"

Class Device = "host0"
Class Device path = "/sys/class/fc_host/host0"
active_fc4s = "0x00 0x00 0x01 0x00 0x00 0x00 0x00 0x01 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 "
fabric_name = "0x100000051ef61a00"
issue_lip =
maxframe_size = "2048 bytes"
node_name = "0x20000000c99f46b4"
port_id = "0x648acd"
port_name = "0x10000000c99f46b4"
port_state = "Online"
port_type = "NPort (fabric via point-to-point)"
speed = "4 Gbit"
supported_classes = "Class 3"
supported_fc4s = "0x00 0x00 0x01 0x00 0x00 0x00 0x00 0x01 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 "
supported_speeds = "1 Gbit, 2 Gbit, 4 Gbit"
tgtid_bind_type = "wwpn (World Wide Port Name)"
uevent =

Device = "host0"
Device path = "/sys/devices/pci0000:00/0000:00:07.0/0000:06:00.0/host0"
uevent =

Class Device = "host1"
Class Device path = "/sys/class/fc_host/host1"
active_fc4s = "0x00 0x00 0x01 0x00 0x00 0x00 0x00 0x01 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 "
fabric_name = "0x1000000533021c00"
issue_lip =
maxframe_size = "2048 bytes"
node_name = "0x20000000c99f46b5"
port_id = "0xc88ac5"
port_name = "0x10000000c99f46b5"
port_state = "Online"
port_type = "NPort (fabric via point-to-point)"
speed = "4 Gbit"
supported_classes = "Class 3"
supported_fc4s = "0x00 0x00 0x01 0x00 0x00 0x00 0x00 0x01 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 "
supported_speeds = "1 Gbit, 2 Gbit, 4 Gbit"
tgtid_bind_type = "wwpn (World Wide Port Name)"
uevent =

Device = "host1"
Device path = "/sys/devices/pci0000:00/0000:00:07.0/0000:06:00.1/host1"
uevent =

Deleted file in Linux but didn’t reclaim space in filesystem

The space in the filesystem wasn't reclaimed because the deleted file was still held open by another process. The kernel frees the blocks only once the last open descriptor is closed or the file is truncated.

root@linux:~ # lsof /bkpcvrd | grep deleted
dsmc 21215 root 11r REG 253,9 4268860136 1097761 /bkpcvrd/pbh020/export/pbh020_20111001.dmp.gz (deleted)
dsmc 30379 root 8r REG 253,9 4268860136 1097761 /bkpcvrd/pbh020/export/pbh020_20111001.dmp.gz (deleted)
dsmc 32691 root 9r REG 253,9 4268860136 1097761 /bkpcvrd/pbh020/export/pbh020_20111001.dmp.gz (deleted)

Check the file descriptor directory of the first PID

root@linux:/proc/21215/fd # ls -l
total 12
l-wx------ 1 root root 64 2011-10-04 10:09 0 -> /dev/null
l-wx------ 1 root root 64 2011-10-04 10:09 1 -> /root/nohup.out
lrwx------ 1 root root 64 2011-10-04 10:09 10 -> socket:/[32923523]
lr-x------ 1 root root 64 2011-10-04 10:09 11 -> /bkpcvrd/pbh020/export/pbh020_20111001.dmp.gz (deleted)
l-wx------ 1 root root 64 2011-10-04 10:09 2 -> /dev/null
l-wx------ 1 root root 64 2011-10-04 10:09 3 -> /opt/tivoli/tsm/client/ba/bin/dsmerror.log
l-wx------ 1 root root 64 2011-10-04 10:09 4 -> /opt/tivoli/tsm/client/ba/bin/dsmsched.log
lrwx------ 1 root root 64 2011-10-04 10:09 5 -> socket:/[31319408]
lrwx------ 1 root root 64 2011-10-04 10:09 6 -> socket:/[31319409]
lrwx------ 1 root root 64 2011-10-04 10:09 7 -> socket:/[33998182]
lrwx------ 1 root root 64 2011-10-04 10:09 8 -> socket:/[33083683]
lr-x------ 1 root root 64 2011-10-04 10:09 9 -> /usr/oradata/orapbh020/tbd1/LENEL_DATA.D001

In the listing above, the link for file descriptor 11 ends with (deleted). Truncate the file through that descriptor by redirecting empty output to its number

root@linux:/proc/21215/fd # > 11
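
Finding the right descriptor number by eye gets tedious when several processes hold the file. As a rough sketch, the lsof output can be turned into /proc paths directly. The function name deleted_fd_paths is hypothetical, and the column positions (PID in column 2, FD in column 4) are assumed to match the lsof output shown above.

```shell
# Read lsof output on stdin and print the /proc path of every
# descriptor that still points at a deleted file.
deleted_fd_paths() {
    awk '/\(deleted\)$/ {
        fd = $4
        sub(/[^0-9]+$/, "", fd)   # strip the access mode suffix: 11r becomes 11
        print "/proc/" $2 "/fd/" fd
    }'
}
```

Usage would be lsof /bkpcvrd | deleted_fd_paths, and each printed path can then be truncated as root with : > path, which is the same operation as the > 11 step above.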

List the file descriptors of the other two PIDs

root@linux:/proc/30379/fd # ls -l
total 11
lrwx------ 1 root root 64 2011-10-04 10:09 0 -> /dev/console
l-wx------ 1 root root 64 2011-10-04 10:09 1 -> /dev/null
lrwx------ 1 root root 64 2011-10-04 10:09 10 -> socket:/[33912163]
lr-x------ 1 root root 64 2011-10-04 10:09 11 -> /usr/oradata/orapbh020/tbd1/LENEL_DATA.D001
l-wx------ 1 root root 64 2011-10-04 10:09 2 -> /dev/null
l-wx------ 1 root root 64 2011-10-04 10:09 3 -> /opt/tivoli/tsm/client/ba/bin/dsmerror.log
l-wx------ 1 root root 64 2011-10-04 10:09 4 -> /opt/tivoli/tsm/client/ba/bin/dsmsched.log
lrwx------ 1 root root 64 2011-10-04 10:09 5 -> socket:/[21055400]
lrwx------ 1 root root 64 2011-10-04 10:09 6 -> socket:/[31339242]
lrwx------ 1 root root 64 2011-10-04 10:09 7 -> socket:/[33078656]
lrwx------ 1 root root 64 2011-10-04 10:09 9 -> socket:/[32744402]

root@linux:/proc/32691/fd # ls -l
total 11
l-wx------ 1 root root 64 2011-10-04 10:09 0 -> /dev/null
l-wx------ 1 root root 64 2011-10-04 10:09 1 -> /opt/tivoli/tsm/client/ba/bin/nohup.out
lr-x------ 1 root root 64 2011-10-04 10:09 10 -> /bkpcvrd/pbh020/export/pbh020_20111002.dmp.gz
l-wx------ 1 root root 64 2011-10-04 10:09 2 -> /dev/null
l-wx------ 1 root root 64 2011-10-04 10:09 3 -> /opt/tivoli/tsm/client/ba/bin/dsmerror.log
l-wx------ 1 root root 64 2011-10-04 10:09 4 -> /opt/tivoli/tsm/client/ba/bin/dsmsched.log
lrwx------ 1 root root 64 2011-10-04 10:09 5 -> socket:/[33364746]
lrwx------ 1 root root 64 2011-10-04 10:09 6 -> socket:/[33364747]
lrwx------ 1 root root 64 2011-10-04 10:09 7 -> socket:/[33365181]
lrwx------ 1 root root 64 2011-10-04 10:09 8 -> socket:/[33365142]
lr-x------ 1 root root 64 2011-10-04 10:09 9 -> /bkpcvrd/pbh020/export/pbh020_20111003.dmp.gz

The space is now reclaimed

root@linux:~ # df -h /bkpcvrd
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/rootvg-bkpcvrdlv
17G 8.2G 7.3G 53% /bkpcvrd

List the open files in the filesystem again to confirm that none is left in the deleted state

root@linux:~ # lsof /bkpcvrd | grep deleted
root@linux:~ #

Which package contains rstatd for Red Hat Enterprise Linux 5.5

The rstatd service is in the rusers-server package

https://access.redhat.com/kb/docs/DOC-51169

After installing the package, set the service to start at boot

root@linux:~ # chkconfig rstatd on
root@linux:~ # chkconfig --list | grep rstat
rstatd 0:off 1:off 2:on 3:on 4:on 5:on 6:off
root@linux:~ # chkconfig rusersd on
root@linux:~ # chkconfig --list | grep rusers
rusersd 0:off 1:off 2:on 3:on 4:on 5:on 6:off

Start the services

root@linux:~ # service rstatd start
Starting rstat services: [ OK ]
root@linux:~ # service rusersd start
Starting rusers services: [ OK ]

Check the status

root@linux:~ # service rstatd status
rpc.rstatd (pid 2269) is running...
root@linux:~ # service rusersd status
rpc.rusersd (pid 8147 6266 3464) is running...

SUSE Linux 10 SP2: hwinfo freezes at braille.4.2: alva read data

I ran the hwinfo command and it hung for a long time on this message

root@suselinux10sp2:~ # hwinfo
> braille.4.2: alva read data

I checked the hwinfo package

root@suselinux10sp2:~ # rpm -qi hwinfo
Name : hwinfo Relocations: (not relocatable)
Version : 12.55 Vendor: SUSE LINUX Products GmbH, Nuernberg, Germany
Release : 0.3 Build Date: Wed 23 Apr 2008 08:10:20 PM BRT
Install Date: Fri 05 Sep 2008 08:11:38 AM BRT Build Host: janacek.suse.de
Group : Hardware/Other Source RPM: hwinfo-12.55-0.3.src.rpm
Size : 1828813 License: GPL v2 or later
Signature : DSA/SHA1, Wed 23 Apr 2008 08:13:01 PM BRT, Key ID a84edae89c800aca
Packager : http://bugs.opensuse.org
Summary : Hardware Library
Description :
A simple program that lists results from the hardware detection
library.
Distribution: SUSE Linux Enterprise 10 (X86-64)

And the distribution that I was using

root@suselinux10sp2:~ # cat /etc/*release
LSB_VERSION="core-2.0-noarch:core-3.0-noarch:core-2.0-x86_64:core-3.0-x86_64"
SUSE Linux Enterprise Server 10 (x86_64)
VERSION = 10
PATCHLEVEL = 2

Then I updated the hwinfo package with the one from the SUSE Linux 10 SP4 ISO (hwinfo-12.67-0.7.21.x86_64.rpm)

root@suselinux10sp2 # rpm -U hwinfo-12.67-0.7.21.x86_64.rpm
root@suselinux10sp2 #

hpacucli – Error: Another instance of ACU is already running (possibly a service)

If you receive this error message

root@linux:~ # /usr/sbin/hpacucli
HP Array Configuration Utility CLI 8.70-8.0
Detecting Controllers...

Error: Another instance of ACU is already running (possibly a service). Please
terminate the ACU application before running the ACU CLI. Press ENTER to
exit.

But there is no process running

root@linux:~ # ps -ef | grep -i acu
root 4805 32086 0 10:00 pts/0 00:00:00 grep -i acu

Delete all files in /opt/compaq/cpqacuxe/bld/locks to solve this problem

root@linux:/opt/compaq/cpqacuxe/bld/locks # ls
. .. CPQACU_MUTEX

On another occasion, on RHEL 5, I solved this problem by deleting the file /dev/shm/sem.hpacu.appLock

root@linux:~ # ls -l /dev/shm/sem.hpacu.appLock
-rw-r--r-- 1 root root 32 Aug 16 05:21 /dev/shm/sem.hpacu.appLock

root@linux:~ # rm /dev/shm/sem.hpacu.appLock
rm: remove regular file `/dev/shm/sem.hpacu.appLock'? y
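
Both cleanups can be combined into one guarded step. The following is only a sketch: the function name clear_stale_acu_locks and its optional arguments are hypothetical, and the process names checked (hpacucli and cpqacuxe) are an assumption about what a running ACU instance is called on your system. The two default paths are the ones mentioned above.

```shell
# Remove ACU lock files, but only when no ACU process appears to be running.
clear_stale_acu_locks() {
    lock_dir="${1:-/opt/compaq/cpqacuxe/bld/locks}"
    sem_file="${2:-/dev/shm/sem.hpacu.appLock}"

    # Assumed process names; adjust to whatever ps shows on your host.
    if pgrep -x hpacucli >/dev/null 2>&1 || pgrep -x cpqacuxe >/dev/null 2>&1; then
        echo "ACU is running; leaving lock files in place" >&2
        return 1
    fi

    # -f keeps rm quiet when a lock file does not exist.
    rm -f "$lock_dir"/* "$sem_file"
}
```

The guard matters because removing the locks while a real ACU instance is active would defeat the purpose of the lock.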

Checking LUN status in a HP Smart Array

To check a logical drive (LUN) created on an HP ProLiant Smart Array controller, you need the hpacucli package installed

root@linux:~ # rpm -qi hpacucli
Name        : hpacucli                     Relocations: (not relocatable)
Version     : 8.70                              Vendor: Hewlett-Packard Company
Release     : 8.0                           Build Date: Thu Dec  2 00:43:51 2010
Install date: Wed May 11 10:34:58 2011      Build Host: Prowl
Group       : Applications/System           Source RPM: hpacucli-8.70-8.0.src.rpm
Size        : 17788857                         License: See hpacucli.license
Signature   : (none)
Packager    : Hewlett-Packard Company
URL         : http://www.hp.com/linux
Summary     : HP Command Line Array Configuration Utility
Description :
The HP Command Line Array Configuration Utility is the disk
array configuration program for Array Controllers.
Distribution: (none)

If it is not installed, check which distribution version you're running

root@linux:~ # cat /etc/*release
SUSE LINUX Enterprise Server 9 (i586)
VERSION = 9
PATCHLEVEL = 3

Download the matching package from this link: http://h18000.www1.hp.com/products/servers/proliantstorage/software-management/acumatrix/index.html

Install the package

root@linux:~ # rpm -ivh hpacucli-8.70-8.0.noarch.rpm
Preparing...                ########################################### [100%]
1:hpacucli               ########################################### [100%]

And run hpacucli

root@linux:~ # hpacucli ctrl all show config
Smart Array 642 in Slot 3                 (sn: P92260YXQT80I8)
array A (Parallel SCSI, Unused Space: 0 MB)
logicaldrive 1 (279.4 GB, RAID 1, OK)
physicaldrive 2:0   (port 2:id 0 , Parallel SCSI, 300 GB, OK)
physicaldrive 2:1   (port 2:id 1 , Parallel SCSI, 300 GB, OK)
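
For monitoring, the same output can be filtered so that only drives needing attention are shown. This is a minimal sketch under one assumption: that every healthy logicaldrive and physicaldrive line ends with ", OK)" as in the output above. The function name check_drives is hypothetical.

```shell
# Read "hpacucli ctrl all show config" output on stdin and print every
# logicaldrive/physicaldrive line whose status is not OK.
# Returns 0 when all drives are OK and 1 otherwise.
check_drives() {
    awk '/(logical|physical)drive/ && !/, OK\)[ \t]*$/ { print; bad = 1 }
         END { exit bad + 0 }'
}
```

A cron job could run hpacucli ctrl all show config | check_drives and send a notification when the exit status is non-zero.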

Checking status of all controllers

root@linux:~ # hpacucli ctrl all show status
Smart Array 6i in Slot 0 (Embedded)
Controller Status: OK
Cache Status: OK

root@linux:~ # hpacucli ctrl all show config detail

Smart Array E200 in Slot 3
Bus Interface: PCI
Slot: 3
Serial Number: PA6C9%%BFTTEZI
Cache Serial Number: P9A3A0B9SUB9YB
RAID 6 (ADG) Status: Disabled
Controller Status: OK
Chassis Slot:
Hardware Revision: Rev A
Firmware Version: 1.82
Rebuild Priority: Medium
Expand Priority: Medium
Surface Scan Delay: 3 secs
Post Prompt Timeout: 15 secs
Cache Board Present: True
Cache Status: OK
Accelerator Ratio: 50% Read / 50% Write
Drive Write Cache: Disabled
Total Cache Size: 128 MB
Battery Pack Count: 1
Battery Status: OK
SATA NCQ Supported: False

Array: A
Interface Type: SAS
Unused Space: 0 MB
Status: OK

Logical Drive: 1
Size: 410.1 GB
Fault Tolerance: RAID 5
Heads: 255
Sectors Per Track: 32
Cylinders: 65535
Stripe Size: 64 KB
Status: OK
Array Accelerator: Enabled
Parity Initialization Status: Initialization Completed
Unique Identifier: 600508B100102542465454455A490012
Disk Name: /dev/cciss/c0d0
Mount Points: / 10.0 GB, swap 5.0 GB, /boot 513 MB
Logical Drive Label: A03C5226PA6C9%%BFTTEZI6260

physicaldrive 1I:1:1
Port: 1I
Box: 1
Bay: 1
Status: OK
Drive Type: Data Drive
Interface Type: SAS
Size: 146 GB
Rotational Speed: 10000
Firmware Revision: HPD6
Serial Number: BS05P880BW8H0834
Model: HP DG146BABCF
PHY Count: 2
PHY Transfer Rate: 3.0GBPS, Unknown
physicaldrive 1I:1:2
Port: 1I
Box: 1
Bay: 2
Status: OK
Drive Type: Data Drive
Interface Type: SAS
Size: 146 GB
Rotational Speed: 10000
Firmware Revision: HPDD
Serial Number: 3NM4P21D0000983193YT
Model: HP DG146ABAB4
PHY Count: 1
PHY Transfer Rate: 3.0GBPS
physicaldrive 1I:1:3
Port: 1I
Box: 1
Bay: 3
Status: OK
Drive Type: Data Drive
Interface Type: SAS
Size: 146 GB
Rotational Speed: 10000
Firmware Revision: HPD6
Serial Number: PCY2S4AE
Model: HP DG0146FARVU
PHY Count: 2
PHY Transfer Rate: 3.0GBPS, Unknown
physicaldrive 1I:1:4
Port: 1I
Box: 1
Bay: 4
Status: OK
Drive Type: Data Drive
Interface Type: SAS
Size: 146 GB
Rotational Speed: 10000
Firmware Revision: HPDD
Serial Number: 3NM4LJ4Y00009831MEQU
Model: HP DG146ABAB4
PHY Count: 1
PHY Transfer Rate: 3.0GBPS


Quick cheat sheet on how to use hpacucli, taken from http://www.datadisk.co.uk/html_docs/redhat/hpacucli.htm

Utility Keyword abbreviations

Abbreviations:
chassisname = ch
controller = ctrl
logicaldrive = ld
physicaldrive = pd
drivewritecache = dwc

hpacucli utility

hpacucli:
# hpacucli
# hpacucli help

Note: you can use the hpacucli command in a script

Controller Commands

Display (detailed):
hpacucli> ctrl all show config
hpacucli> ctrl all show config detail

Status:
hpacucli> ctrl all show status

Cache:
hpacucli> ctrl slot=0 modify dwc=disable
hpacucli> ctrl slot=0 modify dwc=enable

Rescan:
hpacucli> rescan
Note: detects newly added devices since the last rescan

Physical Drive Commands

Display (detailed):
hpacucli> ctrl slot=0 pd all show
hpacucli> ctrl slot=0 pd 2:3 show detail
Note: you can obtain the slot number by displaying the controller configuration (see above)

Status:
hpacucli> ctrl slot=0 pd all show status
hpacucli> ctrl slot=0 pd 2:3 show status

Erase:
hpacucli> ctrl slot=0 pd 2:3 modify erase

Blink disk LED:
hpacucli> ctrl slot=0 pd 2:3 modify led=on
hpacucli> ctrl slot=0 pd 2:3 modify led=off

Logical Drive Commands

Display (detailed):
hpacucli> ctrl slot=0 ld all show [detail]
hpacucli> ctrl slot=0 ld 4 show [detail]

Status:
hpacucli> ctrl slot=0 ld all show status
hpacucli> ctrl slot=0 ld 4 show status

Blink disk LED:
hpacucli> ctrl slot=0 ld 4 modify led=on
hpacucli> ctrl slot=0 ld 4 modify led=off

Re-enable failed drive:
hpacucli> ctrl slot=0 ld 4 modify reenable forced

Create:
# logical drive, one disk
hpacucli> ctrl slot=0 create type=ld drives=1:12 raid=0
# logical drive, mirrored
hpacucli> ctrl slot=0 create type=ld drives=1:13,1:14 size=300 raid=1
# logical drive, raid 5
hpacucli> ctrl slot=0 create type=ld drives=1:13,1:14,1:15,1:16,1:17 raid=5
Note:
drives: specific drives, all drives or unassigned drives
size: size of the logical drive in MB
raid: RAID level, one of 0, 1, 1+0 and 5

Remove:
hpacucli> ctrl slot=0 ld 4 delete

Expanding:
hpacucli> ctrl slot=0 ld 4 add drives=2:3

Extending:
hpacucli> ctrl slot=0 ld 4 modify size=500 forced

Spare:
hpacucli> ctrl slot=0 array all add spares=1:5,1:7

Skip or force fsck when rebooting a Linux server

To skip fsck on the next boot, reboot with the -f option. The system creates the file /fastboot, which tells the boot scripts to skip the check

root@linux:~ # shutdown -rf now

or

root@linux:~ # touch /fastboot
root@linux:~ # shutdown -r now

You can also add the fastboot argument to the kernel line at the GRUB prompt to skip fsck on boot

grub> kernel /vmlinuz-2.6.16.60-0.66.1-smp root=/dev/rootvg/rootlv vga=0x317 resume=/dev/rootvg/swaplv splash=silent showopts fastboot rootdelay=10

To force an fsck on boot, reboot with the -F option or create the file /forcefsck

root@linux:~ # shutdown -rF now

or

root@linux:~ # touch /forcefsck
root@linux:~ # shutdown -r now

Linux – Password has been used already. Choose another

root@linux:~ # passwd emerson
Changing password for emerson.
New Password:
Reenter New Password:
Password has been used already. Choose another.
Password changed

PAM keeps the old passwords stored in /etc/security/opasswd. Delete the line for the user whose password you are trying to change.

You can also check the file /etc/pam.d/common-password and look for a line with the remember parameter, which sets how many old passwords are kept for each user.

password required pam_pwhistory.so use_authtok remember=6 retry=3
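
Removing the user's line from the history file can be done without opening an editor. This is a minimal sketch, assuming each line of the file starts with the user name followed by a colon. The function name forget_password_history and the optional file argument are hypothetical; the default path is the one given above.

```shell
# Remove one user's entry from the password history file.
# Usage: forget_password_history USER [FILE]
forget_password_history() {
    user="$1"
    file="${2:-/etc/security/opasswd}"

    # Keep a backup, then rewrite the file in place so that its
    # owner and permissions stay as they were.
    cp -p "$file" "$file.bak" &&
        awk -F: -v u="$user" '$1 != u' "$file.bak" > "$file"
}
```

The comparison is an exact match on the first field, so removing emerson leaves a user such as emerson2 untouched. Run it as root, for example forget_password_history emerson, and delete the .bak copy once the password change works.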

End of life information about HP-UX, Solaris, AIX and Linux

If you need to know whether a release of a Unix operating system is still supported by the vendor, check these links for information

End of life information about HP-UX (PDF)

End of life information about Solaris

End of life information about AIX

End of life information about Red Hat Enterprise Linux

End of life information about Suse Linux Enterprise