Author: Emerson .

UXMON MODULE got block

Node : solaris.setaoffice.com
Node Type : Sun SPARC (HTTPS)
Severity : warning
OM Server Time: 2015-12-12 23:27:48
Message : UXMON MODULE dfmon got block
Msg Group : OS
Application : uxmon
Object : uxmon
Event Type :
not_found

Instance Name :
not_found

Instruction : This message informs about a uxmon module that found an OS command
that get blocked. This cause that module not able to complete its task
and be pending forever.

Please check that module and why gets blocked.
For this purpose use manual module execution in debug mode:
/var/opt/OV/bin/instrumentation/UXMONbroker -d (case of HTTPS – OVO8 agent)
/var/opt/OV/bin/OpC/cmds/UXMONbroker -d (case of DCE – OVO7 agent)

If needed disable that specific module.

Please, inform the UX Admin or Technical Lead of this box about this situation

A module has been running for more than 5 minutes (the default timeout). You can increase the timeout or investigate why the module takes longer than 5 minutes.

root@solaris:/ # cp /var/opt/OV/bin/instrumentation/UXMONbroker.cfg /var/opt/OV/conf/OpC

root@solaris:/ # vi /var/opt/OV/conf/OpC/UXMONbroker.cfg
##########################
# TIMEOUT for OS commands. It will wait only those seconds, when this is timeout, it will trigger ‘got block’ alarm.
$UXMON_OS_TIMEOUT = 1800 ;
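To confirm which timeout a given copy of the file actually sets, a small helper like the one below can read the value back. This is only a sketch: the function name uxmon_os_timeout is made up for illustration, and it assumes the setting sits on a single line in the form shown above.

```shell
# Print the number assigned to $UXMON_OS_TIMEOUT in a UXMONbroker.cfg-style file.
# Assumption: the setting is one uncommented line such as
#   $UXMON_OS_TIMEOUT = 1800 ;
# uxmon_os_timeout is a hypothetical helper name, not part of UXMON.
uxmon_os_timeout() {
    sed -n 's/^[[:space:]]*\$UXMON_OS_TIMEOUT[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$1"
}
```

For example, uxmon_os_timeout /var/opt/OV/conf/OpC/UXMONbroker.cfg should print 1800 after the edit above. Commented lines are ignored because the pattern is anchored at the start of the line.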

resize2fs: Filesystem has unsupported feature(s) while trying to open

I tried to run resize2fs on a logical volume after extending it, and it failed with errors saying that the filesystem has unsupported features and that no valid filesystem superblock could be found.

root@linux:~ # resize2fs /dev/mapper/vg00-tmpvol
resize2fs 1.39 (29-May-2006)
resize2fs: Filesystem has unsupported feature(s) while trying to open /dev/mapper/vg00-tmpvol
Couldn't find valid filesystem superblock.

Check if the filesystem is ext4

root@linux:~ # mount | grep tmpvol
/dev/mapper/vg00-tmpvol on /tmp type ext4 (rw)

Use the ext4 versions of the tools: tune4fs and resize4fs.
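The check and the choice of tool can be sketched as two small functions. The function names are hypothetical, and the sketch assumes a system like the one above, where the ext4 tools are installed under the separate names tune4fs and resize4fs. On newer distributions resize2fs handles ext4 itself, so this mapping would not apply there.

```shell
# Extract the filesystem type of a mount point from `mount` output read on stdin.
# Expects lines like: /dev/mapper/vg00-tmpvol on /tmp type ext4 (rw)
fstype_of_mountpoint() {
    awk -v mp="$1" '$2 == "on" && $3 == mp && $4 == "type" { print $5 }'
}

# Map a filesystem type to the resize tool to try.
# Assumption: ext4 tools are installed as separate *4fs binaries, as in this post.
resize_tool_for() {
    case "$1" in
        ext4)      echo resize4fs ;;
        ext2|ext3) echo resize2fs ;;
        *)         echo "unsupported filesystem type: $1" >&2; return 1 ;;
    esac
}
```

Usage: resize_tool_for "$(mount | fstype_of_mountpoint /tmp)" prints resize4fs for the volume above.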

Linux – command ls or df hangs on /

When you run ls or df -h, the commands appear to hang.

Check if you have NFS shares mounted

root@linux:~ # mount | grep nfs
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
nfsd on /proc/fs/nfsd type nfsd (rw)
mdmMPC:/export/sapmnt/MPC on /sapmnt/MPC type nfs (rw,nfsvers=3,proto=tcp,noac,soft,sloppy,addr=10.106.10.118)
linuxnfs25:/oracle/HP0/sapdata1/DUMP on /dump type nfs (rw,addr=142.40.81.32)
nfshp0:/export/sapmnt/HP0/exe on /sapmnt/HP0/exe type nfs (rw,nfsvers=3,proto=udp,noac,soft,sloppy,addr=10.106.10.28)
nfshp0:/export/sapmnt/HP0/profile on /sapmnt/HP0/profile type nfs (rw,nfsvers=3,proto=udp,noac,soft,sloppy,addr=10.106.10.28)

The NFS server exporting /dump had been turned off, so I tried to unmount the share.

root@linux:~ # umount /dump
umount.nfs: /dump: device is busy
umount.nfs: /dump: device is busy

Forcing the unmount did not work either.

root@linux:~ # umount -f /dump
umount2: Device or resource busy
umount.nfs: /dump: device is busy
umount2: Device or resource busy
umount.nfs: /dump: device is busy

Use umount -l to do a lazy unmount.

root@linux:~ # umount -l /dump
root@linux:~ # df -h /dump
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/vgroot-lv_root
7.8G 818M 6.6G 11% /
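df hangs because it queries every mounted filesystem, including the NFS mount whose server no longer answers. To list only the NFS mount points that are worth testing, the mount output can be filtered as below. The function name nfs_mountpoints is made up for this sketch.

```shell
# Print the mount points of NFS filesystems from `mount` output read on stdin.
# Pseudo filesystems such as rpc_pipefs and nfsd are skipped because their
# type field is not exactly "nfs" or "nfs4".
nfs_mountpoints() {
    awk '$2 == "on" && $4 == "type" && ($5 == "nfs" || $5 == "nfs4") { print $3 }'
}
```

Usage: mount | nfs_mountpoints. Each mount point can then be probed one at a time, for example with timeout 5 stat "$mp" where GNU timeout is available (older systems may not ship it), so that a dead server shows up as a timeout instead of a hung shell.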

ping Error: Destination Host Unreachable in a Linux VMware guest

The server was unable to reach another server in the same network.

root@linux:~ # ping -c1 10.32.17.68
PING 10.32.17.68 (10.32.17.68) 56(84) bytes of data.
From 10.106.4.138: icmp_seq=1 Destination Host Unreachable

--- 10.32.17.68 ping statistics ---
1 packets transmitted, 0 received, +1 errors, 100% packet loss, time 0ms

When I ran traceroute, it showed (H!), which means host unreachable.

root@linux:~ # traceroute 10.32.17.68
traceroute to 10.32.17.68 (10.32.17.68), 30 hops max, 40 byte packets using UDP
1 * * *
2 * * *
3 * * *
4 * * *
5 * * *
6 linux.bkp.setaoffice.com (10.106.4.138)(H!) 2983.660 ms (H!) 2982.519 ms (H!) 2981.394 ms

This server is a VMware guest. Check whether the physical host has a configuration that prevents the connection from working.

root@linux:~ # dmidecode --type system
# dmidecode 2.9
SMBIOS 2.4 present.

Handle 0x0001, DMI type 1, 27 bytes
System Information
Manufacturer: VMware, Inc.
Product Name: VMware Virtual Platform
Version: None
Serial Number: VMware-42 10 66 9e 4f bf f3 14-93 30 e0 66 8f 0c 14 7c
UUID: 4210669E-4FBF-F314-9330-E0668F0C147C
Wake-up Type: Power Switch
SKU Number: Not Specified
Family: Not Specified

Handle 0x0081, DMI type 15, 29 bytes
System Event Log
Area Length: 16 bytes
Header Start Offset: 0x0000
Header Length: 16 bytes
Data Start Offset: 0x0010
Access Method: General-purpose non-volatile data functions
Access Address: 0x0000
Status: Invalid, Full
Change Token: 0x00000036
Header Format: Type 1
Supported Log Type Descriptors: 3
Descriptor 1: POST error
Data Format 1: POST results bitmap
Descriptor 2: Single-bit ECC memory error
Data Format 2: Multiple-event
Descriptor 3: Multi-bit ECC memory error
Data Format 3: Multiple-event

Handle 0x0105, DMI type 23, 13 bytes
System Reset
Status: Enabled
Watchdog Timer: Present
Boot Option: Do Not Reboot
Boot Option On Limit: Do Not Reboot
Reset Count: Unknown
Reset Limit: Unknown
Timer Interval: Unknown
Timeout: Unknown

Handle 0x0108, DMI type 32, 20 bytes
System Boot Information
Status: No errors detected

HP Data Protector – [61:1002] The VBDA named “backuphost.setaoffice.com [/usr/software/ctmagent]” on host hpux.setaoffice.com reached its inactivity timeout of 28800 seconds.

[Major] From: BSM@backuphost.setaoffice.com “OS_FILES_GRP_BACKUPHOST_DAILY_1800_03_ESL” Time: 1/11/2016 7:11:54 AM
[61:1002] The VBDA named “backuphost.setaoffice.com [/usr/software/ctmagent]” on host hpux.setaoffice.com
reached its inactivity timeout of 28800 seconds.
The agent on host will be shutdown.

The error message says that the Data Protector disk agent (VBDA) showed no activity for the whole timeout that was set (28800 seconds, or 8 hours), so the agent was shut down.

Listing the files in the filesystem recursively to see where it stalls

ls -lR /usr/software/ctmagent

The listing stops in this directory

/usr/software/ctmagent/ctm/install/perl/5_8_4/lib/5.8.4/IA64.ARCHREV_0-thread-multi-LP64

HP DDMI exit: rc = 6

root@linux:~ # /opt/hps/inventory/bin/HPS_SCANNER_linux-x86 -log:debug

HP Discovery and Dependency Mapping Inventory v9.32.003 Build 1130 linux-x86
(C) Copyright 1993-2015 Hewlett-Packard Development Company, L.P.
Includes GNU ISO C++ Library, GNU GCC Shared Support Library and GNU C Library, Copyright (C) 1987-2008 Free Software Foundation, Inc. released under LGPL, see the file COPYING.LIB for license details.

+ reading scanner parameters
Debug: Scanner PID: 16496
Debug: Scanner Stage: Initialization
Debug: wxString OSCreateTempFileName(const wxString&): Creating temp file in /tmp.
Debug: CSingleInstanceChecker::CSingleInstanceChecker(const wxString&): successfully created temp file: sclEnLody
end of scan
Debug: Scanner Status: end of scan
exit: rc = 6
Debug: Scanner Status: exit: rc = 6
Debug: Scanner Exitcode: 6
Debug: Scanner Stage: Exit
Debug: void CScannerApp::RemoveFileNameFromDeleteList(const wxString&, bool): path: /tmp/edscan.lck, delete? no
Debug: void CScannerApp::RemoveFileNameFromDeleteList(const wxString&, bool): path: , delete? yes
Debug: Stop to update scanner status!

Stopping DDMI

root@linux:~ # /etc/init.d/lw_agt stop
Checking status of Light Weight Agent:
LW Agent Is Running 11245
Stopping LW AGT…pid 11245
LW AGT Stopped

Check for any processes called HPS_Execute_DDMI and HPS_SCANNER, and terminate them.

root@linux:~ # ps -ef | grep HPS_Execute_DDMI
root 29866 28462 0 10:32 pts/0 00:00:00 grep HPS_Execute_DDMI

root@linux:~ # ps -ef | grep HPS_SCANNER
root 15424 1 0 Jan10 ? 00:00:09 /opt/hps/inventory/bin/HPS_SCANNER_linux-x86 -p:/var/log/hps/inventory -cfg:/opt/hps/inventory/bin/ddmi-unix-sw.cxz -l:/opt/hps/inventory/temp/local.xsf
root 16608 1 0 2015 ? 00:00:09 /opt/hps/inventory/bin/HPS_SCANNER_linux-x86 -p:/var/log/hps/inventory -cfg:/opt/hps/inventory/bin/ddmi-unix-sw.cxz -l:/opt/hps/inventory/temp/local.xsf
root 29872 28462 0 10:32 pts/0 00:00:00 grep HPS_SCANNER
root@linux:~ # kill -9 16608
root@linux:~ # kill -9 15424

Remove the lock file and start the agent

root@linux:~ # rm /tmp/edscan.lck

root@linux:~ # /etc/init.d/lw_agt start
Starting LW AGT….
Checking status of Light Weight Agent:
LW Agent Is Running 29884
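The ps | grep step above also lists the grep process itself. A small filter like this one picks out only the PIDs of the real processes. The function name pids_matching is hypothetical, and the sketch assumes Linux ps -ef output where the start time is a single field (Jan10, 2015, 10:32), so that the command begins in column 8.

```shell
# Print the PIDs of processes whose `ps -ef` line (read on stdin) contains
# the given string, skipping the grep process that searched for it.
# Assumption: the STIME column has no embedded space, so the command is field 8.
pids_matching() {
    awk -v pat="$1" '$8 != "grep" && index($0, pat) > 0 { print $2 }'
}
```

Usage: ps -ef | pids_matching HPS_SCANNER. It is usually worth trying a plain kill (SIGTERM) on those PIDs first and keeping kill -9 for processes that do not exit.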

DMESG-UNCLASSIFIED: ia64dsk: The disk for dev_t bc4e0100 appears to have grown since the partition t

Node : hpux.setaoffice.com
Node Type : Itanium 64/32(HTTPS)
Severity : major
OM Server Time: 2016-01-02 21:07:51
Message : DMESG-UNCLASSIFIED: ia64dsk: The disk for dev_t bc4e0100 appears to have grown since the partition table was written.
Msg Group : OS
Application : HPUX_dmesg
Object : dmesg_UNCLASSIFIED
Event Type :
not_found

Instance Name :
not_found

Instruction : No

Checking which disks are alarming

root@hpux:~ # dmesg | grep ia64dsk
ia64dsk: The disk for dev_t bc420100 appears to have grown since the partition table was written.
ia64dsk: The disk for dev_t bc420100 appears to have grown since the partition table was written.
ia64dsk: The disk for dev_t bc420100 appears to have grown since the partition table was written.
ia64dsk: The disk for dev_t bc420100 appears to have grown since the partition table was written.
ia64dsk: The disk for dev_t bc420100 appears to have grown since the partition table was written.
ia64dsk: The disk for dev_t bc420100 appears to have grown since the partition table was written.
ia64dsk: The disk for dev_t bc420100 appears to have grown since the partition table was written.
ia64dsk: The disk for dev_t bc420100 appears to have grown since the partition table was written.
ia64dsk: The disk for dev_t bc430100 appears to have grown since the partition table was written.
ia64dsk: The disk for dev_t bc430100 appears to have grown since the partition table was written.
ia64dsk: The disk for dev_t bc430100 appears to have grown since the partition table was written.
ia64dsk: The disk for dev_t bc430100 appears to have grown since the partition table was written.
ia64dsk: The disk for dev_t bc430100 appears to have grown since the partition table was written.
ia64dsk: The disk for dev_t bc430100 appears to have grown since the partition table was written.
ia64dsk: The disk for dev_t bc430100 appears to have grown since the partition table was written.
ia64dsk: The disk for dev_t bc430100 appears to have grown since the partition table was written.
ia64dsk: The disk for dev_t bc430100 appears to have grown since the partition table was written.

Checking system device

root@hpux:~ # ls -ltraR /dev | grep 420100
crw-r----- 1 bin sys 188 0x420100 Feb 12 2015 c66t0d1
brw-r----- 1 bin sys 31 0x420100 Feb 12 2015 c66t0d1
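The dmesg output repeats the same message many times, and the dev_t value has to be matched by hand against the major and minor numbers in the ls listing. The two helpers below sketch both steps. The function names are made up, and the split assumes an 8-digit hexadecimal dev_t with the major number in the first two digits and the minor number in the last six, which matches the output in this post (0xbc is 188, the major number of the character device).

```shell
# Summarise ia64dsk messages read on stdin: one line per dev_t with a count.
ia64dsk_devts() {
    awk '/ia64dsk:/ { for (i = 1; i < NF; i++) if ($i == "dev_t") n[$(i + 1)]++ }
         END { for (d in n) print d, n[d] }' | sort
}

# Split an 8-hex-digit dev_t into "major minor" as printed by ls -l.
# Assumption: 2 hex digits of major followed by 6 hex digits of minor.
devt_split() {
    printf '%d 0x%s\n' "0x${1%??????}" "${1#??}"
}
```

Usage: dmesg | ia64dsk_devts lists each affected dev_t once, and devt_split bc420100 gives the pair to look for in /dev.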

Checking hardware path

root@hpux:~ # ioscan -kfnC disk /dev/dsk/c66t0d1
Class I H/W Path Driver S/W State H/W Type Description
==================================================================
disk 48 0/2/1/1.10.234.64.0.0.1 sdisk CLAIMED DEVICE 3PARdataVV
/dev/dsk/c66t0d1 /dev/dsk/c66t0d1s2 /dev/rdsk/c66t0d1 /dev/rdsk/c66t0d1s2
/dev/dsk/c66t0d1s1 /dev/dsk/c66t0d1s3 /dev/rdsk/c66t0d1s1 /dev/rdsk/c66t0d1s3

root@hpux:~ # ioscan -m hwpath -H 0/2/1/1.10.234.64.0.0.1
Lun H/W Path Lunpath H/W Path Legacy H/W Path
====================================================================
64000/0xfa00/0x9a
0/2/1/1.0x21230002ac001673.0x4001000000000000 0/2/1/1.10.218.65.0.0.1
0/2/1/1.10.234.64.0.0.1

root@hpux:~ # scsimgr -f replace_wwid -H 64000/0xfa00/0x9a
scsimgr: Successfully validated binding of LUN paths with new LUN.

root@hpux:~ # ls -ltraR /dev | grep 430100
crw-r----- 1 bin sys 188 0x430100 Feb 12 2015 c67t0d1
brw-r----- 1 bin sys 31 0x430100 Feb 12 2015 c67t0d1

root@hpux:~ # ioscan -kfnC disk /dev/dsk/c67t0d1
Class I H/W Path Driver S/W State H/W Type Description
==================================================================
disk 59 0/5/1/0.20.218.64.0.0.1 sdisk CLAIMED DEVICE 3PARdataVV
/dev/dsk/c67t0d1 /dev/dsk/c67t0d1s2 /dev/rdsk/c67t0d1 /dev/rdsk/c67t0d1s2
/dev/dsk/c67t0d1s1 /dev/dsk/c67t0d1s3 /dev/rdsk/c67t0d1s1 /dev/rdsk/c67t0d1s3

root@hpux:~ # ioscan -m hwpath -H 0/5/1/0.20.218.64.0.0.1
Lun H/W Path Lunpath H/W Path Legacy H/W Path
====================================================================
64000/0xfa00/0x9a
0/5/1/0.0x20240002ac001673.0x4001000000000000 0/5/1/0.20.218.64.0.0.1

root@hpux:~ # scsimgr -f replace_wwid -H 64000/0xfa00/0x9a
scsimgr: Successfully validated binding of LUN paths with new LUN.

VxVM vxdiskunsetup ERROR V-5-2-2397 Veritas Disk Name: Device address must be of the form _

While removing a disk from Veritas Volume Manager, I got the following error message

root@solaris:/ # vxdg -g documentumdg rmdisk documentum01
root@solaris:/ # vxdiskunsetup -C documentum01
VxVM vxdiskunsetup ERROR V-5-2-2397 documentum01: Device address must be of the form _ where
is the logical name of the enclosure to which the
disk belongs
is the logical number of the disk

I needed to specify the device by its enclosure name and disk number for the command to run without errors

root@solaris:/ # vxdisk -o alldgs -e list | grep documentum01
dm documentum01 ibm_ds8x000_3134 - 62816000 - - - -

root@solaris:/ # vxdiskunsetup -C ibm_ds8x000_3134
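The lookup from disk media name to device name can be scripted. The sketch below only parses a listing in the format shown above (a dm record with the media name in the second column and the device name in the third). The function name vx_device_for is hypothetical.

```shell
# Print the device name for a disk media name, given a listing on stdin
# in the format shown above:
#   dm documentum01 ibm_ds8x000_3134 - 62816000 - - - -
vx_device_for() {
    awk -v dm="$1" '$1 == "dm" && $2 == dm { print $3 }'
}
```

Usage: vxdisk -o alldgs -e list | vx_device_for documentum01, then pass the printed name to vxdiskunsetup -C.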

Disk removed on Linux and then LVM commands giving error: read failed after 0 of 4096 at 0: Input/output error

Some LUNs were removed from the server and now LVM commands give error messages

root@linux:~# pvs
/dev/mapper/350002ac4f691374a: read failed after 0 of 4096 at 0: Input/output error
/dev/mapper/350002ac4f691374a: read failed after 0 of 4096 at 107374116864: Input/output error
/dev/mapper/350002ac4f691374a: read failed after 0 of 4096 at 107374174208: Input/output error
/dev/mapper/350002ac4f691374a: read failed after 0 of 4096 at 4096: Input/output error
/dev/vgHP0ascs/lv_ASCS: read failed after 0 of 4096 at 0: Input/output error
/dev/vgHP0ascs/lv_ASCS: read failed after 0 of 4096 at 10737352704: Input/output error
/dev/vgHP0ascs/lv_ASCS: read failed after 0 of 4096 at 10737410048: Input/output error
/dev/vgHP0ascs/lv_ASCS: read failed after 0 of 4096 at 4096: Input/output error
/dev/vgHP0ascs/lv_NOVELL_RemoteLoader: read failed after 0 of 4096 at 0: Input/output error
/dev/vgHP0ascs/lv_NOVELL_RemoteLoader: read failed after 0 of 4096 at 117374976: Input/output error
/dev/vgHP0ascs/lv_NOVELL_RemoteLoader: read failed after 0 of 4096 at 117432320: Input/output error
/dev/vgHP0ascs/lv_NOVELL_RemoteLoader: read failed after 0 of 4096 at 4096: Input/output error
/dev/vgHP0ascs/lv_sapmnt_global: read failed after 0 of 4096 at 0: Input/output error
/dev/vgHP0ascs/lv_sapmnt_global: read failed after 0 of 4096 at 10737352704: Input/output error
/dev/vgHP0ascs/lv_sapmnt_global: read failed after 0 of 4096 at 10737410048: Input/output error
/dev/vgHP0ascs/lv_sapmnt_global: read failed after 0 of 4096 at 4096: Input/output error
/dev/vgHP0ascs/lv_sapmnt_profile: read failed after 0 of 4096 at 0: Input/output error
/dev/vgHP0ascs/lv_sapmnt_profile: read failed after 0 of 4096 at 2147418112: Input/output error
/dev/vgHP0ascs/lv_sapmnt_profile: read failed after 0 of 4096 at 2147475456: Input/output error
/dev/vgHP0ascs/lv_sapmnt_profile: read failed after 0 of 4096 at 4096: Input/output error
/dev/vgHP0ascs/lv_usrsapHP0: read failed after 0 of 4096 at 0: Input/output error
/dev/vgHP0ascs/lv_usrsapHP0: read failed after 0 of 4096 at 7516127232: Input/output error
/dev/vgHP0ascs/lv_usrsapHP0: read failed after 0 of 4096 at 7516184576: Input/output error
/dev/vgHP0ascs/lv_usrsapHP0: read failed after 0 of 4096 at 4096: Input/output error
/dev/vgHP0ascs/lv_sapmnt: read failed after 0 of 4096 at 0: Input/output error
/dev/vgHP0ascs/lv_sapmnt: read failed after 0 of 4096 at 20971454464: Input/output error
/dev/vgHP0ascs/lv_sapmnt: read failed after 0 of 4096 at 20971511808: Input/output error
/dev/vgHP0ascs/lv_sapmnt: read failed after 0 of 4096 at 4096: Input/output error
/dev/vgHP0ascs/lv_sapmnt_exe: read failed after 0 of 4096 at 0: Input/output error
/dev/vgHP0ascs/lv_sapmnt_exe: read failed after 0 of 4096 at 5368643584: Input/output error
/dev/vgHP0ascs/lv_sapmnt_exe: read failed after 0 of 4096 at 5368700928: Input/output error
/dev/vgHP0ascs/lv_sapmnt_exe: read failed after 0 of 4096 at 4096: Input/output error
PV VG Fmt Attr PSize PFree
/dev/sda2 vgroot lvm2 a-- 99.50g 38.74g

Use dmsetup to find each stale device that needs to be removed, and then remove it. I worked from the bottom of the pvs output to the top, removing the logical volumes first and the multipath device last.

root@linux:~# dmsetup ls | grep lv_sapmnt_exe
vgHP0ascs-lv_sapmnt_exe (253:21)
root@linux:~# dmsetup remove vgHP0ascs-lv_sapmnt_exe

root@linux:~# dmsetup ls | grep lv_sapmnt
vgHP0ascs-lv_sapmnt_profile (253:18)
vgHP0ascs-lv_sapmnt_global (253:17)
vgHP0ascs-lv_sapmnt (253:20)
root@linux:~# dmsetup remove vgHP0ascs-lv_sapmnt

root@linux:~# dmsetup ls | grep lv_usrsapHP0
vgHP0ascs-lv_usrsapHP0 (253:19)
root@linux:~# dmsetup remove vgHP0ascs-lv_usrsapHP0

root@linux:~# dmsetup ls | grep lv_sapmnt_profile
vgHP0ascs-lv_sapmnt_profile (253:18)
root@linux:~# dmsetup remove vgHP0ascs-lv_sapmnt_profile

root@linux:~# dmsetup ls | grep lv_sapmnt_global
vgHP0ascs-lv_sapmnt_global (253:17)
root@linux:~# dmsetup remove vgHP0ascs-lv_sapmnt_global

root@linux:~# dmsetup ls | grep lv_NOVELL_RemoteLoader
vgHP0ascs-lv_NOVELL_RemoteLoader (253:16)
root@linux:~# dmsetup remove vgHP0ascs-lv_NOVELL_RemoteLoader

root@linux:~# dmsetup ls | grep lv_ASCS
vgHP0ascs-lv_ASCS (253:15)
root@linux:~# dmsetup remove vgHP0ascs-lv_ASCS

root@linux:~# dmsetup ls | grep 350002ac4f691374a
350002ac4f691374a (253:14)
root@linux:~# dmsetup remove 350002ac4f691374a

After that, pvs no longer shows error messages

root@linux:~# pvs
PV VG Fmt Attr PSize PFree
/dev/sda2 vgroot lvm2 a-- 99.50g 38.74g
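With many logical volumes, the list of device-mapper names to remove can be generated instead of grepped one by one. This is a sketch with a made-up function name, and it assumes the volume group name contains no hyphen (LVM doubles hyphens inside device-mapper names, which this simple prefix match does not handle).

```shell
# From `dmsetup ls` output on stdin, print the device-mapper names that
# belong to the given volume group (names that start with "<vg>-").
# Assumption: the VG name itself contains no hyphen.
dm_names_for_vg() {
    awk -v vg="$1" 'index($1, vg "-") == 1 { print $1 }'
}
```

Usage: dmsetup ls | dm_names_for_vg vgHP0ascs prints the names to pass to dmsetup remove. Review the list before removing anything, make sure none of the volumes is still mounted, and remove the multipath device (350002ac4f691374a here) only after the logical volumes on top of it are gone.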

DBSPI20-1: The configuration file /var/opt/OV/dbspi/dbtab does not exist. Please configure this node

Node : linux.setaoffice.com
Node Type : Intel/AMD x64(HTTPS)
Severity : major
OM Server Time: 2015-12-22 18:30:07
Message : DBSPI20-1: The configuration file /var/opt/OV/dbspi/dbtab does not exist. Please configure this node.
Msg Group : DBSPI
Application : Oracle
Object : dbspicao
Event Type : DBSPI
Instance Name :
not_found

Instruction : DBSPI20-1: The configuration file does not exist. Please configure this node.

Probable Cause: The monitored node has not been configured for DB-SPI, or the node to which
the DB-SPI metric templates were distributed has no databases to monitor.

Suggested Action: Use the DBSPI Config application in the IT/O Application Bank
to configure one or more databases to monitor as explained in the DB-SPI User’s
Guide.

If you do not intend to monitor databases on this system, modify the IT/O template assignment
to exclude the node from DBSPI template distribution.

Troubleshooting Tips in DBMon User Guide [ftp://internalftp.setaoffice.com/DBMON/Documents/DBMon-User_Guide.pdf] may provide more detailed suggestions.

In this case you need to create a configuration file /var/opt/OV/dbspi/dbtab and then set the parameters according to your organization.

Alternatively, remove the DB-SPI monitoring if there is no database on the server.