
Tag Archives: multipath

One path missing in disk map on multipath device

Showing a particular case:

The disk mpath5 was only showing one path

root@linux:~ # multipath -ll mpath5
mpath5 (350002ac19430374a) dm-17 3PARdata,VV
[size=47G][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=1][active]
\_ 2:0:0:1 sdh 8:112 [active][ready]
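
A quick way to spot a map with a missing path is to count the path lines in the multipath -ll output. The sketch below assumes the output format shown above (one line per path, holding a host:channel:target:LUN tuple followed by an sdX name). count_paths is a hypothetical helper name, not part of multipath-tools.

```shell
# count_paths: hypothetical helper, not part of multipath-tools.
# Reads `multipath -ll <map>` output on stdin and prints the number of
# path lines, i.e. lines holding an H:C:T:L tuple followed by an sdX name.
count_paths() {
    # grep -c prints 0 and exits non-zero when nothing matches; mask that.
    grep -cE '[0-9]+:[0-9]+:[0-9]+:[0-9]+ +sd[a-z]+ ' || true
}

# Usage on a live system (needs root and multipath-tools):
#   multipath -ll mpath5 | count_paths
```

For the output above this prints 1, which is what triggered the investigation.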

The disk used by the operating system is cciss/c0d0

root@linux:~ # pvs
  PV                           VG    Fmt  Attr PSize   PFree
  /dev/cciss/c0d0p3            vg00  lvm2 a--  269.47G 203.28G
  /dev/mpath/350002ac19429374a vgapp lvm2 a--  100.00G       0
  /dev/mpath/350002ac1942c374a vgapp lvm2 a--   20.00G       0
  /dev/mpath/350002ac1942e374a vgapp lvm2 a--   75.00G       0
  /dev/mpath/350002ac1942f374a vgapp lvm2 a--  158.00G       0
  /dev/mpath/350002ac19430374a vgapp lvm2 a--   47.00G 996.00M
  /dev/mpath/350002ac22869374a vgapp lvm2 a--  100.00G       0
  /dev/mpath/350002ac2286a374a vgapp lvm2 a--   40.00G       0

Listing the SCSI devices. sda through sdn are used

root@linux:~ # lsscsi
[1:0:0:1] disk 3PARdata VV 3213 /dev/sda
[1:0:0:2] disk 3PARdata VV 3213 /dev/sdb
[1:0:0:3] disk 3PARdata VV 3213 /dev/sdc
[1:0:0:4] disk 3PARdata VV 3213 /dev/sdd
[1:0:0:5] disk 3PARdata VV 3213 /dev/sde
[1:0:0:6] disk 3PARdata VV 3213 /dev/sdf
[1:0:0:7] disk 3PARdata VV 3213 /dev/sdg
[1:0:0:254] enclosu 3PARdata SES 3213 -
[2:0:0:1] disk 3PARdata VV 3213 /dev/sdh
[2:0:0:2] disk 3PARdata VV 3213 /dev/sdi
[2:0:0:3] disk 3PARdata VV 3213 /dev/sdj
[2:0:0:4] disk 3PARdata VV 3213 /dev/sdk
[2:0:0:5] disk 3PARdata VV 3213 /dev/sdl
[2:0:0:6] disk 3PARdata VV 3213 /dev/sdm
[2:0:0:7] disk 3PARdata VV 3213 /dev/sdn
[2:0:0:254] enclosu 3PARdata SES 3213 -

Checking /etc/multipath.conf showed that sda was being blacklisted, so I commented out that line

root@linux:~ # grep -v ^# /etc/multipath.conf

blacklist {
devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
devnode "^hd[a-z][[0-9]*]"
devnode "^hd[a-z]"
#devnode "^sda$"
}

defaults {
user_friendly_names yes
}
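
To see why a blacklist entry hides a path, a devnode pattern can be tried against device names before editing the file. This is only a sketch: devnode_matches is a hypothetical helper, and it assumes that grep -E behaves closely enough to the extended regular expression matching multipath applies to devnode entries.

```shell
# devnode_matches: hypothetical helper for trying out blacklist patterns.
# $1 = device node name as multipath sees it (e.g. sda)
# $2 = devnode regular expression taken from /etc/multipath.conf
# Assumption: grep -E approximates multipath's own regex matching.
devnode_matches() {
    printf '%s\n' "$1" | grep -qE -- "$2"
}

# Usage:
#   devnode_matches sda '^sda$' && echo "sda would be blacklisted"
#   devnode_matches sdh '^sda$' || echo "sdh would not be blacklisted"
```

With the entry "^sda$" active, sda is excluded while sdh is not, which matches the single path seen on mpath5.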

Running multipath -v3

root@linux:~ # multipath -v3

Checking disk mpath5

root@linux:~ # multipath -ll mpath5
mpath5 (350002ac19430374a) dm-17 3PARdata,VV
[size=47G][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=1][active]
\_ 1:0:0:1 sda 8:0 [active][ready]
\_ 2:0:0:1 sdh 8:112 [active][ready]


Adding new disks to Oracle ASM (Automatic Storage Management)

I have a two-node Oracle RAC and I need to add disks to it

The storage team presented 3 new LUNs

rac1 & rac2
RAC1RAC2_PROJECT335066_Data1 (Tier1, RAID5, Size: 50GB)
World Wide LUN ID# 6001-4380-05de-d87b-0000-5000-10ef-0000
RAC1RAC2_PROJECT335066_Logs1 (Tier1, RAID1, Size: 20GB)
World Wide LUN ID# 6001-4380-05de-d87b-0000-5000-10f3-0000
RAC1RAC2_PROJECT335066_Quo (Tier1, RAID5, Size: 1GB)
World Wide LUN ID# 6001-4380-05de-d87b-0000-5000-10f7-0000

10ef
10f3
10f7
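
On this system the WWID that multipath reports is the storage team's World Wide LUN ID with the dashes removed and a leading 3 (compare the IDs above with the wwid values added to /etc/multipath.conf further down). The sketch below encodes that observation. lun_id_to_wwid is a hypothetical helper, and the leading 3 is an assumption that holds for the LUNs in this post; confirm the result against multipath -ll before using it.

```shell
# lun_id_to_wwid: hypothetical helper.
# Converts a dashed World Wide LUN ID into the WWID form seen in
# multipath -ll on this system: leading "3", dashes removed, lowercase.
# Assumption: the "3" prefix applies to your array as it did here.
lun_id_to_wwid() {
    printf '3%s\n' "$1" | sed 's/-//g' | tr '[:upper:]' '[:lower:]'
}

# Usage:
#   lun_id_to_wwid 6001-4380-05de-d87b-0000-5000-10ef-0000
```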

My server has a Fibre Channel HBA card, so the first step is to check the HBA information with systool. If you don't have systool, install libsysfs and sysfsutils (yum install sysfsutils). I want to use an internal repository, so I installed the RPMs directly instead of using yum

root@rac1# systool -av -c fc_host
-bash: systool: command not found

root@rac1:~ # rpm -ivh http://172.22.19.185/rhel/redhat/rhel-x86_64-server-6/getPackage/libsysfs-2.1.0-7.el6.x86_64.rpm http://172.22.19.185/rhel/redhat/rhel-x86_64-server-6/getPackage/sysfsutils-2.1.0-7.el6.x86_64.rpm
Retrieving http://172.22.19.185/rhel/redhat/rhel-x86_64-server-6/getPackage/libsysfs-2.1.0-7.el6.x86_64.rpm
Retrieving http://172.22.19.185/rhel/redhat/rhel-x86_64-server-6/getPackage/sysfsutils-2.1.0-7.el6.x86_64.rpm
Preparing...                ########################################### [100%]
   1:libsysfs               ########################################### [ 50%]
   2:sysfsutils             ########################################### [100%]

Rescan the SCSI hosts behind the HBA so the new LUNs show up. First print the rescan commands, then run the same pipeline again with bash at the end to execute them

root@rac1:~ # systool -av -c fc_host | grep "Class Device =" | awk -F'=' {'print $2'} | awk -F'"' {'print "echo \"- - -\" > /sys/class/scsi_host/"$2"/scan"'}
echo "- - -" > /sys/class/scsi_host/host0/scan
echo "- - -" > /sys/class/scsi_host/host1/scan

root@rac1:~ # systool -av -c fc_host | grep "Class Device =" | awk -F'=' {'print $2'} | awk -F'"' {'print "echo \"- - -\" > /sys/class/scsi_host/"$2"/scan"'} | bash
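
The same rescan commands can be generated with a loop over /sys/class/fc_host, which avoids the dependency on systool. This is a sketch, not a replacement for the command above: print_scan_commands is a hypothetical name, and the optional sysfs root argument exists only so the function can be tried against a test directory.

```shell
# print_scan_commands: hypothetical helper.
# Prints one rescan command per Fibre Channel host found under
# <sysfs_root>/class/fc_host. $1 = sysfs root, default /sys.
print_scan_commands() {
    sysfs_root=${1:-/sys}
    for host_dir in "$sysfs_root"/class/fc_host/host*; do
        # Skip the unexpanded glob when no FC host exists.
        [ -d "$host_dir" ] || continue
        host=${host_dir##*/}
        printf 'echo "- - -" > %s/class/scsi_host/%s/scan\n' "$sysfs_root" "$host"
    done
}

# Usage (as root): review the output first, then execute it.
#   print_scan_commands
#   print_scan_commands | bash
```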

Listing the disks under Oracle ASM

root@rac1:~ # /etc/init.d/oracleasm listdisks
OCR_VOTE_001
OCR_VOTE_002
OCR_VOTE_003
OCR_VOTE_004
OCR_VOTE_005
ORAARCH_001
ORADATA_001

Added the disk information to the multipaths section of /etc/multipath.conf

multipath {
wwid 36001438005ded87b0000500010ef0000
alias asmdisk07
}
multipath {
wwid 36001438005ded87b0000500010f30000
alias asmdisk08
}
multipath {
wwid 36001438005ded87b0000500010f70000
alias votdisk04
}

Then run multipath -r to reload the new multipath aliases

Check if the aliases changed

root@rac1:~ # multipath -ll | grep 10ef
asmdisk07 (36001438005ded87b0000500010ef0000) dm-44 HP,HSV450
root@rac1:~ # multipath -ll | grep 10f3
asmdisk08 (36001438005ded87b0000500010f30000) dm-45 HP,HSV450
root@rac1:~ # multipath -ll | grep 10f7
votdisk04 (36001438005ded87b0000500010f70000) dm-46 HP,HSV450

The DBA team asked me to change the owner and group of the disk devices

root@rac1:~ # chown oracle:dba /dev/mapper/asmdisk07
root@rac1:~ # chown oracle:dba /dev/mapper/asmdisk08
root@rac1:~ # chown oracle:dba /dev/mapper/votdisk04

Labeling disks

root@rac1:~ # /etc/init.d/oracleasm createdisk OCR_VOTE_006 votdisk04
Marking disk "OCR_VOTE_006" as an ASM disk: [ OK ]
root@rac1:~ # /etc/init.d/oracleasm createdisk ORADATA_002 asmdisk07
Marking disk "ORADATA_002" as an ASM disk: [ OK ]
root@rac1:~ # /etc/init.d/oracleasm createdisk ORADATA_003 asmdisk08
Marking disk "ORADATA_003" as an ASM disk: [ OK ]

Checking that the new disks are listed

root@rac1:~ # /etc/init.d/oracleasm listdisks
OCR_VOTE_001
OCR_VOTE_002
OCR_VOTE_003
OCR_VOTE_004
OCR_VOTE_005
ORAARCH_001
ORADATA_001
ORADATA_002
ORADATA_003

On the other node, the disks are not updated automatically

root@rac2:~ # /etc/init.d/oracleasm listdisks
OCR_VOTE_001
OCR_VOTE_002
OCR_VOTE_003
OCR_VOTE_004
OCR_VOTE_005
ORAARCH_001
ORADATA_001

You need to perform the same steps on the other node, but instead of running oracleasm createdisk, run oracleasm scandisks

root@rac2:~ # /etc/init.d/oracleasm scandisks
Scanning the system for Oracle ASMLib disks: [ OK ]

root@rac2:~ # /etc/init.d/oracleasm listdisks
OCR_VOTE_001
OCR_VOTE_002
OCR_VOTE_003
OCR_VOTE_004
OCR_VOTE_005
ORAARCH_001
ORADATA_001
ORADATA_002
ORADATA_003
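
To confirm that both nodes see the same ASM disks, the two listdisks outputs can be compared. A minimal sketch, assuming each node's output has been saved to a file; missing_disks and the file names in the usage comment are hypothetical.

```shell
# missing_disks: hypothetical helper.
# $1 and $2 are files holding `oracleasm listdisks` output from two nodes.
# Prints the disk labels present in $1 but absent from $2.
missing_disks() {
    # -F fixed strings, -x whole-line match, -v invert, -f patterns from file.
    # grep exits non-zero when nothing is missing; mask that.
    grep -vxFf "$2" "$1" || true
}

# Usage (file names are examples only):
#   /etc/init.d/oracleasm listdisks > /tmp/rac1.txt    # on rac1
#   /etc/init.d/oracleasm listdisks > /tmp/rac2.txt    # on rac2
#   missing_disks /tmp/rac1.txt /tmp/rac2.txt
```

An empty result after oracleasm scandisks means the second node has picked up every label known to the first.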

Clustered Linux server showing device-mapper: multipath: Failing path in /var/log/messages

I have a disk presented to 4 servers.

Every day we receive a notification saying that a specific multipathed disk has lost all paths.

The log shows a SCSI reservation conflict on the disk. SCSI persistent reservations control the access of each node to shared storage devices, so a write from a node that is not allowed by the reservation is rejected, and multipath then fails the path

May 11 13:35:04 linux kernel: sd 0:0:0:38: reservation conflict
May 11 13:35:04 linux kernel: sd 0:0:0:38: [sdag] Result: hostbyte=DID_OK driverbyte=DRIVER_OK
May 11 13:35:04 linux kernel: sd 0:0:0:38: [sdag] CDB: Write(10): 2a 00 00 00 14 50 00 00 08 00

May 11 13:35:04 linux kernel: end_request: critical nexus error, dev sdag, sector 5200
May 11 13:35:04 linux kernel: device-mapper: multipath: Failing path 66:0. <-- sdag
May 11 13:35:04 linux kernel: sd 1:0:0:38: reservation conflict
May 11 13:35:04 linux kernel: sd 1:0:0:38: [sdeh] Result: hostbyte=DID_OK driverbyte=DRIVER_OK
May 11 13:35:04 linux kernel: sd 1:0:0:38: [sdeh] CDB: Write(10): 2a 00 00 00 14 50 00 00 08 00
May 11 13:35:04 linux kernel: end_request: critical nexus error, dev sdeh, sector 5200
May 11 13:35:04 linux kernel: device-mapper: multipath: Failing path 128:144. <-- sdeh
May 11 13:35:04 linux multipathd: 66:0: mark as failed
May 11 13:35:04 linux multipathd: PP0_oraarch_disk_001: remaining active paths: 3

May 11 13:35:04 linux kernel: sd 0:0:1:38: reservation conflict
May 11 13:35:04 linux kernel: sd 0:0:1:38: [sdcc] Result: hostbyte=DID_OK driverbyte=DRIVER_OK
May 11 13:35:04 linux kernel: sd 0:0:1:38: [sdcc] CDB: Write(10): 2a 00 00 00 14 50 00 00 08 00
May 11 13:35:04 linux kernel: end_request: critical nexus error, dev sdcc, sector 5200
May 11 13:35:04 linux kernel: device-mapper: multipath: Failing path 69:0. <-- sdcc
May 11 13:35:04 linux kernel: sd 1:0:1:38: reservation conflict
May 11 13:35:04 linux kernel: sd 1:0:1:38: [sdgg] Result: hostbyte=DID_OK driverbyte=DRIVER_OK
May 11 13:35:04 linux kernel: sd 1:0:1:38: [sdgg] CDB: Write(10): 2a 00 00 00 14 50 00 00 08 00
May 11 13:35:04 linux kernel: end_request: critical nexus error, dev sdgg, sector 5200
May 11 13:35:04 linux kernel: device-mapper: multipath: Failing path 131:192. <-- sdgg
May 11 13:35:04 linux kernel: end_request: critical nexus error, dev dm-209, sector 5200

May 11 13:35:05 linux multipathd: 128:144: mark as failed
May 11 13:35:05 linux multipathd: PP0_oraarch_disk_001: remaining active paths: 2
May 11 13:35:05 linux multipathd: 69:0: mark as failed
May 11 13:35:05 linux multipathd: PP0_oraarch_disk_001: remaining active paths: 1
May 11 13:35:05 linux multipathd: 131:192: mark as failed
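
When the log is long, the failing paths can be pulled out with a small filter. The sketch below assumes the kernel message format shown above; failing_paths is a hypothetical helper name.

```shell
# failing_paths: hypothetical helper.
# Reads syslog lines on stdin and prints the major:minor of every path the
# kernel reported with "multipath: Failing path", duplicates removed.
failing_paths() {
    sed -n 's/.*multipath: Failing path \([0-9][0-9]*:[0-9][0-9]*\)\..*/\1/p' | sort -u
}

# Usage:
#   failing_paths < /var/log/messages
#
# If sg3_utils is installed (an assumption, it is not used in this post),
# the registered reservation keys on a path can be read with:
#   sg_persist --in --read-keys /dev/sdag
```

Each major:minor pair can be matched to its sdX name in the multipath -ll output for the map.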

PP0_oraarch_disk_001 (350002ad05071374b) dm-209 3PARdata,VV
size=300G features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
`-+- policy='round-robin 0' prio=0 status=active
|- 0:0:0:38 sdag 66:0 active undef running
|- 1:0:0:38 sdeh 128:144 active undef running
|- 0:0:1:38 sdcc 69:0 active undef running
`- 1:0:1:38 sdgg 131:192 active undef running

Since these alerts were opening support tickets every day, I removed monitoring for this disk

LVM: message "read failed after 0 of 4096 at 4096" or similar appears after a disk is removed

On this server, pvs was showing a lot of input/output errors

root@linux:~ # pvs
/dev/sdq: read failed after 0 of 4096 at 0: Input/output error
/dev/sdq1: read failed after 0 of 2048 at 0: Input/output error
/dev/mpath/disk2: read failed after 0 of 4096 at 80530571264: Input/output error
/dev/mpath/disk2: read failed after 0 of 4096 at 80530628608: Input/output error
/dev/mpath/disk2: read failed after 0 of 4096 at 0: Input/output error
/dev/mpath/disk2: read failed after 0 of 4096 at 4096: Input/output error
/dev/mpath/disk2p1: read failed after 0 of 512 at 80525328384: Input/output error
/dev/mpath/disk2p1: read failed after 0 of 512 at 80525447168: Input/output error
/dev/mpath/disk2p1: read failed after 0 of 512 at 0: Input/output error
/dev/mpath/disk2p1: read failed after 0 of 512 at 4096: Input/output error
/dev/mpath/disk2p1: read failed after 0 of 2048 at 0: Input/output error
/dev/devvg/u01lv: read failed after 0 of 4096 at 21474770944: Input/output error
/dev/devvg/u01lv: read failed after 0 of 4096 at 21474828288: Input/output error
/dev/devvg/u01lv: read failed after 0 of 4096 at 0: Input/output error
/dev/devvg/u01lv: read failed after 0 of 4096 at 4096: Input/output error
/dev/sde: read failed after 0 of 4096 at 0: Input/output error
/dev/sde1: read failed after 0 of 2048 at 0: Input/output error
/dev/sdw: read failed after 0 of 4096 at 0: Input/output error
/dev/sdw1: read failed after 0 of 2048 at 0: Input/output error
/dev/sdk: read failed after 0 of 4096 at 0: Input/output error
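
The error list is easier to read when reduced to the affected devices. A sketch that assumes the "read failed after" message format shown above; failed_devices is a hypothetical helper name.

```shell
# failed_devices: hypothetical helper.
# Reads LVM output on stdin and prints each device that reported
# "read failed after ...", one per line, duplicates removed.
failed_devices() {
    sed -n 's|^ *\(/dev/[^:]*\): read failed after .*|\1|p' | sort -u
}

# Usage (2>&1 so the error messages reach the filter):
#   pvs 2>&1 | failed_devices
```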

A disk had been removed from the server, but not cleanly: all four paths of disk2 were down

root@linux:~ # multipath -ll disk2
sde: checker msg is "tur checker reports path is down"
sdk: checker msg is "tur checker reports path is down"
sdq: checker msg is "tur checker reports path is down"
sdw: checker msg is "tur checker reports path is down"
disk2 (360000000000000000000000000000000) dm-25 HP,P2000 G3 FC
[size=75G][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=0][enabled]
\_ 1:0:0:5 sde 8:64 [failed][faulty]
\_ 1:0:1:5 sdk 8:160 [failed][faulty]
\_ 2:0:0:5 sdq 65:0 [failed][faulty]
\_ 2:0:1:5 sdw 65:96 [failed][faulty]

Removed the block devices

root@linux:~ # echo 1 > /sys/block/sde/device/delete
root@linux:~ # echo 1 > /sys/block/sdk/device/delete
root@linux:~ # echo 1 > /sys/block/sdq/device/delete
root@linux:~ # echo 1 > /sys/block/sdw/device/delete
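
With many failed paths, the device names can be taken from the multipath -ll output instead of being typed by hand. A sketch that assumes the older output format shown above, where a dead path line ends in [failed][faulty] and the sdX name is the third field; failed_path_devices is a hypothetical helper name.

```shell
# failed_path_devices: hypothetical helper.
# Reads `multipath -ll <map>` output on stdin and prints the sdX name of
# every path marked [failed][faulty] (third field of the path line).
failed_path_devices() {
    awk '/\[failed\]\[faulty\]/ { print $3 }'
}

# Usage (as root). Check the list before deleting anything:
#   multipath -ll disk2 | failed_path_devices
#   multipath -ll disk2 | failed_path_devices | while read -r dev; do
#       echo 1 > "/sys/block/$dev/device/delete"
#   done
```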

I was able to remove the device that device-mapper used but I was not able to remove the device with the partition

root@linux:~ # dmsetup remove disk2
root@linux:~ # dmsetup remove disk2p1
device-mapper: remove ioctl failed: Device or resource busy
Command failed

The logical volume /dev/devvg/u01lv, which held a filesystem, was still sitting on top of the partition device, so I had to remove its device-mapper device before I was able to remove disk2p1

root@linux:~ # dmsetup remove devvg-u01lv

root@linux:~ # dmsetup remove disk2p1
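
When dmsetup remove reports "Device or resource busy", the kernel's holders directory in sysfs shows which device is stacked on top. A sketch under assumptions: list_holders is a hypothetical helper, the dm-N names in the usage comment are made up for illustration, and the second argument exists only so the function can be tried against a test directory.

```shell
# list_holders: hypothetical helper.
# $1 = kernel block device name (e.g. dm-26)
# $2 = sysfs root, default /sys
# Prints the devices that hold $1 open; returns 1 if $1 is unknown.
list_holders() {
    holders_dir="${2:-/sys}/block/$1/holders"
    [ -d "$holders_dir" ] || return 1
    ls "$holders_dir"
}

# Usage (dm-26 is an example name, not taken from this server):
#   list_holders dm-26
# `dmsetup ls --tree` gives a similar view with map names.
```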

UXMON: mpathb – Only one path detected, no path redundancy

Node : linux.setaoffice.com
Node Type : Intel/AMD x64(HTTPS)
Severity : major
OM Server Time: 2015-10-14 12:39:19
Message : UXMON: mpathb – Only one path detected, no path redundancy
Msg Group : OS
Application : mpmon
Object : mp
Event Type : not_found
Instance Name : not_found

Instruction : <child CI outage:no> <parent CI outage:no> The multipathd -k"show map $device topology" command shows more details

Please check /var/opt/OV/log/OpC/mp_mon.log for more details

The log file complains about mpathb

root@linux:~ # cat /var/opt/OV/log/OpC/mp_mon.log
Wed Oct 14 13:39:13 2015 : INFO : UXMONmpmon is running now, pid=21954
Wed Oct 14 13:39:13 2015 : Major: mpathb – Only one path detected, no path redundancy
Wed Oct 14 13:39:13 2015 : INFO : UXMONmpmon end, pid=21954
Wed Oct 14 13:56:12 2015 : INFO : UXMONmpmon is running now, pid=29130
Wed Oct 14 13:56:12 2015 : Major: mpathb – Only one path detected, no path redundancy
Wed Oct 14 13:56:12 2015 : INFO : UXMONmpmon end, pid=29130
Wed Oct 14 14:13:13 2015 : INFO : UXMONmpmon is running now, pid=36813
Wed Oct 14 14:13:13 2015 : Major: mpathb – Only one path detected, no path redundancy
Wed Oct 14 14:13:13 2015 : INFO : UXMONmpmon end, pid=36813
Wed Oct 14 14:30:13 2015 : INFO : UXMONmpmon is running now, pid=44029
Wed Oct 14 14:30:13 2015 : Major: mpathb – Only one path detected, no path redundancy
Wed Oct 14 14:30:13 2015 : INFO : UXMONmpmon end, pid=44029
Wed Oct 14 14:47:12 2015 : INFO : UXMONmpmon is running now, pid=51897
Wed Oct 14 14:47:13 2015 : INFO : UXMONmpmon end, pid=51897
Wed Oct 14 15:04:12 2015 : INFO : UXMONmpmon is running now, pid=58833
Wed Oct 14 15:04:12 2015 : INFO : UXMONmpmon end, pid=58833

On this server mpathb is a local disk, so I added it to the multipath blacklist

root@linux:~ # vi /etc/multipath.conf
blacklist {
devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
devnode "^hd[a-z]"
devnode "^sd[ab]$"
devnode "^cciss!c[0-9]d[0-9]*"
}

If the server runs on VMware, you can safely disable this monitoring module. First copy its configuration file to the OpC configuration directory

root@linux:~ # cp /var/opt/OV/bin/instrumentation/mp_mon.cfg /var/opt/OV/conf/OpC/

In the configuration file /var/opt/OV/conf/OpC/mp_mon.cfg set disable to yes

root@linux:~ # vi /var/opt/OV/conf/OpC/mp_mon.cfg
disable = yes

multipath: /sbin/scsi_id exitted with 1 – cannot get the the wwid for cciss!c0d0

Whenever you run multipath, it shows the message "cannot get the the wwid for cciss!c0d0"

root@linux:~ # multipath -ll oradisk004
/sbin/scsi_id exitted with 1
cannot get the the wwid for cciss!c0d0
oradisk004 (360060e800573b800000073b8000012d2) dm-12 HP,OPEN-V
[size=50G][features=1 queue_if_no_path][hwhandler=1 hp-sw][rw]
\_ round-robin 0 [prio=8][active]
\_ 1:0:0:3 sdd 8:48 [active][ready]
\_ 2:0:0:3 sdh 8:112 [active][ready]

Edit /etc/multipath.conf and make sure the cciss device is blacklisted

blacklist {
devnode "^cciss!c[0-9]d[0-9]*"
}
