Red Hat Linux 5: qla2xxx 0000:05:00.0: scsi(3:4:0): Abort command issued -- 1 40e 2002

The server had two single-port HBA cards that were receiving a lot of reset commands.

root@linux:~ # lspci | grep -i fibre
05:00.0 Fibre Channel: QLogic Corp. ISP2432-based 4Gb Fibre Channel to PCI Express HBA (rev 03)
08:00.0 Fibre Channel: QLogic Corp. ISP2432-based 4Gb Fibre Channel to PCI Express HBA (rev 03)
0b:00.0 Fibre Channel: Emulex Corporation Zephyr-X LightPulse Fibre Channel Host Adapter (rev 02)
0b:00.1 Fibre Channel: Emulex Corporation Zephyr-X LightPulse Fibre Channel Host Adapter (rev 02)

Jun 12 04:11:47 linnux kernel: qla2xxx 0000:08:00.0: Performing ISP error recovery - ha= ffff8102abac04f8.
Jun 12 04:11:47 linnux kernel: qla2xxx 0000:05:00.0: scsi(3:4:0): Abort command issued -- 1 40e 2002.
Jun 12 04:11:47 linnux kernel: qla2xxx 0000:05:00.0: scsi(3:4:0): LOOP RESET ISSUED.
Jun 12 04:11:48 linnux kernel: qla2xxx 0000:05:00.0: qla2xxx_eh_bus_reset: reset succeeded
Jun 12 04:11:48 linnux kernel: qla2xxx 0000:08:00.0: LIP reset occured (f700).
Jun 12 04:11:48 linnux kernel: qla2xxx 0000:08:00.0: LIP occured (f700).
Jun 12 04:11:48 linnux kernel: qla2xxx 0000:08:00.0: LIP reset occured (f7f7).
Jun 12 04:11:48 linnux kernel: qla2xxx 0000:08:00.0: LOOP UP detected (4 Gbps).
Jun 12 04:11:50 linnux kernel: qla2xxx 0000:08:00.0: qla2xxx_eh_host_reset: reset succeeded
Jun 12 04:11:54 linnux kernel: lpfc 0000:0b:00.1: 1:(0):0713 SCSI layer issued Device Reset (11, 0) return x2002
Jun 12 04:12:01 linnux kernel: scsi 1:0:5:0: scsi: Device offlined - not ready after error recovery
Jun 12 04:12:01 linnux kernel: scsi 1:0:5:0: timing out command, waited 22s

We opened a ticket with Red Hat, and they told us to check the hardware.

The BUR team was asked to check the tape library, and the storage team was asked to check the SAN switch. Neither found any problems.

Because two different HBA cards in different PCI slots were having problems, we logged a ticket with HP to replace the system board.
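To see at a glance which adapters are affected, the reset and abort messages can be tallied per PCI address. The following is a minimal Python sketch, not part of the original troubleshooting. The function name `count_resets`, the keyword list, and the sample lines are assumptions chosen for illustration, with the message text taken from the log excerpts in this post.

```python
import re
from collections import Counter

# Captures the driver name and PCI address (e.g. 0000:05:00.0) from
# kernel log lines written by the qla2xxx and lpfc drivers.
LINE_RE = re.compile(
    r"kernel: (qla2xxx|lpfc) ([0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]):"
)

# Illustrative, case-sensitive substrings that mark an abort or reset
# request; this list is an assumption, not a complete catalogue.
KEYWORDS = (
    "Abort command issued",
    "RESET ISSUED",
    "ISP error recovery",
    "Device Reset",
    "Bus Reset",
    "Host Reset",
)


def count_resets(lines):
    """Return a Counter keyed by (driver, pci_address)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match and any(keyword in line for keyword in KEYWORDS):
            counts[(match.group(1), match.group(2))] += 1
    return counts


# A few sample lines in the same format as the log excerpt above.
sample = [
    "Jun 12 04:11:47 linnux kernel: qla2xxx 0000:08:00.0: Performing ISP error recovery - ha= ffff8102abac04f8.",
    "Jun 12 04:11:47 linnux kernel: qla2xxx 0000:05:00.0: scsi(3:4:0): Abort command issued -- 1 40e 2002.",
    "Jun 12 04:11:47 linnux kernel: qla2xxx 0000:05:00.0: scsi(3:4:0): LOOP RESET ISSUED.",
    "Jun 12 04:11:54 linnux kernel: lpfc 0000:0b:00.1: 1:(0):0713 SCSI layer issued Device Reset (11, 0) return x2002",
    "Jun 12 04:12:01 linnux kernel: scsi 1:0:5:0: timing out command, waited 22s",
]

counts = count_resets(sample)
for (driver, address), total in sorted(counts.items()):
    print(driver, address, total)
```

In practice the same function could be fed the lines of the system log file instead of the sample list. If several adapters from different vendors all show up in the tally, that points away from a single faulty card.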

The system board and riser card were replaced.

Jun 16 12:02:49 linnux kernel: lpfc 0000:0b:00.0: 0:(0):0713 SCSI layer issued Device Reset (3, 0) return x2002
Jun 16 12:02:59 linnux kernel: lpfc 0000:0b:00.0: 0:(0):0714 SCSI layer issued Bus Reset Data: x2002
Jun 16 12:03:20 linnux kernel: lpfc 0000:0b:00.0: 0:3172 SCSI layer issued Host Reset Data: x2002
Jun 16 12:03:20 linnux kernel: lpfc 0000:0b:00.0: 0:1303 Link Up Event x1 received Data: x1 xf7 x10 x9 x0 x0 0
Jun 16 12:03:40 linnux kernel: scsi 1:0:3:0: scsi: Device offlined - not ready after error recovery
Jun 16 12:03:40 linnux kernel: scsi 1:0:3:0: timing out command, waited 22s
Jun 16 12:03:55 linnux kernel: lpfc 0000:0b:00.0: 0:(0):0713 SCSI layer issued Device Reset (4, 0) return x2002

But the problem persisted.

Finally, some parameters on the SAN switch were changed, and that solved the problem.
