Where is the platform message log on a Sun Fire E25K?

sms-svc@systemcontroller-sc0:~ $ setfailover force
Forcing failover. Do you want to continue (yes/no)? yes
setfailover: Unable to force a failover: Internal error - refer to the platform message log using the following code: 8622

The platform message log on a Sun Fire E25K is the file $SMSVAR/adm/platform/messages on the System Controller.
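To find the entry that goes with the error code, a small helper along these lines may be enough. This is only a sketch: it assumes the code (8622 here) appears verbatim in the log line and that SMSVAR is set in the sms-svc environment. The function name find_platform_code is made up for illustration.

```shell
# find_platform_code CODE [LOGFILE]
# Prints every log line containing CODE, prefixed with its line number.
# LOGFILE defaults to the platform message log path mentioned above.
find_platform_code() {
    code=$1
    log=${2:-$SMSVAR/adm/platform/messages}
    grep -n "$code" "$log"
}
```

For example, find_platform_code 8622 searches the default log, and a second argument points it at a rotated copy of the file.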

How to perform a Sun SC Failover

Check whether any data synchronization is running between the System Controllers. The File Propagation State must be ACTIVE, Active File must be empty (shown as a dash), and Queued Files must be 0.

e20k-sc0:sms-svc> showdatasync
File Propagation State: ACTIVE
Active File:            -
Queued Files:           0
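The three conditions can be checked in one go. The following is a minimal sketch, not part of SMS: datasync_idle is a hypothetical helper that reads showdatasync output on standard input, and it assumes the three labels appear exactly as in the sample above. It needs a POSIX awk (on Solaris, nawk or /usr/xpg4/bin/awk rather than the old /usr/bin/awk).

```shell
# datasync_idle: reads `showdatasync` output on stdin.
# Returns 0 only if propagation is ACTIVE, no file is being copied,
# and the queue is empty; returns 1 otherwise.
datasync_idle() {
    awk -F':[ \t]*' '
        { sub(/[ \t\r]+$/, "") }              # trim trailing whitespace
        /^File Propagation State/ { state  = $2 }
        /^Active File/            { active = $2 }
        /^Queued Files/           { queued = $2 }
        END {
            # An idle "Active File" is shown as a dash in the sample output.
            idle = (active == "-" || active == "–" || active == "")
            exit !(state == "ACTIVE" && idle && queued == "0")
        }'
}
```

Usage would be something like: showdatasync | datasync_idle && echo "safe to continue"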

Every component on both System Controllers must report a healthy status (Good or Up); anything else can prevent the failover. SC Failover Status must also be ACTIVE.

e20k-sc0:sms-svc> showfailover -v
SC Failover Status:     ACTIVE
Status of Shared Memory:
HASRAM (CSB at CS0):     …………………………………Good
HASRAM (CSB at CS1):     …………………………………Good

Status of e20k-sc0:
Role:                    …………………………………MAIN
SMS Daemons:             …………………………………Good
System Clock:            …………………………………Good
Private I2 Network:      …………………………………Good
Private HASRAM Network:  …………………………………Good
Public Network:
Group "C1":      …………………………………..Up
eri0:              …………………………………..Up
eri3:              …………………………………..Up
Logical IP Addr. - C1:…………………………………..Up
System Memory:           ………………………………….4.9%
Disk Status:
/:                   ………………………………….3.6%
Console Bus Status:
EXB at EX0:          …………………………………Good
EXB at EX1:          …………………………………Good
EXB at EX2:          …………………………………Good
EXB at EX3:          …………………………………Good
EXB at EX4:          …………………………………Good
EXB at EX5:          …………………………………Good
EXB at EX6:          …………………………………Good
EXB at EX8:          …………………………………Good
EXB at EX9:          …………………………………Good
EXB at EX10:         …………………………………Good
EXB at EX11:         …………………………………Good
EXB at EX12:         …………………………………Good
EXB at EX13:         …………………………………Good
EXB at EX14:         …………………………………Good
EXB at EX15:         …………………………………Good
EXB at EX16:         …………………………………Good
EXB at EX17:         …………………………………Good

Status of e20k-sc1:
Role:                    ………………………………..SPARE
SMS Daemons:             …………………………………Good
System Clock:            …………………………………Good
Private I2 Network:      …………………………………Good
Private HASRAM Network:  …………………………………Good
Public Network:
Group "C1":      …………………………………..Up
eri0:              …………………………………..Up
eri3:              …………………………………..Up
Logical IP Addr. - C1:……………………………..Inactive
System Memory:           ………………………………….4.6%
Disk Status:
/:                   ………………………………….3.5%
Console Bus Status:
EXB at EX0:          …………………………………Good
EXB at EX1:          …………………………………Good
EXB at EX2:          …………………………………Good
EXB at EX3:          …………………………………Good
EXB at EX4:          …………………………………Good
EXB at EX5:          …………………………………Good
EXB at EX6:          …………………………………Good
EXB at EX8:          …………………………………Good
EXB at EX9:          …………………………………Good
EXB at EX10:         …………………………………Good
EXB at EX11:         …………………………………Good
EXB at EX12:         …………………………………Good
EXB at EX13:         …………………………………Good
EXB at EX14:         …………………………………Good
EXB at EX15:         …………………………………Good
EXB at EX16:         …………………………………Good
EXB at EX17:         …………………………………Good
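With this many lines it is easy to miss a bad one, so a filter can help. The sketch below is an assumption-heavy convenience, not an SMS feature: failover_problems is a hypothetical name, and the set of values it treats as healthy (Good, Up, Inactive, MAIN, SPARE, ACTIVE, and usage percentages) is taken only from the sample output above, so any other healthy value would be flagged too. It needs a POSIX awk (nawk or /usr/xpg4/bin/awk on Solaris).

```shell
# failover_problems: reads `showfailover -v` output on stdin and prints
# every status line whose value is not in the healthy set seen above.
failover_problems() {
    awk '
        { sub(/[ \t\r]+$/, "") }                       # trim trailing whitespace
        # Take the trailing word as the status; blank lines and section
        # headers such as "Disk Status:" have none and are skipped.
        !match($0, /[A-Za-z0-9][A-Za-z0-9.%]*$/) { next }
        {
            s = substr($0, RSTART)
            if (s ~ /^(Good|Up|Inactive|MAIN|SPARE|ACTIVE)$/) next
            if (s ~ /^[0-9.]+%$/) next                 # memory and disk usage
            print                                      # anything else is suspect
        }'
}
```

Running showfailover -v | failover_problems should print nothing when both System Controllers look like the listing above.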

Run setfailover force to perform the failover. If the spare clock input on some boards is bad, the command warns that the failover is likely to cause the affected domains to domain stop (Dstop), which brings those domains down.

e20k-sc0:sms-svc> setfailover force
Forcing failover. Do you want to continue (yes/no)? yes
The spare clock input on some boards might be bad. Forcing a failover now is likely to cause the affected domains to domain stop (Dstop).
Do you want to continue (yes/no)? yes

Check the role of the System Controller after the failover:

e20k-sc0:sms-svc> showfailover -r
SPARE

When you fail over manually, automatic failover is disabled:

e20k-sc0:sms-svc> showfailover -v | grep "SC Failover Status"
SC Failover Status:     DISABLED

Re-enable automatic failover manually to return to the previous state:

e20k-sc0:sms-svc> setfailover on
e20k-sc0:sms-svc> showfailover -v | grep "SC Failover Status"
SC Failover Status:     ACTIVATING

e20k-sc1:sms-svc> showfailover -v | grep "SC Failover Status"
SC Failover Status:     ACTIVE
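Since the status passes through ACTIVATING before it reaches ACTIVE, it can be convenient to poll instead of re-running the command by hand. This is a rough sketch under a few assumptions: both function names are made up, the default of 30 tries at 10-second intervals is arbitrary, and the shell must support $(( )) arithmetic (ksh or bash, not the old Bourne /bin/sh). It needs a POSIX awk (nawk or /usr/xpg4/bin/awk on Solaris).

```shell
# failover_status: reads `showfailover -v` output on stdin and prints the
# value of the "SC Failover Status" line (for example ACTIVATING or ACTIVE).
failover_status() {
    awk -F':[ \t]*' '/^SC Failover Status/ { sub(/[ \t\r]+$/, ""); print $2; exit }'
}

# wait_failover_active [TRIES] [INTERVAL_SECONDS]
# Polls showfailover until the status is ACTIVE.
# Returns 0 on success, 1 if the status never became ACTIVE.
wait_failover_active() {
    tries=${1:-30}
    interval=${2:-10}
    while [ "$tries" -gt 0 ]; do
        if [ "$(showfailover -v | failover_status)" = "ACTIVE" ]; then
            return 0
        fi
        tries=$((tries - 1))
        sleep "$interval"
    done
    return 1
}
```

After setfailover on, something like wait_failover_active && echo "failover is active again" would report when automatic failover is back.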

Sun System Management Services – Error: Exclusive session is in use, disconnecting.

You need to connect to a System Controller (SC) to manage a domain on a Sun Fire E12K, E15K, E20K or E25K. The SC runs the System Management Services (SMS) software, which lets you control the domains.

To view a domain console, run /opt/SUNWSMS/bin/console -d <domain letter or domain name>. When I tried it, I got the error message “Exclusive session is in use, disconnecting”.

sms-svc@sc0:/ $ console -d domain06
Trying to connect...
Connected to Domain Server.

Exclusive session is in use, disconnecting.

I asked a friend why I was having this problem, and he said that I needed to kill the other session.

sms-svc@sc0:/ $ w
9:24am  up 76 day(s),  3:53,  2 users,  load average: 0.91, 2.04, 1.52
User     tty           login@  idle   JCPU   PCPU  what
root     pts/1         9:08am     9                console -d domain06
root     pts/2         9:19am                      w

I found the PID of the process running the console, then killed it.

sms-svc@sc0:/ $ ps -ef | grep consol
root  3127     1  0   Sep 19 console  0:00 /usr/lib/saf/ttymon -g -h -p sc0-01-vix console login:  -T sun -d /dev/console
sms-svc 11229 11182  0 09:24:18 pts/2    0:00 grep consol
sms-svc  5492 5398  0 09:09:23 pts/1    0:00 console -d domain06

sms-svc@sc0:/ $ kill 5492
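The ps and kill steps above can be narrowed down so that the ttymon and grep lines never get in the way. The following is only a sketch: console_pids is a hypothetical helper, it assumes the session was started as console -d followed by the same domain name you pass in, and it needs an awk that supports -v (nawk or /usr/xpg4/bin/awk on Solaris).

```shell
# console_pids DOMAIN: reads `ps -ef` output on stdin and prints the PID
# (second column) of each SMS console session attached to DOMAIN.
console_pids() {
    awk -v d="$1" '
        # Match "console -d DOMAIN" as a whole command, with or without
        # a leading path such as /opt/SUNWSMS/bin/.
        BEGIN { pat = "(^|[ /])console -d " d "( |$)" }
        $0 ~ pat { print $2 }'
}
```

For the listing above, ps -ef | console_pids domain06 prints 5492, which is the PID to pass to kill.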

It worked as expected:

sms-svc@sc0:/ $ /opt/SUNWSMS/bin/console -d domain06
Trying to connect...
Connected to Domain Server.
Your console is in exclusive mode now.