Category: HPOM

HPOM – UXMON: File /var/log/messages age exceeds 1d threshold.

I removed the file /var/opt/OV/conf/OpC/act_mon.cfg, which contained the configuration for monitoring /var/log/messages:

################################################################################
#
# The intention of this script is to monitor the last modification time of a
# file, or to monitor its size. This is used to supervise other programs or
# scripts which have to write regularly to their logfile. If a program or a
# script doesn't modify "its" file, there is probably something wrong with this process.
#
# If the configured interval is exceeded for the file which is intended to be
# monitored, or if the size is above or below the configured limit (depending
# on whether the size threshold has a < or > modifier),
# a log-message is written
#
#
################################################################################

[LINUX]
/var/log/cron 2d WARNING 0000-2400 * TT_LINUX
/var/log/messages 1d warning 0000-2400 * TT_LINUX
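
To illustrate the kind of check this configuration describes, here is a minimal sketch that compares a file's modification time with a threshold such as 1d. It is not the UXMON script itself: the function names threshold_to_seconds and file_age_exceeds are hypothetical, only the d, h and m suffixes are handled, and GNU stat is assumed.

```shell
#!/bin/bash
# Hypothetical helper: convert a threshold such as "1d", "12h" or "30m" to seconds.
threshold_to_seconds() {
    local value=${1%[dhm]}      # numeric part, e.g. "1"
    local unit=${1#"$value"}    # suffix, e.g. "d"
    case $unit in
        d) echo $((value * 86400)) ;;
        h) echo $((value * 3600)) ;;
        m) echo $((value * 60)) ;;
        *) echo "unsupported threshold: $1" >&2; return 1 ;;
    esac
}

# Hypothetical helper: succeed if FILE was last modified more than THRESHOLD ago.
file_age_exceeds() {
    local file=$1 limit now mtime
    limit=$(threshold_to_seconds "$2") || return 2
    now=$(date +%s)
    mtime=$(stat -c %Y "$file") || return 2   # GNU stat, as shipped with Linux
    [ $((now - mtime)) -gt "$limit" ]
}
```

For example, `file_age_exceeds /var/log/messages 1d && echo "age exceeds 1d threshold"` prints the message only when the file has not been written to for more than a day.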

HPOM certificate request: terminate called after throwing an instance of ‘char const*’

Triggering the certificate request failed with the following error:

root@linux:~ # ovcert -certreq
terminate called after throwing an instance of 'char const*'
Aborted

Stop the HPOM agent and remove the files from the directories shown below. Then check whether any HPOM agent process is still running and kill it if so.

root@linux:~ # /opt/OV/bin/ovc -kill
root@linux:~ # /opt/OV/bin/opcagt -kill
(ctrl-111) Ovcd is not yet started.

root@linux:~ # rm /var/opt/OV/tmp/OpC/*
rm: cannot remove `/var/opt/OV/tmp/OpC/bin': Is a directory
rm: cannot remove `/var/opt/OV/tmp/OpC/conf': Is a directory
root@linux:~ # rm /var/opt/OV/tmp/public/OpC/*
rm: cannot remove `/var/opt/OV/tmp/public/OpC/*': No such file or directory
root@linux:~ # rm /var/opt/OV/tmp/*.pid
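
The rm commands above complain about subdirectories and empty directories. As an alternative, the following sketch deletes only the regular files directly inside a directory. The function name remove_plain_files is hypothetical and GNU find is assumed. Unlike `rm dir/*`, it also removes hidden files and leaves symbolic links in place.

```shell
#!/bin/bash
# Hypothetical helper: delete the regular files directly inside DIR,
# leaving subdirectories (and their contents) untouched.
remove_plain_files() {
    local dir=$1
    [ -d "$dir" ] || return 0                  # nothing to do if DIR is missing
    find "$dir" -maxdepth 1 -type f -delete    # GNU find options
}
```

It could be called as `remove_plain_files /var/opt/OV/tmp/OpC` and `remove_plain_files /var/opt/OV/tmp/public/OpC` once the agent is stopped.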

root@linux:~ # ps -ef |grep -i opc
root@linux:~ # ps -ef |grep -i ov

Wait 3 minutes, then start the agent:

root@linux:~ # sleep 180

root@linux:~ # /opt/OV/bin/opcagt -start
root@linux:~ # /opt/OV/bin/ovc -start -debug

root@linux:~ # /opt/OV/bin/opcagt -status
scopeux Perf Agent data collector (23366) Running
midaemon Measurement Interface daemon (23372) Running
ttd ARM registration daemon (23354) Running
perfalarm Alarm generator (23421) Running
coda OV Performance Core COREXT (23414) Running
opcacta OVO Action Agent AGENT,EA (23565) Running
opcmsga OVO Message Agent AGENT,EA (23542) Running
ovbbccb OV Communication Broker CORE (23395) Running
ovcd OV Control CORE (23387) Running
ovconfd OV Config and Deploy COREXT (23507) Running
Message Agent is not buffering.

root@linux:~ # /opt/OV/bin/ovc -status
coda OV Performance Core COREXT (23414) Running
opcacta OVO Action Agent AGENT,EA (23565) Running
opcmsga OVO Message Agent AGENT,EA (23542) Running
ovbbccb OV Communication Broker CORE (23395) Running
ovcd OV Control CORE (23387) Running
ovconfd OV Config and Deploy COREXT (23507) Running

root@linux:~ # /opt/perf/bin/ovpa start

The Perf Agent scope collector is being started.
The ARM registration daemon ttd is already running.
It will be signaled to reprocess its configuration file.

The Performance Collector daemon
/opt/perf/bin/scopeux, is already running.

The coda daemon /opt/OV/lbin/perf/coda is already running.
The alarm generator /opt/perf/bin/perfalarm is already running.
It is signaled to reprocess its alarm definitions.

root@linux:~ # /opt/perf/bin/ovpa status
Perf Agent status:
Running scopeux (Perf Agent data collector) pid 23366
Running midaemon (Measurement Interface daemon) pid 23372
Running ttd (ARM registration daemon) pid 23354

Perf Agent Server status:

Running ovcd (OV control component) pid 23387
Running ovbbccb (BBC5 communication broker) pid 23395
Running coda (perf component) pid(s) 23414
Running perfalarm (alarm generator) pid(s) 23421

root@linux:~ # ovc -status
coda OV Performance Core COREXT (23414) Running
opcacta OVO Action Agent AGENT,EA (23565) Running
opcmsga OVO Message Agent AGENT,EA (23542) Running
ovbbccb OV Communication Broker CORE (23395) Running
ovcd OV Control CORE (23387) Running
ovconfd OV Config and Deploy COREXT (23507) Running
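
Instead of reading the status listing by eye, a small filter can report any component that is not in the Running state. This is only a sketch: the function name not_running is hypothetical, and it assumes the state is the last field of each line, as in the ovc -status output shown above.

```shell
#!/bin/bash
# Hypothetical helper: read "ovc -status" style output on stdin, print the name
# of every component whose last field is not "Running", and return non-zero
# if at least one such component was found.
not_running() {
    awk 'NF && $NF != "Running" { print $1; bad = 1 } END { exit bad }'
}
```

Usage would look like `/opt/OV/bin/ovc -status | not_running && echo "all components running"`.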

root@linux:~ # ovcert -certreq
INFO: Certificate request has been successfully triggered.

HPOM: Flood Gate has Detected a Storm for Application

Contact Support Team or Technical Lead. Flood Gate has Detected a Storm for Application (CLONED_LINUX.SETAOFFICE.COM^:^ntpmon^:^) Logfile: None Annotations: No

Node : cloned_linux.setaoffice.com
Node Type : Intel/AMD x86(HTTPS)
Severity : major
OM Server Time: 2014-03-21 10:02:21
Message : Contact Support Team or Technical Lead. Flood Gate has Detected a Storm for Application (CLONED_LINUX.SETAOFFICE.COM^:^ntpmon^:^) Logfile: None Annotations: No
Msg Group : OS
Application : esf
Object : Event Storm
Event Type : not_found
Instance Name : not_found

Instruction : No

In my case this error appeared because the server was a clone and probably still had the same certificate as the machine it was cloned from.

I reinstalled HPOM.

HP OpenView Error: ovomanagementserver.hp.com: (bbc-289) status=eSSLError time= (above 10000 ms)

The HTTPS Operations Agent has been installed on a remote node and its certificates have been granted correctly. Checking encrypted connectivity with the bbcutil command produces the following error, whether it is run on the agent or on the server:

root@linux:~ # /opt/OV/bin/bbcutil -ping ovomanagementserver.hp.com

ovomanagementserver.hp.com: (bbc-289) status=eSSLError time=19609 ms

root@linux:~ # /opt/OV/bin/bbcutil -ping http://ovomanagementserver.hp.com

http://ovomanagementserver.hp.com: status=eServiceOK
coreID=52def78a-c60c-7546-0044-8256d848046c
bbcV=06.21.501 appN=ovbbccb appV=06.21.501
conn=1521 time=86 ms

Cause
Because the non-encrypted communication succeeds, the problem is confined to the SSL layer. The bbcutil output shows that contacting the node took 19609 ms, which exceeds the default SSL handshake timeout of 10 seconds (10000 ms).
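
The comparison described here can be sketched in shell. The function names ping_time_ms and exceeds_handshake_timeout are hypothetical, and the parsing assumes the `time=<n> ms` format seen in the bbcutil output above.

```shell
#!/bin/bash
# Hypothetical helper: extract the millisecond value from a line such as
#   ovomanagementserver.hp.com: (bbc-289) status=eSSLError time=19609 ms
ping_time_ms() {
    sed -n 's/.*time=\([0-9][0-9]*\) ms.*/\1/p' | head -n 1
}

# Hypothetical helper: succeed if the time read from stdin is above the
# timeout in milliseconds (default 10000, the 10 seconds named in the text).
exceeds_handshake_timeout() {
    local timeout_ms=${1:-10000} t
    t=$(ping_time_ms)
    [ -n "$t" ] && [ "$t" -gt "$timeout_ms" ]
}
```

For example, `/opt/OV/bin/bbcutil -ping ovomanagementserver.hp.com | exceeds_handshake_timeout` would succeed for the 19609 ms result shown earlier and fail for the 86 ms result of the non-encrypted ping.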

Fix
Increasing the SSL handshake timeout resolves the problem. Run the following command on both sides, server and agent, to raise the value to 5 minutes (300000 ms):

# ovconfchg -ns bbc.http -set SSL_HANDSHAKE_TIMEOUT 300000
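
The value is given in milliseconds, so 5 minutes becomes 5 × 60 × 1000 = 300000. A small sketch for building the same command with a different timeout follows. The function names minutes_to_ms and timeout_command are hypothetical, and the command is only printed, not executed.

```shell
#!/bin/bash
# Hypothetical helper: convert whole minutes to milliseconds.
minutes_to_ms() {
    echo $(( $1 * 60 * 1000 ))
}

# Hypothetical helper: print (not run) the ovconfchg command from the text
# for a timeout given in minutes.
timeout_command() {
    echo "ovconfchg -ns bbc.http -set SSL_HANDSHAKE_TIMEOUT $(minutes_to_ms "$1")"
}

timeout_command 5   # prints: ovconfchg -ns bbc.http -set SSL_HANDSHAKE_TIMEOUT 300000
```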

Source: http://h30499.www3.hp.com/hpeb/attachments/hpeb/itrc-162/119064/1/356770.pdf