
In document Manual del usuario PowerLite W16 (página 157-162)

The system inventory and maintenance service reports system changes with different degrees of severity. Use the reported alarms to monitor the overall health of the system.

In the following tables, the severity of the alarms is represented by one or more letters, as follows:

• C: Critical
• M: Major
• m: Minor
• W: Warning

A comma-separated list of letters is used when the alarm can be triggered with one of several severity levels.

Resource Alarms

| Alarm ID | Reason Text | Entity Instance ID | Severity | Proposed Repair Action |
|---|---|---|---|---|
| 100.101 | Platform CPU threshold exceeded; threshold x%, actual y%. | host=<hostname> | C, M, m | Monitor and if condition persists, contact next level of support. |
| 100.102 | VSwitch CPU threshold exceeded; threshold x%, actual y%. | host=<hostname> | C, M, m | Monitor and if condition persists, contact next level of support. |
| 100.103 | Memory threshold exceeded; threshold x%, actual y%. | host=<hostname> | C, M, m | Monitor and if condition persists, contact next level of support; may require additional memory on Host. |
| 100.104 | File System threshold exceeded; threshold x%, actual y%. | host=<hostname>.filesystem=<mount-dir> OR filesystem=<mount-dir> | C, M, m | Monitor and if condition persists, contact next level of support. |
| | | host=<hostname>.volumegroup=<volumegroup-name> | | Monitor and if condition persists, consider adding additional physical volumes to the volume group. |
| 100.105 | No access to remote VM volumes. | host=<hostname> | M | Check Management and Infrastructure Networks and Controller or Storage Nodes. |
| 100.106 | 'OAM' Port failed. | host=<hostname>.port=<port-name> | M | Check cabling and far-end port configuration and status on adjacent equipment. |
| 100.107 | 'OAM' Interface degraded. OR 'OAM' Interface failed. | host=<hostname>.interface=<if-name> | C, M | Check cabling and far-end port configuration and status on adjacent equipment. |
| 100.108 | 'MGMT' Port failed. | host=<hostname>.port=<port-name> | M | Check cabling and far-end port configuration and status on adjacent equipment. |
| 100.109 | 'MGMT' Interface degraded. OR 'MGMT' Interface failed. | host=<hostname>.interface=<if-name> | C, M | Check cabling and far-end port configuration and status on adjacent equipment. |
| 100.110 | 'INFRA' Port failed. | host=<hostname>.port=<port-name> | M | Check cabling and far-end port configuration and status on adjacent equipment. |
| 100.111 | 'INFRA' Interface degraded. OR 'INFRA' Interface failed. | host=<hostname>.interface=<if-name> | C, M | Check cabling and far-end port configuration and status on adjacent equipment. |
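The threshold alarms above (100.101 to 100.104) share one reason-text pattern: "<resource> threshold exceeded; threshold x%, actual y%". The following is a minimal Python sketch of how a monitoring script might extract those values. It assumes the reason text arrives as a plain string with numbers substituted for x and y; the function name `parse_threshold_reason` and the `ThresholdReading` type are hypothetical, and the exact text emitted by a given system may differ.

```python
import re
from typing import NamedTuple, Optional


class ThresholdReading(NamedTuple):
    """Values pulled out of a threshold alarm's reason text (illustrative type)."""
    resource: str
    threshold: float
    actual: float


# Assumed shape: "<resource> threshold exceeded; threshold x%, actual y%"
_THRESHOLD_PATTERN = re.compile(
    r"^(?P<resource>.+?) threshold exceeded;\s*"
    r"threshold (?P<threshold>\d+(?:\.\d+)?)%,\s*"
    r"actual (?P<actual>\d+(?:\.\d+)?)%"
)


def parse_threshold_reason(reason: str) -> Optional[ThresholdReading]:
    """Return the resource name and percentages, or None for other alarm texts."""
    match = _THRESHOLD_PATTERN.match(reason.strip())
    if match is None:
        return None
    return ThresholdReading(
        resource=match["resource"],
        threshold=float(match["threshold"]),
        actual=float(match["actual"]),
    )
```

A caller could, for example, compare `actual - threshold` across hosts to decide which alarm to look at first. Reason texts that do not follow the pattern, such as "'OAM' Port failed.", yield `None`.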

Maintenance Alarms

| Alarm ID | Reason Text | Entity Instance ID | Severity | Proposed Repair Action |
|---|---|---|---|---|
| 200.001 | Host was administratively locked to take it out-of-service. | host=<hostname> | W | Administratively unlock Host to bring it back in-service. |
| 200.004 | Host experienced a service-affecting failure. Resetting Host. | host=<hostname> | C | If problem consistently occurs after Host is reset, contact next level of support or lock and replace failing host. |
| 200.005 | Degrade: Host is experiencing an intermittent 'Management Network' communication failure that has exceeded its lower alarming threshold. Failure: Host is experiencing a persistent critical 'Management Network' communication failure. Resetting Host. | host=<hostname> | C, M | If problem consistently occurs after Host is reset, contact next level of support or lock and replace failing host. |
| 200.009 | Degrade: Host is experiencing an intermittent 'Infrastructure Network' communication failure that has exceeded its lower alarming threshold. Failure: Host is experiencing a persistent critical 'Infrastructure Network' communication failure. Resetting Host. | host=<hostname> | C, M | If problem consistently occurs after Host is reset, contact next level of support or lock and replace failing host. |
| 200.006 | One or more Critical:'<process list>' and/or Degraded:'<process list>' processes on Host have failed and cannot be recovered. | host=<hostname> | C, M | If problem consistently occurs after Host is reset, contact next level of support or lock and replace failing host. |
| 200.007 | Critical (with host degrade): Host is degraded due to a 'critical' out-of-tolerance reading from the '<sensorname>' sensor. Major (with host degrade): Host is degraded due to a 'major' out-of-tolerance reading from the '<sensorname>' sensor. Minor: Host is reporting a 'minor' out-of-tolerance reading from the '<sensorname>' sensor. | host=<hostname>.sensor=<sensorname> | C, M, m | If problem consistently occurs after Host is power cycled and/or reset, contact next level of support or lock and replace failing host. |
| 200.008 | 'ntpd' process has failed on Host. | host=<hostname> | m | 'ntpd' is a process that cannot be auto recovered. The Host must be re-enabled (locked and then unlocked) to clear this alarm. If the alarm persists, contact next level of support to investigate and recover. |
| 200.010 | Access to board management module has failed. | host=<hostname> | W | Check Host's board management configuration and connectivity. |
| 200.011 | Host encountered a critical configuration failure during initialization. Resetting Host. | host=<hostname> | C | If problem consistently occurs after Host is reset, contact next level of support or lock and replace failing host. |
| 200.012 | In-Service failure of host's controller function while compute services remain healthy. | host=<hostname> | M | Lock and then Unlock host to recover. Avoid using the 'Force Lock' action, as that will impact compute services running on this host. If the lock action fails, contact next level of support to investigate and recover. |
| 200.013 | In-Service failure of host's compute function on host with only available and healthy controller service. | host=<hostname> | M | Enable second controller as soon as possible and then optionally Lock and Unlock host to recover local compute services on this host. |
| 200.014 | The Hardware Monitor was unable to load, configure and monitor one or more hardware sensors. | host=<hostname> | m | Check Board Management Controller provisioning. Try reprovisioning the BMC. If problem persists, try power cycling the host and then the entire server including the BMC power. If problem persists, contact next level of support. |

Storage Alarms

| Alarm ID | Reason Text | Entity Instance ID | Severity | Proposed Repair Action |
|---|---|---|---|---|
| 800.001 | Storage Alarm Condition: 1 mons down, quorum 1,2 controller-1,storage-0 | cluster=<dist-fs-uuid> | C, M | If problem persists, contact next level of support. |
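The Entity Instance ID values in these tables are dot-separated key=value pairs, for example host=<hostname>.port=<port-name> or cluster=<dist-fs-uuid>. The sketch below shows one way to split such a string into a dictionary in Python. It is an illustration only: the function name `parse_entity_instance_id` is hypothetical, and it assumes that keys contain only letters and underscores and that no value itself contains a ".key=" sequence.

```python
import re
from typing import Dict

# Split only on dots that are immediately followed by "key=", so dots inside
# values (for example in a mount directory) are left alone. This boundary rule
# is an assumption about the format, not something the tables specify.
_SEGMENT_BOUNDARY = re.compile(r"\.(?=[A-Za-z_]+=)")


def parse_entity_instance_id(entity_id: str) -> Dict[str, str]:
    """Turn 'host=compute-0.port=eth0' into {'host': 'compute-0', 'port': 'eth0'}."""
    fields: Dict[str, str] = {}
    for segment in _SEGMENT_BOUNDARY.split(entity_id.strip()):
        key, separator, value = segment.partition("=")
        if not separator or not key:
            raise ValueError(f"malformed entity segment: {segment!r}")
        fields[key] = value
    return fields
```

Keeping the result as a dictionary makes it straightforward to group alarms by host or to look up the affected port or interface.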

Data Networking Alarms

| Alarm ID | Reason Text | Entity Instance ID | Severity | Proposed Repair Action |
|---|---|---|---|---|
| 300.001 | 'Data' Port failed. | host=<hostname>.port=<port-uuid> | M | Check cabling and far-end port configuration and status on adjacent equipment. |
| 300.002 | 'Data' Interface degraded. OR 'Data' Interface failed. | host=<hostname>.interface=<if-uuid> | M, C | Check cabling and far-end port configuration and status on adjacent equipment. |
| 300.003 | Networking Agent not responding. | host=<hostname>.agent=<agent-uuid> | M | If condition persists, attempt to clear issue by administratively locking and unlocking the Host. |
| 300.004 | No enabled compute host with connectivity to provider network. | host=<hostname>.providernet=<pnet- | | |

Controller HA Alarms

| Alarm ID | Reason Text | Entity Instance ID | Severity | Proposed Repair Action |
|---|---|---|---|---|
| 400.001 | Service group failure; <list of affected services>. OR Service group degraded; <list of affected services>. OR Service group warning; <list of affected services>. | service_domain=<domain_name>.service_group=<group_name>.host=<hostname> | C, M, m | Contact next level of support. |
| 400.002 | Service group loss of redundancy; expected <num> standby member<s> but only <num> standby member<s> available. OR Service group loss of redundancy; expected <num> active member<s> but no active members available. OR Service group loss of redundancy; expected <num> active member<s> but only <num> active member<s> available. | service_domain=<domain_name>.service_group=<group_name> | M | Bring a controller node back into service, otherwise contact next level of support. |
| 400.003 | License key has expired or is invalid; a valid license key is required for operation. OR Evaluation license key will expire on <date>; there are <num_days> days remaining in this evaluation. OR Evaluation license key will expire on <date>; there is only 1 day remaining in this evaluation. | host=<hostname> | C | Contact next level of support to obtain a new license key. |
| 400.004 | Service group software modification detected; <list of affected files>. | host=<hostname> | M | Contact next level of support. |
| 400.005 | Communication failure detected with peer over port <linux-ifname>. OR Communication failure detected with peer over port <linux-ifname> within the last 30 seconds. | host=<hostname>.network=<mgmt \| oam \| infra> | M | Check cabling and far-end port configuration and status on adjacent equipment. |
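Alarm 400.002 encodes how many standby or active members were expected and how many are actually available. As a rough illustration, the Python sketch below pulls those counts out of the reason text. It assumes that <num> is rendered as a decimal integer and that member<s> appears as either "member" or "members"; the function name `parse_redundancy_loss` is hypothetical, and real reason text may be formatted differently.

```python
import re
from typing import Optional, Tuple

# Matches both the "but only <num> ..." and the "but no ..." variants.
_REDUNDANCY_PATTERN = re.compile(
    r"Service group loss of redundancy; expected (?P<expected>\d+) "
    r"(?P<role>standby|active) members? but "
    r"(?:only (?P<available>\d+)|no) (?P=role) members? available"
)


def parse_redundancy_loss(reason: str) -> Optional[Tuple[str, int, int]]:
    """Return (role, expected, available), or None if the text does not match."""
    match = _REDUNDANCY_PATTERN.search(reason)
    if match is None:
        return None
    # The "no ... members available" variant carries no number, so treat it as 0.
    available = int(match["available"]) if match["available"] is not None else 0
    return match["role"], int(match["expected"]), available
```

For instance, a result of `("active", 1, 0)` would indicate that no active member is available, which is the case where bringing a controller node back into service matters most.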

Backup and Restore Alarms

| Alarm ID | Reason Text | Entity Instance ID | Severity | Proposed Repair Action |
|---|---|---|---|---|
| 210.001 | System Backup in progress. | host=controller | m | No action required. |

System Configuration

| Alarm ID | Reason Text | Entity Instance ID | Severity | Proposed Repair Action |
|---|---|---|---|---|
| 250.001 | <hostname> Configuration is out-of-date. | host=<hostname> | M | Administratively lock and unlock <hostname> to update config. |
| 250.01 | <hostname> Provisioning compute required. (This alarm only applies in the 2-Server/Combined load.) | host=<hostname> | M | Administratively lock and unlock <hostname> to update provisioning of Compute functionality. |

Software Management Alarms

| Alarm ID | Reason Text | Entity Instance ID | Severity | Proposed Repair Action |
|---|---|---|---|---|
| 900.001 | Patching operation in progress. | host=controller | m | Complete reboots of affected hosts. |
| 900.002 | Obsolete patch in system. | host=controller | W | Remove and delete obsolete |
| 900.003 | Patch host install failure. | host=<hostname> | M | Undo patching operation. |

Virtual Machine Instance Alarms

| Alarm ID | Reason Text | Entity Instance ID | Severity | Proposed Repair Action |
|---|---|---|---|---|
| 700.003 | Instance <instance-name> is failed | instance=<instance_uuid> | C | The system will attempt recovery; no repair action required |
| 700.007 | Instance <instance-name> is paused | instance=<instance_uuid> | C | Unpause the instance |
| 700.009 | Instance <instance-name> is suspended | instance=<instance_uuid> | C | Resume the instance |
| 700.012 | Instance <instance-name> is live migrating | instance=<instance_uuid> | W | Wait for live migration to complete; if problem persists contact next level of support |
| 700.013 | Instance <instance-name> is cold migrating | instance=<instance_uuid> | C | Wait for cold migration to complete; if problem persists contact next level of support |
| 700.014 | Instance <instance-name> has been cold-migrated | instance=<instance_uuid> | C | Confirm or revert cold-migrate of instance |
| 700.017 | Instance <instance-name> is evacuating | instance=<instance_uuid> | C | Wait for evacuate to complete; if problem persists contact next level of support |
| 700.020 | Instance <instance-name> is stopped | instance=<instance_uuid> | C | Start the instance |
| 700.021 | Instance <instance-name> is rebooting | instance=<instance_uuid> | C | Wait for reboot to complete; if problem persists contact next level of support |
| 700.022 | Instance <instance-name> is rebuilding | instance=<instance_uuid> | C | Wait for rebuild to complete; if problem persists contact next level of support |
| 700.023 | Instance <instance-name> is resizing | instance=<instance_uuid> | C | Wait for resize to complete; if problem persists contact next level of support |
| 700.024 | Instance <instance-name> | | | |
| 700.027 | Guest Heartbeat not established for instance <instance-name> | instance=<instance_uuid> | M | Verify that the instance is running the Guest-Client daemon, or disable Guest Heartbeat for the instance if no longer needed, otherwise contact next level of support |
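The Severity column throughout these tables uses the single-letter codes C, M, m, and W, sometimes as a comma-separated list such as "C, M, m". The following Python sketch shows one way to expand such a cell into severity names. The names `SEVERITY_NAMES` and `expand_severities` are hypothetical and chosen only for this illustration; note that the lookup has to be case-sensitive, because "M" (Major) and "m" (Minor) differ only in case.

```python
from typing import Dict, List

# Letter codes as defined in the severity legend of this document.
SEVERITY_NAMES: Dict[str, str] = {
    "C": "Critical",
    "M": "Major",
    "m": "Minor",
    "W": "Warning",
}


def expand_severities(cell: str) -> List[str]:
    """Expand a Severity cell such as 'C, M, m' into a list of severity names."""
    names: List[str] = []
    for letter in cell.split(","):
        letter = letter.strip()
        if letter not in SEVERITY_NAMES:
            raise ValueError(f"unknown severity code: {letter!r}")
        names.append(SEVERITY_NAMES[letter])
    return names
```

An empty or unrecognized cell raises `ValueError` rather than being silently ignored, so rows whose severity is missing are easy to spot.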
