This section describes some basic procedures for an “unbootable” SP on a CX Series or CX3 Series array. The same basic guidelines apply whether the array has been running normally, an SP has already been re-imaged in an attempt to bring it up, or a platform upgrade (conversion) is in progress.
Note: This section contains information written for an engineering/sustaining environment. This includes some information that may NOT be suitable for customer environments. The information should not be distributed to customers and is not intended to replace or supersede official product support documentation.
Private Space Reference
The following diagram shows the locations of the Flare Boot Partitions and Utility Partitions in private space. This diagram is not drawn to scale. For more details, see the following specifications:
[Diagram: private space layout across Disk 0, Disk 1, Disk 2, Disk 3 and Disk 4]
Flare Boot Partitions (in diagram order): SPA Primary, SPB Primary, SPA Secondary, SPB Secondary
Utility Partitions (in diagram order): SPB Primary, SPA Primary, SPB Secondary, SPA Secondary
Image Repository regions: two labeled "CX3 Series only", one labeled "all CX and CX3"
Diagram labels: "For CX200/400/600 and CX300/500/700 platforms" and "For CX3-20, CX3-40, CX3-80 platforms"
“SP Will Not Boot”
When an SP is “unmanaged” or “unbootable”, there are several possible root causes. The word “boot” has many meanings, so it is important to determine exactly where the boot process failed. See further down in this section for additional information about the boot process.
Note: Re-imaging does not solve all problems, and can leave the SP in a worse state than when you started. The current policy is that EMC Technical Support Level 2 or EMC Engineering or both should be contacted before any data-in-place re-imaging operations are performed on CX-series arrays.
First Steps To Try
Always start with Primus solution emc111000, which contains a link to troubleshooting trees for all CX arrays. Much of this section is derived from those trees; this section is intended only as a “quick reference”.
1. Ping the SP. Use “ping <ip address> -t” at a command prompt and allow it to run for several minutes in case the array is in a reboot loop. Make sure the SP network cable is connected correctly.
a. If the SP is pingable, the OS image for the SP is probably OK. Do NOT re-image. The SP may be degraded. Try to access the SP using EMC Remote.
i. If you can access the SP using EMC Remote, debug the SP as a Degraded SP.
ii. If you cannot access the SP using EMC Remote, you should always try to Force Degraded Mode.
b. If the SP is not pingable, try to establish a serial (PPP) connection to the SP. Note that if the SP was recently re-imaged, the IP address may not have been restored from the PSM after re-imaging if the SP booted in Degraded Mode.
2. [CX3 Series only] Check the system event log on the PEER SP for “peer boot logging” messages. Those logs may indicate a hardware problem. The peer SP will also log if the local SP is in Degraded Mode. Note that Flare must be running normally on the peer SP, or these messages will not be logged.
3. Check SP Fault light and Extended POST output. Check the amber SP Fault LED located on the air dam of the SP. If the SP is running normally, this LED is turned off. If the LED is off or on “solid”, re-imaging is unlikely to solve the problem. You should also collect Extended POST output available from HyperTerminal. This is critical information in some situations.
4. Try to get the SP into Degraded Mode (see Primus solution emc76039 for an important note about older Flare revs). If the SP can boot in Degraded Mode and becomes pingable, do not re-image the SP. Debug the SP as a Degraded SP.
5. Check the health of the PEER SP. What is the state of the peer SP? Is it running normally, in Degraded Mode, or merely pingable? If SPCollects from the peer SP are available, check for disk and/or backend loop issues that could be affecting the local SP. Re-imaging one SP while the peer is also having problems is NOT recommended, since this could make the situation worse!
a. For example, if the peer SP is "degraded", the PSM may have a problem and be inaccessible, which could cause a re-image of an SP to leave the SP in an unmanaged state, because the re-imaged SP would be unable to access its IP address and other configuration information stored in the PSM.
6. Look for a hardware problem. If the SP cannot boot even in Degraded Mode, and the peer SP is running normally, there is either a hardware problem or re-imaging is needed. You must try to determine if a hardware problem (SP/disk/cable/enclosure) exists before trying to re-image. Check that backend zero is connected properly and there are no fault lights on the boot drives. Note that SPA boots from drives 0 and 2. SPB boots from drives 1 and 3. Try to isolate the problem by reducing the configuration (fewer enclosures and drives).
7. If the SP cannot boot in Degraded Mode, and no hardware-related problems are found, then re-imaging is the only option left. Only do so under the direction of EMC Technical Support Level 2 or EMC Engineering.
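The order of the checks above can be sketched as a small decision function. This is an illustrative sketch only, not an EMC tool: the SpObservations record, its field names, and the returned strings are hypothetical, and the function simply encodes the "do not re-image unless everything else is ruled out" ordering of steps 1 through 7.

```python
from dataclasses import dataclass


@dataclass
class SpObservations:
    """Hypothetical record of what has been observed so far (names are assumptions)."""
    pingable: bool
    emc_remote_ok: bool = False          # could the SP be reached with EMC Remote?
    degraded_mode_boot_ok: bool = False  # did the SP boot in Degraded Mode and become pingable?
    peer_healthy: bool = True            # is the peer SP running normally?
    hardware_fault_found: bool = False   # SP/disk/cable/enclosure problem identified?


def next_action(obs: SpObservations) -> str:
    """Return the next step suggested by the quick-reference list above."""
    if obs.pingable:
        # Step 1a: a pingable SP probably has a good OS image.
        if obs.emc_remote_ok:
            return "Do not re-image; debug the SP as a Degraded SP"
        return "Do not re-image; try to Force Degraded Mode"
    if obs.degraded_mode_boot_ok:
        # Step 4: boots in Degraded Mode, so debug rather than re-image.
        return "Do not re-image; debug the SP as a Degraded SP"
    if not obs.peer_healthy:
        # Step 5: re-imaging while the peer has problems is not recommended.
        return "Re-imaging not recommended; investigate peer SP, PSM and backend first"
    if obs.hardware_fault_found:
        # Step 6: rule out hardware before any re-image.
        return "Resolve the hardware problem before considering a re-image"
    # Step 7: nothing else left.
    return "Escalate; re-image only under direction of EMC Technical Support Level 2 or EMC Engineering"


if __name__ == "__main__":
    print(next_action(SpObservations(pingable=False, peer_healthy=False)))
```

Re-imaging is the last branch on purpose: every earlier observation that can explain the failure returns before it is reached.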
CX Boot Failure Modes
Note: Primus solution emc111000 contains the following matrix, which is suitable for the field and should be consulted during escalations. Primus solutions emc76039 and emc66446 also contain some tips in this area.
Additional Notes:
Both SPs failing - If both SPs of an array are failing, the cause is most likely not bad SP hardware but rather a software (database) problem or a backend issue. Do not swap SP hardware or re-image. The SPs are most likely in a hung or degraded state (depending on which other symptoms most closely match).
Network Connectivity - Note that the ability to ping an SP assumes that an IP address has been configured for that SP and that the SP has not been re-imaged (after re-imaging, a failing SP may need its IP address set again). If an IP address has not been set, the SP will not be pingable and will report “unmanaged” regardless of its boot state. In addition, a bad network cable or network switch may impact network connectivity. If all other indications are that the SP is healthy, it may be a cable or switch issue.
Establishing a PPP connection to an SP
LAN Service Port (CX3 Series only)
If an SP is not pingable, but you suspect that the SP is actually running (example: the SP Fault LED is off, or FCLI on the peer SP shows the local SP as PRESENT), then there could be a problem on the customer’s network. You may want to use the LAN Service Port to help confirm this. The LAN Service Port creates a direct-connect "virtual LAN" interface between a Service laptop and an SP.
1. Use a regular IP cable - a special cross-over cable is not needed.
2. To access an SP, you need to connect to the LAN Service Port on its peer SP. Details below.
3. Configure the Service laptop as follows:
IP address = 128.221.1.249 or 128.221.1.254
Subnet Mask = 255.255.255.248
Gateway = None (use blank spaces; the LAN Service Port is direct connection only)
IMPORTANT: Do not connect the Service laptop to the Customer LAN while it is configured this way.
4. To access SPA, connect IP cable from Service laptop to LAN Service Port on SPB.
SPA's Service Port IP address will be 128.221.1.250
5. To access SPB, connect IP cable from Service laptop to LAN Service Port on SPA.
SPB's Service Port IP Address will be 128.221.1.251
IMPORTANT: Never connect the Corporate LAN to either LAN Service Port.
6. Attempt to ping the appropriate Service Port IP address as listed in Steps 4 and 5. If successful replies are received, the SP should be considered "pingable", and a network issue with the Customer LAN or the SP should be suspected.
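As a sanity check on the addressing above, the following sketch uses Python's standard ipaddress module to confirm that the laptop addresses and both Service Port addresses fall inside the same small subnet defined by the 255.255.255.248 mask. The constant and function names are illustrative assumptions; only the addresses, the mask, and the cabling rule come from the steps above.

```python
import ipaddress

SERVICE_MASK = "255.255.255.248"
LAPTOP_ADDRESSES = ["128.221.1.249", "128.221.1.254"]
# Service Port address of each SP, as given in the steps above.
SERVICE_PORT_ADDRESSES = {"SPA": "128.221.1.250", "SPB": "128.221.1.251"}
# To reach an SP, the cable goes into the LAN Service Port on its peer.
CABLE_INTO = {"SPA": "SPB", "SPB": "SPA"}


def service_network(address: str, mask: str = SERVICE_MASK) -> ipaddress.IPv4Network:
    """Return the network that an address/mask pair belongs to."""
    return ipaddress.ip_interface(f"{address}/{mask}").network


network = service_network(LAPTOP_ADDRESSES[0])

if __name__ == "__main__":
    print("Service network:", network)
    for addr in LAPTOP_ADDRESSES:
        print("laptop", addr, "in network:", ipaddress.ip_address(addr) in network)
    for sp, addr in SERVICE_PORT_ADDRESSES.items():
        in_net = ipaddress.ip_address(addr) in network
        print(f"{sp}: ping {addr} with the cable in the {CABLE_INTO[sp]} service port; in network: {in_net}")
```

Because the mask leaves only six usable host addresses, a laptop configured with any other address or mask will not see the Service Port without a gateway, which the procedure deliberately leaves blank.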
EMC Remote password changes in R24 (and beyond)
1. Customers can use Navisphere to change their EMC Remote username/password.
2. If an SP is not accessible using EMC Remote, make sure you are using the correct username/password.
3. When connecting using an IP address, the only valid default username/password for EMC Remote is clariion1992/clariion1992. The clariion/clariion! username/password no longer works for EMC Remote when connecting using an IP address. Note that clariion1992/clariion1992 also worked on arrays running pre-R24 code.
4. If you establish a PPP session to an SP over the serial port, the username/password of clariion/clariion! should always work.
5. You can still use clariion/clariion! or clariion1992/clariion1992 as the Windows logon password.
SP Fault LED Blink Rates
The SP Fault LED is an amber LED located on the air dam of the SP. During a normal boot of the Flare Partition, the following blink rates will be seen:
Blink Rate Interpretation
¼ Hz Power up and BIOS Initialization Phase
½ Hz Extended POST Testing Phase
4 Hz Operating System Boot Phase – Windows may or may not have started. Flare/NDUMON have not fully started, so the SP is not ready to handle Flare IO. It may be possible to connect to the SP via PPP (and EMCRemote) since the SP may be in degraded mode.
Off Boot success, ready for Flare IO – This is the normal case after booting. Flare and NDUMON have started. SP should be accessible via ping, Dial-Up Networking, or EMCRemote.
The following blink rates are for special cases:
Blink Rate Interpretation
2 Hz NMI button pressed (CX3 Series only)
1 – 3 – 3 – 1 Bad DIMM detected.
On (solid) Flare has turned on the fault light due to a hardware issue.
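The two blink-rate tables can be captured as a simple lookup, which is handy when annotating field notes. This is a minimal sketch: the dictionary keys, the normalization rules, and the function names are assumptions made for illustration; the interpretations are shortened from the tables above, and the re-imaging hint reflects the earlier statement that an LED that is off or on solid makes re-imaging unlikely to help.

```python
# Interpretations shortened from the tables above; keys are normalized spellings.
BLINK_RATES = {
    "1/4hz": "Power up and BIOS Initialization Phase",
    "1/2hz": "Extended POST Testing Phase",
    "4hz": "Operating System Boot Phase",
    "off": "Boot success, ready for Flare IO",
    "2hz": "NMI button pressed (CX3 Series only)",
    "1-3-3-1": "Bad DIMM detected",
    "on": "Flare has turned on the fault light due to a hardware issue",
}


def normalize(rate: str) -> str:
    """Map spellings such as '¼ Hz' or 'On (solid)' onto the dictionary keys."""
    text = rate.strip().lower().replace("¼", "1/4").replace("½", "1/2")
    text = text.replace("(solid)", "").replace("–", "-").replace(" ", "")
    return text


def interpret(rate: str) -> str:
    """Return the interpretation for an observed SP Fault LED state."""
    return BLINK_RATES.get(normalize(rate), "Unknown blink rate")


def reimage_unlikely_to_help(rate: str) -> bool:
    """Off or on solid: re-imaging is unlikely to solve the problem."""
    return normalize(rate) in ("off", "on")


if __name__ == "__main__":
    for observed in ["¼ Hz", "4 Hz", "On (solid)", "1 – 3 – 3 – 1"]:
        print(observed, "->", interpret(observed))
```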
Summary of Boot Process
Here is a quick summary of the major steps in the boot process in each of the three boot modes:
Boot modes compared: Normal Case, Degraded Mode*, HFOff

Step 1: BIOS
   Output displayed on ...
Step 2: Extended POST
   “Alphabet string” displayed on HyperTerminal, followed by “INT ...
Step 3: OS Boot
   Normal Case: Reboot count incremented. SP becomes pingable.
   Degraded Mode*: Reboot count incremented. SP becomes pingable.
   HFOff: Reboot count incremented. SP becomes pingable.
Step 4: Flare starts
   Degraded Mode*: Flare does not start, drivers cannot be started manually.
   HFOff: Flare does not start, drivers can be started manually.
Step 5: EMC Remote agent starts
   Normal Case: SP accessible using EMC Remote.
   Degraded Mode*: SP accessible using EMC Remote.
   HFOff: SP accessible using EMC Remote.
Step 6: NduMon starts
   Normal Case: SP fault LED turns off, Reboot count cleared, Front-ends opened for IO.
   Degraded Mode*: SP fault LED continues to blink, Reboot count cleared.
   HFOff: Does not start (Reboot count is not cleared).
Step 7: Navisphere agent starts
   Normal Case: SP becomes manageable using Navi.
   Degraded Mode*: Does not start.
   HFOff: Does not start.
*Reboot count tripped, or “Force Degraded Mode” flag set in Extended POST. On CX3 Series, the system event log on the peer SP should contain an informational message from flaredrv - “SPx Status: (37) In Degraded Mode”.
**The SP Fault LED will continue to blink at different frequencies until it is turned off at the end of the “normal” boot process. See above for a description of the SP fault LED blink rates.
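To compare the boot modes quickly, steps 3 through 7 of the summary can be held as data and queried. This is a sketch under stated assumptions: steps 1 and 2 are left out because their text is incomplete in the summary, the Normal Case entry for "Flare starts" is None because the summary lists no separate text for it, and the names BOOT_TABLE, expected, and distinguishing_steps are hypothetical.

```python
MODES = ("Normal Case", "Degraded Mode", "HFOff")

# One tuple per step, ordered as in MODES. Dict order follows the boot sequence.
BOOT_TABLE = {
    "OS Boot": ("Reboot count incremented, SP becomes pingable",) * 3,
    "Flare starts": (
        None,  # no separate Normal Case text in the summary
        "Flare does not start, drivers cannot be started manually",
        "Flare does not start, drivers can be started manually",
    ),
    "EMC Remote agent starts": ("SP accessible using EMC Remote",) * 3,
    "NduMon starts": (
        "SP fault LED turns off, Reboot count cleared, Front-ends opened for IO",
        "SP fault LED continues to blink, Reboot count cleared",
        "Does not start (Reboot count is not cleared)",
    ),
    "Navisphere agent starts": (
        "SP becomes manageable using Navi",
        "Does not start",
        "Does not start",
    ),
}


def expected(step: str, mode: str):
    """Return what the summary lists for a step in a given boot mode."""
    return BOOT_TABLE[step][MODES.index(mode)]


def distinguishing_steps(mode_a: str, mode_b: str) -> list:
    """List the steps at which two boot modes behave differently."""
    a, b = MODES.index(mode_a), MODES.index(mode_b)
    return [step for step, cells in BOOT_TABLE.items() if cells[a] != cells[b]]


if __name__ == "__main__":
    print(distinguishing_steps("Degraded Mode", "HFOff"))
```

For Degraded Mode versus HFOff, the differences are confined to the Flare and NduMon steps: whether drivers can be started manually and whether the reboot count is cleared.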
CX200 / CX400 / CX600 POWERUP