6.2.1 HBA Aborting a Transfer
When the HBA detects an error that it cannot recover from, it may need to end the transfer on the SATA interface.
To do this, the HBA asserts SYNC Escape to stop the bad FIS, and when the device is quiescent, returns to idle. The SATA device should send a D2H Register FIS at this point, with the ERR bit set to indicate an error in the transfer.
When aborting a transfer, the HBA does not wait for the D2H Register FIS before proceeding with error recovery (such as setting interrupt status bits and generating interrupts). This is because the device may be hung and unable to generate the D2H Register FIS.
6.2.2 Software Error Recovery
When an interrupt is generated due to an error condition, software will attempt to recover. Fatal errors (signified by the setting of PxIS.HBFS, PxIS.HBDS, PxIS.IFS, or PxIS.TFES) will cause the HBA to enter the ERR:Fatal state, and clear PxCMD.CR. In this state, the HBA shall not issue any new commands nor acknowledge DMA Setup FISes to process any native command queuing commands. To recover, the port must be restarted; the port is restarted by clearing PxCMD.ST to ‘0’ and then setting PxCMD.ST to ‘1’. For non-fatal errors (signified by the setting of PxIS.INFS or PxIS.OFS) the HBA continues to operate. If the transfer was aborted (see section 6.2.1), the device is expected to send a D2H Register FIS with PxTFD.STS.ERR set to ‘1’ and both PxTFD.STS.BSY and PxTFD.STS.DRQ cleared to ‘0’. Under this scenario, system software knows that the device is in a stable state and transfers may be restarted without issuing a COMRESET to the device.
For fatal errors, software must determine which commands were not processed and either re-issue them or notify higher level software that the command failed. The steps involved are listed in the following sections.
To detect an error that requires software recovery actions to be performed, software should check whether any of the following status bits are set on an interrupt: PxIS.HBFS, PxIS.HBDS, PxIS.IFS, and PxIS.TFES. If any of these bits are set, software should perform the appropriate error recovery actions based on whether non-queued commands were being issued or native command queuing commands were being issued.
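The check described above can be sketched as a small classification helper. This is a minimal illustration, not a definitive implementation: the bit positions are assumed from the AHCI PxIS register definition (they are not stated in this section and should be verified against the register map), and the names `classify_port_error` and `port_error_class` are hypothetical.

```c
#include <stdint.h>

/* PxIS bit positions: assumed from the AHCI port register definitions,
 * not given in this section. Verify before use. */
#define PXIS_TFES (UINT32_C(1) << 30) /* Task File Error Status       */
#define PXIS_HBFS (UINT32_C(1) << 29) /* Host Bus Fatal Error Status  */
#define PXIS_HBDS (UINT32_C(1) << 28) /* Host Bus Data Error Status   */
#define PXIS_IFS  (UINT32_C(1) << 27) /* Interface Fatal Error Status */
#define PXIS_INFS (UINT32_C(1) << 26) /* Interface Non-fatal Error    */
#define PXIS_OFS  (UINT32_C(1) << 24) /* Overflow Status              */

#define PXIS_FATAL_MASK    (PXIS_TFES | PXIS_HBFS | PXIS_HBDS | PXIS_IFS)
#define PXIS_NONFATAL_MASK (PXIS_INFS | PXIS_OFS)

/* Hypothetical result type for this sketch. */
enum port_error_class {
    PORT_ERR_NONE,     /* no error bit set                          */
    PORT_ERR_NONFATAL, /* HBA continues to operate                  */
    PORT_ERR_FATAL     /* port must be restarted (see 6.2.2.1/.2)   */
};

/* Classify a PxIS value read in the interrupt handler.
 * A fatal bit takes precedence over a non-fatal one. */
static enum port_error_class classify_port_error(uint32_t pxis)
{
    if (pxis & PXIS_FATAL_MASK)
        return PORT_ERR_FATAL;
    if (pxis & PXIS_NONFATAL_MASK)
        return PORT_ERR_NONFATAL;
    return PORT_ERR_NONE;
}
```

A handler would call `classify_port_error()` on the PxIS value it read and start the non-queued or native command queuing recovery flow only when the result is `PORT_ERR_FATAL`.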
6.2.2.1 Non-Queued Error Recovery
The flow for system software to recover from an error when non-queued commands are issued is as follows:
• Reads PxCI to see which commands are still outstanding
• Reads PxCMD.CCS to determine the slot that the HBA was processing when the error occurred
• Clears PxCMD.ST to ‘0’ to reset the PxCI register, waits for PxCMD.CR to clear to ‘0’
• Clears any error bits in PxSERR to enable capturing new errors
• Clears status bits in PxIS as appropriate
• If PxTFD.STS.BSY or PxTFD.STS.DRQ is set to ‘1’, issues a COMRESET to the device to put it in an idle state
• Sets PxCMD.ST to ‘1’ to enable issuing new commands
• Optionally issues a command to gather information about the error, for example READ LOG EXT, if software did not have to perform a reset (COMRESET or software reset) as part of the error recovery
Software then either reports the command that had the error, along with any commands still outstanding, to higher-level software as failed, or re-issues these commands to the device.
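The non-queued flow above can be sketched in C as follows. This is an illustration under stated assumptions, not a reference driver: `struct port_regs` is a hypothetical, simplified view of the port registers (real hardware uses a fixed memory-mapped layout), the bit positions are assumed from the AHCI register definitions, and `sim_comreset` is a stand-in for a real COMRESET sequence. The poll limit is also an assumption, since this section does not state a timeout.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions assumed from the AHCI register definitions. */
#define PXCMD_ST        (UINT32_C(1) << 0)
#define PXCMD_CCS_SHIFT 8
#define PXCMD_CCS_MASK  (UINT32_C(0x1F) << PXCMD_CCS_SHIFT)
#define PXCMD_CR        (UINT32_C(1) << 15)
#define PXTFD_STS_DRQ   (UINT32_C(1) << 3)
#define PXTFD_STS_BSY   (UINT32_C(1) << 7)

/* Hypothetical, simplified register view used only by this sketch. */
struct port_regs {
    volatile uint32_t is, cmd, tfd, serr, ci;
};

/* What software learned before the port was stopped. */
struct nq_recovery {
    uint32_t outstanding;  /* PxCI snapshot: slots still outstanding */
    uint32_t failed_slot;  /* PxCMD.CCS: slot being processed        */
    bool     did_comreset; /* true: skip the optional READ LOG EXT   */
};

typedef void (*comreset_fn)(struct port_regs *port);

/* Stand-in for a real COMRESET: in this simulation it only clears
 * BSY and DRQ so the flow can be exercised without hardware. */
static void sim_comreset(struct port_regs *port)
{
    port->tfd &= ~(PXTFD_STS_BSY | PXTFD_STS_DRQ);
}

/* Returns false if PxCMD.CR did not clear within max_polls reads. */
static bool recover_non_queued(struct port_regs *p, comreset_fn comreset,
                               uint32_t max_polls, struct nq_recovery *out)
{
    /* Snapshot first: clearing ST resets PxCI on real hardware. */
    out->outstanding  = p->ci;
    out->failed_slot  = (p->cmd & PXCMD_CCS_MASK) >> PXCMD_CCS_SHIFT;
    out->did_comreset = false;

    p->cmd &= ~PXCMD_ST;
    while (p->cmd & PXCMD_CR) {
        if (max_polls-- == 0)
            return false;
    }

    /* On real hardware these registers are write-1-to-clear, so writing
     * back the value read clears every bit that is currently set. The
     * plain struct used here does not model that side effect. */
    p->serr = p->serr;
    p->is   = p->is;

    if (p->tfd & (PXTFD_STS_BSY | PXTFD_STS_DRQ)) {
        comreset(p);
        out->did_comreset = true;
    }

    p->cmd |= PXCMD_ST;
    return true;
}
```

After a successful return, the caller can use `outstanding` and `failed_slot` to decide which commands to fail or re-issue, and `did_comreset` to decide whether issuing READ LOG EXT is still appropriate.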
6.2.2.2 Native Command Queuing Error Recovery
The flow for system software to recover from an error when native command queuing commands are issued is as follows:
• Reads PxSACT to see which commands have not yet completed
• Clears PxCMD.ST to ‘0’ to reset the PxCI and PxSACT registers, waits for PxCMD.CR to clear to ‘0’
• Clears any error bits in PxSERR to enable capturing new errors
• Clears status bits in PxIS as appropriate
• If PxTFD.STS.BSY or PxTFD.STS.DRQ is set to ‘1’, issues a COMRESET to the device to put it in an idle state
• Sets PxCMD.ST to ‘1’ to enable issuing new commands
• Issues READ LOG EXT to determine the cause of the error condition if software did not have to perform a reset (COMRESET or software reset) as part of the error recovery
Software then either reports the commands that did not finish to higher-level software as failed, or re-issues them to the device.
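The register sequence for the queued case mirrors the non-queued flow, so the part worth illustrating is the bookkeeping: PxSACT must be read before PxCMD.ST is cleared, because clearing ST resets it. The sketch below turns such a snapshot into the list of tags to fail or re-issue and records whether READ LOG EXT should follow. The names `ncq_plan` and `plan_ncq_recovery` are hypothetical, and the sketch assumes one PxSACT bit per tag, 32 tags at most.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical summary of what software must do after the port restarts. */
struct ncq_plan {
    uint8_t  tags[32];     /* tags that had not completed, ascending   */
    unsigned count;        /* number of valid entries in tags[]        */
    bool     read_log_ext; /* issue READ LOG EXT only if no reset done */
};

/* sact_snapshot: PxSACT value read BEFORE PxCMD.ST was cleared.
 * reset_performed: true if a COMRESET or software reset was needed. */
static struct ncq_plan plan_ncq_recovery(uint32_t sact_snapshot,
                                         bool reset_performed)
{
    struct ncq_plan plan = { .count = 0, .read_log_ext = !reset_performed };

    for (unsigned tag = 0; tag < 32; tag++) {
        if (sact_snapshot & (UINT32_C(1) << tag))
            plan.tags[plan.count++] = (uint8_t)tag;
    }
    return plan;
}
```

For example, a snapshot with bits 0, 2, and 31 set yields three tags, which software either reports as failed or re-issues once PxCMD.ST is set again.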
6.2.2.3 Recovery of Unsolicited COMINIT
An unsolicited COMINIT is a COMINIT that is not received as a consequence of issuing a COMRESET to the device (refer to Serial ATA Revision 2.6). If the HBA receives an unsolicited COMINIT during normal operation, the HBA shall perform the following actions:
• Respond to the device with a COMRESET
• Halt execution until PxIS.PCS is cleared to ‘0’ by software
To detect this condition, software should check whether PxIS.PCS is set to ‘1’ on an interrupt. The HBA cannot guarantee that the device received a COMRESET, because a COMINIT may appear to the HBA to be solicited if it happens to occur close in time to an issued COMRESET. Therefore, when software detects that PxIS.PCS is set, software should first issue a COMRESET to ensure that the device receives one. Then software should perform the appropriate actions to clear PxIS.PCS to ‘0’. To recover, software should perform error recovery actions for a fatal error condition (including restarting the controller). Then software should perform a re-enumeration to check whether a new device has been inserted.
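The ordering of the software steps above matters, so it can help to encode it explicitly. The following sketch only models the sequence; it does not touch hardware. The PxIS.PCS bit position is assumed from the AHCI register definition, and the step names and `plan_cominit_recovery` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Bit position assumed from the AHCI PxIS definition. */
#define PXIS_PCS (UINT32_C(1) << 6) /* Port Connect Change Status */

/* Hypothetical step identifiers, listed in the order software runs them. */
enum cominit_step {
    STEP_ISSUE_COMRESET,       /* make sure the device really sees one  */
    STEP_CLEAR_PCS,            /* lets the HBA resume execution         */
    STEP_FATAL_ERROR_RECOVERY, /* same flow as for a fatal error        */
    STEP_REENUMERATE           /* a different device may be attached    */
};

#define COMINIT_STEP_COUNT 4

/* Fills steps[] and returns the number of steps to run. Returns 0 and
 * leaves steps[] untouched when PxIS.PCS is not set. */
static size_t plan_cominit_recovery(uint32_t pxis,
                                    enum cominit_step steps[COMINIT_STEP_COUNT])
{
    if ((pxis & PXIS_PCS) == 0)
        return 0;

    steps[0] = STEP_ISSUE_COMRESET;
    steps[1] = STEP_CLEAR_PCS;
    steps[2] = STEP_FATAL_ERROR_RECOVERY;
    steps[3] = STEP_REENUMERATE;
    return COMINIT_STEP_COUNT;
}
```

A driver would map each step to its own routine and run them in the returned order whenever an interrupt reports PxIS.PCS.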