[SCSI] Retry commands with UNIT_ATTENTION sense codes to fix ext3/ext4 I/O error

There's nastiness in the way we currently handle barriers (and
discards): they're effectively filesystem commands, but they get
processed as BLOCK_PC commands.  Unfortunately BLOCK_PC commands are
taken by SCSI to be SG_IO commands and the issuer expects to see and
handle any returned errors, however trivial.  This leads to a huge
problem, because the block layer doesn't expect this to happen and any
trivially retryable error on a barrier causes an immediate I/O error
to the filesystem.

The only real way to hack around this is to take the usual class of
offending errors (unit attentions) and make them all retryable in the
case of a REQ_HARDBARRIER.  A correct fix would involve a rework of
the entire block and SCSI submit system, and so is out of scope for a
quick fix.
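
As a rough standalone illustration of the policy described above (not
the kernel code itself), the unit attention decision can be modelled as
below.  The enum, the function names and the boolean parameters are all
hypothetical simplifications; the real code reads this state from the
scsi_cmnd, its request and its device.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the SCSI error-handler dispositions. */
enum model_disposition { MODEL_SUCCESS, MODEL_NEEDS_RETRY, MODEL_FAILED };

/*
 * Model of the UNIT ATTENTION branch only.  All parameters are
 * assumed simplifications of state the kernel keeps elsewhere.
 */
static enum model_disposition
model_unit_attention(bool allow_restart, unsigned char asc,
		     unsigned char ascq, bool is_barrier)
{
	/* Pre-existing check, left unchanged by this patch */
	if (allow_restart && asc == 0x04 && ascq == 0x02)
		return MODEL_FAILED;

	/* A barrier has no issuer able to handle the UA, so retry it */
	if (is_barrier)
		return MODEL_NEEDS_RETRY;

	/* Anything else: pass the UA up to the completion functions */
	return MODEL_SUCCESS;
}

/* Usage: the three outcomes the model can produce */
static void model_self_check(void)
{
	assert(model_unit_attention(false, 0x29, 0x00, true) == MODEL_NEEDS_RETRY);
	assert(model_unit_attention(false, 0x29, 0x00, false) == MODEL_SUCCESS);
	assert(model_unit_attention(true, 0x04, 0x02, true) == MODEL_FAILED);
}
```

Only the barrier case changes behaviour; a non-barrier command that
hits a unit attention is still passed upwards exactly as before.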

Cc: Hannes Reinecke <hare@suse.de>
Cc: Stable Tree <stable@kernel.org>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
James Bottomley 2010-05-04 16:51:40 -04:00
parent c213e1407b
commit 77a4229719

@@ -302,7 +302,20 @@ static int scsi_check_sense(struct scsi_cmnd *scmd)
 		if (scmd->device->allow_restart &&
 		    (sshdr.asc == 0x04) && (sshdr.ascq == 0x02))
 			return FAILED;
-		return SUCCESS;
+
+		if (blk_barrier_rq(scmd->request))
+			/*
+			 * barrier requests should always retry on UA
+			 * otherwise block will get a spurious error
+			 */
+			return NEEDS_RETRY;
+		else
+			/*
+			 * for normal (non barrier) commands, pass the
+			 * UA upwards for a determination in the
+			 * completion functions
+			 */
+			return SUCCESS;
 
 		/* these three are not supported */
 	case COPY_ABORTED: