android_kernel_samsung_msm8976/fs/ocfs2/dlm
Junxiao Bi a093bceda6 ocfs2: dlm: fix recovery hung
commit ded2cf71419b9353060e633b59e446c42a6a2a09 upstream.

There is a race window in dlm_do_recovery() between dlm_remaster_locks()
and dlm_reset_recovery() when the recovery master has nearly finished the
recovery process for a dead node.  After the master sends the FINALIZE_RECO
message in dlm_remaster_locks(), another node may become the recovery
master for another dead node and send the BEGIN_RECO message to all
nodes, including the old master.  In the old master's handler for this
message, dlm_begin_reco_handler(), dlm->reco.dead_node and
dlm->reco.new_master are set to the second dead node and the new master.
Then, in dlm_reset_recovery(), these two variables are reset to their
default values.  As a result, the new recovery master cannot finish the
recovery process and hangs, and eventually the whole cluster hangs in
recovery.

old recovery master:                                 new recovery master:
dlm_remaster_locks()
                                                  become recovery master for
                                                  another dead node.
                                                  dlm_send_begin_reco_message()
dlm_begin_reco_handler()
{
 if (dlm->reco.state & DLM_RECO_STATE_FINALIZE) {
  return -EAGAIN;
 }
 dlm_set_reco_master(dlm, br->node_idx);
 dlm_set_reco_dead_node(dlm, br->dead_node);
}
dlm_reset_recovery()
{
 dlm_set_reco_dead_node(dlm, O2NM_INVALID_NODE_NUM);
 dlm_set_reco_master(dlm, O2NM_INVALID_NODE_NUM);
}
                                                  hangs in dlm_remaster_locks() while
                                                  requesting dlm lock info

Before sending the FINALIZE_RECO message, the recovery master should set
DLM_RECO_STATE_FINALIZE for itself and clear it after the recovery is done.
This closes the race window, as BEGIN_RECO messages will not be handled
until the DLM_RECO_STATE_FINALIZE flag is cleared.
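
The effect of the flag can be illustrated with a small user-space model.
This is only a sketch, not the kernel code: struct reco_ctx, replay(), the
node numbers and the constant values are hypothetical stand-ins, and the
retry after -EAGAIN is assumed to be done by the BEGIN_RECO sender.

```c
#include <errno.h>

/* Simplified stand-ins for this sketch, not the kernel's definitions. */
#define INVALID_NODE        255   /* plays the role of O2NM_INVALID_NODE_NUM */
#define RECO_STATE_FINALIZE 0x02  /* plays the role of DLM_RECO_STATE_FINALIZE */

struct reco_ctx {
	unsigned int state;
	int dead_node;
	int new_master;
};

/* Models dlm_begin_reco_handler(): refuse BEGIN_RECO while finalizing. */
static int begin_reco_handler(struct reco_ctx *r, int master, int dead)
{
	if (r->state & RECO_STATE_FINALIZE)
		return -EAGAIN;	/* the sender is assumed to retry later */
	r->new_master = master;
	r->dead_node = dead;
	return 0;
}

/* Models dlm_reset_recovery(): forget the finished recovery. */
static void reset_recovery(struct reco_ctx *r)
{
	r->dead_node = INVALID_NODE;
	r->new_master = INVALID_NODE;
}

/*
 * Replays the interleaving from the diagram on the old master (node 0,
 * which was recovering dead node 1) and returns the dead node it has
 * recorded at the end.  Node 2 wants to recover dead node 3.
 */
static int replay(int set_finalize_flag)
{
	struct reco_ctx r = { 0, 1, 0 };
	int ret;

	if (set_finalize_flag)
		r.state |= RECO_STATE_FINALIZE;	/* set before FINALIZE_RECO is sent */

	/* BEGIN_RECO from node 2 arrives before dlm_reset_recovery() runs. */
	ret = begin_reco_handler(&r, 2, 3);

	reset_recovery(&r);
	r.state &= ~RECO_STATE_FINALIZE;	/* cleared once recovery is done */

	if (ret == -EAGAIN)
		ret = begin_reco_handler(&r, 2, 3);	/* retried BEGIN_RECO */

	return r.dead_node;
}
```

In this model replay(0) ends with the invalid node number, i.e. the new
recovery has been wiped out, while replay(1) ends with dead node 3
recorded, because the early BEGIN_RECO was rejected and only applied
after the reset.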

A similar race may happen between the new recovery master and a normal
node that is in dlm_finalize_reco_handler(); fix it as well.

Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Reviewed-by: Srinivas Eeda <srinivas.eeda@oracle.com>
Reviewed-by: Wengang Wang <wen.gang.wang@oracle.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-05-06 07:55:32 -07:00
..
dlmapi.h ocfs2/trivial: Remove trailing whitespaces 2010-01-25 19:20:51 -08:00
dlmast.c ocfs2: trivial endianness misannotations 2012-05-29 23:28:34 -04:00
dlmcommon.h ocfs2: trivial endianness misannotations 2012-05-29 23:28:34 -04:00
dlmconvert.c ocfs2: Remove ENTRY from masklog. 2011-02-21 11:10:44 +08:00
dlmconvert.h
dlmdebug.c fs: add export.h to files using EXPORT_SYMBOL/THIS_MODULE macros 2011-10-31 19:30:31 -04:00
dlmdebug.h ocfs2/dlm: Cleanup dlmdebug.c 2010-12-22 18:34:44 -08:00
dlmdomain.c ocfs2: remove kfree() redundant null checks 2013-02-21 17:22:19 -08:00
dlmdomain.h
dlmlock.c fs/ocfs2/dlm/dlmlock.c: free kmem_cache_zalloc'd data using kmem_cache_free 2011-11-17 01:46:46 -08:00
dlmmaster.c ocfs2/dlm: use GFP_ATOMIC inside a spin_lock 2013-02-26 02:46:13 -05:00
dlmrecovery.c ocfs2: dlm: fix recovery hung 2014-05-06 07:55:32 -07:00
dlmthread.c ocfs2/dlm: Take inflight reference count for remotely mastered resources too 2011-07-24 10:29:54 -07:00
dlmunlock.c ocfs2: Remove ENTRY from masklog. 2011-02-21 11:10:44 +08:00
dlmver.c
dlmver.h
Makefile fs: change to new flag variable 2011-03-17 14:02:57 +01:00