namei: results of d_is_negative() should be checked after dentry revalidation
authorTrond Myklebust <trond.myklebust@primarydata.com>
Fri, 9 Oct 2015 17:44:34 +0000 (13:44 -0400)
committerLinus Torvalds <torvalds@linux-foundation.org>
Sat, 10 Oct 2015 17:17:27 +0000 (10:17 -0700)
Leandro Awa writes:
 "After switching to version 4.1.6, our parallelized and distributed
  workflows now fail consistently with errors of the form:

  T34: ./regex.c:39:22: error: config.h: No such file or directory

  From our 'git bisect' testing, the following commit appears to be the
  possible cause of the behavior we've been seeing: commit 766c4cbfacd8"

Al Viro says:
 "What happens is that 766c4cbfacd8 got the things subtly wrong.

  We used to treat d_is_negative() after lookup_fast() as "fall with
  ENOENT".  That was wrong - checking ->d_flags outside of ->d_seq
  protection is unreliable and failing with hard error on what should've
  fallen back to non-RCU pathname resolution is a bug.

  Unfortunately, we'd pulled the test too far up and ran afoul of
  another kind of staleness.  The dentry might have been absolutely
  stable from the RCU point of view (and we might be on UP, etc), but
  stale from the remote fs point of view.  If ->d_revalidate() returns
  "it's actually stale", dentry gets thrown away and the original code
  wouldn't even have looked at its ->d_flags.

  What we need is to check ->d_flags where 766c4cbfacd8 does (prior to
  ->d_seq validation) but only use the result in cases where we do not
  discard this dentry outright"

Reported-by: Leandro Awa <lawa@nvidia.com>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=104911
Fixes: 766c4cbfacd8 ("namei: d_is_negative() should be checked...")
Tested-by: Leandro Awa <lawa@nvidia.com>
Cc: stable@vger.kernel.org # v4.1+
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
fs/namei.c

index 726d211..33e9495 100644 (file)
@@ -1558,8 +1558,6 @@ static int lookup_fast(struct nameidata *nd,
                negative = d_is_negative(dentry);
                if (read_seqcount_retry(&dentry->d_seq, seq))
                        return -ECHILD;
-               if (negative)
-                       return -ENOENT;
 
                /*
                 * This sequence count validates that the parent had no
@@ -1580,6 +1578,12 @@ static int lookup_fast(struct nameidata *nd,
                                goto unlazy;
                        }
                }
+               /*
+                * Note: do negative dentry check after revalidation in
+                * case that drops it.
+                */
+               if (negative)
+                       return -ENOENT;
                path->mnt = mnt;
                path->dentry = dentry;
                if (likely(__follow_mount_rcu(nd, path, inode, seqp)))