diff mbox series

[v7,04/10] selftests/harness: Fix interleaved scheduling leading to race conditions

Message ID 20240511171445.904356-5-mic@digikod.net
State Accepted
Commit a86f18903db9211e265cc130b61adb175b7a4c42
Headers show
Series Fix Kselftest's vfork() side effects | expand

Commit Message

Mickaël Salaün May 11, 2024, 5:14 p.m. UTC
Fix a race condition when running several FIXTURE_TEARDOWN() managing
the same resource.  This fixes a race condition in the Landlock file
system tests when creating or unmounting the same directory.

Using clone3() with CLONE_VFORK guarantees that the child and grandchild
test processes are sequentially scheduled.  This is implemented with a
new clone3_vfork() helper replacing the fork() call.

This avoids triggering this error in __wait_for_test():
  Test ended in some other way [127]

Cc: Christian Brauner <brauner@kernel.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Günther Noack <gnoack@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Will Drewry <wad@chromium.org>
Fixes: 41cca0542d7c ("selftests/harness: Fix TEST_F()'s vfork handling")
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Link: https://lore.kernel.org/r/20240511171445.904356-5-mic@digikod.net
---

Changes since v2:
* Replace __attribute__((__unused__)) with inline for clone3_vfork()
  (suggested by Kees and Jakub)
---
 tools/testing/selftests/kselftest_harness.h | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

Comments

Mark Brown May 27, 2024, 7:07 p.m. UTC | #1
On Sat, May 11, 2024 at 07:14:39PM +0200, Mickaël Salaün wrote:

> Fix a race condition when running several FIXTURE_TEARDOWN() managing
> the same resource.  This fixes a race condition in the Landlock file
> system tests when creating or unmounting the same directory.

> Using clone3() with CLONE_VFORK guarantees that the child and grandchild
> test processes are sequentially scheduled.  This is implemented with a
> new clone3_vfork() helper replacing the fork() call.

This is now in mainline and appears to be causing several tests (at
least the ptrace vmaccess global_attach test on arm64, possibly also
some of the epoll tests) that previously were timed out by the harness
to to hang instead.  A bisect seems to point at this patch in
particular, there was a bunch of discussion of the fallout of these
patches but I'm afraid I lost track of it, is there something in flight
for this?  -next is affected as well from the looks of it.

Log of the ptrace issue (not that it's useful, the job just gets killed
by the test runner):

   https://lava.sirena.org.uk/scheduler/job/307984

Bisect log:

git bisect start
# status: waiting for both good and bad commits
# good: [e8f897f4afef0031fe618a8e94127a0934896aba] Linux 6.8
git bisect good e8f897f4afef0031fe618a8e94127a0934896aba
# status: waiting for bad commit, 1 good commit known
# bad: [a38297e3fb012ddfa7ce0321a7e5a8daeb1872b6] Linux 6.9
git bisect bad a38297e3fb012ddfa7ce0321a7e5a8daeb1872b6
# good: [480e035fc4c714fb5536e64ab9db04fedc89e910] Merge tag 'drm-next-2024-03-13' of https://gitlab.freedesktop.org/drm/kernel
git bisect good 480e035fc4c714fb5536e64ab9db04fedc89e910
# good: [2ac2b1665d3fbec6ca709dd6ef3ea05f4a51ee4c] Merge tag 'hwlock-v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux
git bisect good 2ac2b1665d3fbec6ca709dd6ef3ea05f4a51ee4c
# good: [e858beeddfa3a400844c0e22d2118b3b52f1ea5e] bcachefs: If we run merges at a lower watermark, they must be nonblocking
git bisect good e858beeddfa3a400844c0e22d2118b3b52f1ea5e
# good: [e43afae4a335ac0bf54c7a8f23ed65dd55449649] Merge tag 'powerpc-6.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
git bisect good e43afae4a335ac0bf54c7a8f23ed65dd55449649
# good: [09e10499ee6a5a89fc352f25881276398a49596a] Merge tag 'drm-misc-fixes-2024-05-02' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes
git bisect good 09e10499ee6a5a89fc352f25881276398a49596a
# good: [3c15237018eb8a9a56bb49a5dbf4d8eeb154b5cc] Merge tag 'usb-6.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
git bisect good 3c15237018eb8a9a56bb49a5dbf4d8eeb154b5cc
# good: [62788b0f225da1837ad38101112e2c49123470ee] Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rmk/linux
git bisect good 62788b0f225da1837ad38101112e2c49123470ee
# good: [ed44935c330a2633440e8d2660db3c7538eeaf10] Merge tag 'spi-fix-v6.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
git bisect good ed44935c330a2633440e8d2660db3c7538eeaf10
# good: [c22c3e0753807feee1391a22228b0d5e6ba39b74] Merge tag 'mm-hotfixes-stable-2024-05-10-13-14' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
git bisect good c22c3e0753807feee1391a22228b0d5e6ba39b74
# good: [b61821bb32c5577272408e1b05e6a0879a64257f] Merge tag 'drm-misc-fixes-2024-05-10' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes
git bisect good b61821bb32c5577272408e1b05e6a0879a64257f
# bad: [323feb3bdb67649bfa5614eb24ec9cb92a60cf33] selftests/harness: Handle TEST_F()'s explicit exit codes
git bisect bad 323feb3bdb67649bfa5614eb24ec9cb92a60cf33
# bad: [323feb3bdb67649bfa5614eb24ec9cb92a60cf33] selftests/harness: Handle TEST_F()'s explicit exit codes
git bisect bad 323feb3bdb67649bfa5614eb24ec9cb92a60cf33
# bad: [3656bc23429a4d539c81b5cb8f17ceeeeca8901a] selftests/landlock: Do not allocate memory in fixture data
git bisect bad 3656bc23429a4d539c81b5cb8f17ceeeeca8901a
# good: [7e4042abe2ee7c0977fd8bb049a6991b174a5e6f] selftests/landlock: Fix FS tests when run on a private mount point
git bisect good 7e4042abe2ee7c0977fd8bb049a6991b174a5e6f
# bad: [a86f18903db9211e265cc130b61adb175b7a4c42] selftests/harness: Fix interleaved scheduling leading to race conditions
git bisect bad a86f18903db9211e265cc130b61adb175b7a4c42
# good: [fff37bd32c7605d93bf900c4c318d56d12000048] selftests/harness: Fix fixture teardown
git bisect good fff37bd32c7605d93bf900c4c318d56d12000048
# first bad commit: [a86f18903db9211e265cc130b61adb175b7a4c42] selftests/harness: Fix interleaved scheduling leading to race conditions
diff mbox series

Patch

diff --git a/tools/testing/selftests/kselftest_harness.h b/tools/testing/selftests/kselftest_harness.h
index 55699a762c45..9d7178a71c2c 100644
--- a/tools/testing/selftests/kselftest_harness.h
+++ b/tools/testing/selftests/kselftest_harness.h
@@ -66,6 +66,8 @@ 
 #include <sys/wait.h>
 #include <unistd.h>
 #include <setjmp.h>
+#include <syscall.h>
+#include <linux/sched.h>
 
 #include "kselftest.h"
 
@@ -80,6 +82,17 @@ 
 #  define TH_LOG_ENABLED 1
 #endif
 
+/* Wait for the child process to end but without sharing memory mapping. */
+static inline pid_t clone3_vfork(void)
+{
+	struct clone_args args = {
+		.flags = CLONE_VFORK,
+		.exit_signal = SIGCHLD,
+	};
+
+	return syscall(__NR_clone3, &args, sizeof(args));
+}
+
 /**
  * TH_LOG()
  *
@@ -1183,7 +1196,7 @@  void __run_test(struct __fixture_metadata *f,
 	fflush(stdout);
 	fflush(stderr);
 
-	t->pid = fork();
+	t->pid = clone3_vfork();
 	if (t->pid < 0) {
 		ksft_print_msg("ERROR SPAWNING TEST CHILD\n");
 		t->exit_code = KSFT_FAIL;