RFC: make fastcheck

classic Classic list List threaded Threaded
16 messages Options
Reply | Threaded
Open this post in threaded view
|

RFC: make fastcheck

Carlos O'Donell-6
Florian,

What do you think of making a 'fastcheck' that can run all tests which have
timeouts less than or equal to the default timeout?

My idea here is as follows:

- 'make fastcheck' is for developer smoke testing during edit/build/test cycle.
- 'make check' if for broader developer testing, and will, if required, update
   the chroot for container testing (would have fixed Rafal's observed breakage).
- 'make xcheck' runs everything including the expensive tests and tests requiring
   explicit permissions.

Then you could switch your workflow to using 'fastcheck'?

This is one part of fixing the workflow.

The other part is actually fixing 'make' to go faster, but DJ and I talked about
this recently and DJ has some ideas.

--
Cheers,
Carlos.
Reply | Threaded
Open this post in threaded view
|

Re: RFC: make fastcheck

Florian Weimer-5
* Carlos O'Donell:

> Florian,
>
> What do you think of making a 'fastcheck' that can run all tests which have
> timeouts less than or equal to the default timeout?
>
> My idea here is as follows:
>
> - 'make fastcheck' is for developer smoke testing during edit/build/test cycle.
> - 'make check' if for broader developer testing, and will, if required, update
>   the chroot for container testing (would have fixed Rafal's observed breakage).
> - 'make xcheck' runs everything including the expensive tests and tests requiring
>   explicit permissions.
>
> Then you could switch your workflow to using 'fastcheck'?
>
> This is one part of fixing the workflow.
>
> The other part is actually fixing 'make' to go faster, but DJ and I talked about
> this recently and DJ has some ideas.

Build time on my test machine:

real    1m40.633s
user    8m19.845s
sys     2m6.309s

The test time is:

real    14m5.543s
user    29m29.308s
sys     7m27.912s


Now apply this patch:

diff --git a/support/support_test_main.c b/support/support_test_main.c
index 7e7b9edbb0..f196990a3f 100644
--- a/support/support_test_main.c
+++ b/support/support_test_main.c
@@ -209,6 +209,9 @@ adjust_exit_status (int status)
 int
 support_test_main (int argc, char **argv, const struct test_config *config)
 {
+  if (config->timeout > DEFAULT_TIMEOUT)
+    return 0;
+
   if (test_main_called)
     {
       printf ("error: test_main called for a second time\n");

The new testing time is:

real    10m35.001s
user    25m55.020s
sys     5m21.004s

So there is a definite improvement.  But I don't think it doesn't solve
the core problem because it's still too long.

It also doesn't help with build-many-glibcs.py because that doesn't run
any tests which could time out.

Thanks,
Florian
Reply | Threaded
Open this post in threaded view
|

Re: RFC: make fastcheck

Carlos O'Donell-6
On 7/18/19 4:13 AM, Florian Weimer wrote:

> * Carlos O'Donell:
>
>> Florian,
>>
>> What do you think of making a 'fastcheck' that can run all tests which have
>> timeouts less than or equal to the default timeout?
>>
>> My idea here is as follows:
>>
>> - 'make fastcheck' is for developer smoke testing during edit/build/test cycle.
>> - 'make check' if for broader developer testing, and will, if required, update
>>    the chroot for container testing (would have fixed Rafal's observed breakage).
>> - 'make xcheck' runs everything including the expensive tests and tests requiring
>>    explicit permissions.
>>
>> Then you could switch your workflow to using 'fastcheck'?
>>
>> This is one part of fixing the workflow.
>>
>> The other part is actually fixing 'make' to go faster, but DJ and I talked about
>> this recently and DJ has some ideas.
>
> Build time on my test machine:
>
> real    1m40.633s
> user    8m19.845s
> sys     2m6.309s
>
> The test time is:
>
> real    14m5.543s
> user    29m29.308s
> sys     7m27.912s
>
>
> Now apply this patch:
>
> diff --git a/support/support_test_main.c b/support/support_test_main.c
> index 7e7b9edbb0..f196990a3f 100644
> --- a/support/support_test_main.c
> +++ b/support/support_test_main.c
> @@ -209,6 +209,9 @@ adjust_exit_status (int status)
>   int
>   support_test_main (int argc, char **argv, const struct test_config *config)
>   {
> +  if (config->timeout > DEFAULT_TIMEOUT)
> +    return 0;
> +
>     if (test_main_called)
>       {
>         printf ("error: test_main called for a second time\n");
>
> The new testing time is:
>
> real    10m35.001s
> user    25m55.020s
> sys     5m21.004s
>
> So there is a definite improvement.  But I don't think it doesn't solve
> the core problem because it's still too long.

Sorry, I failed to list a critical part of my own reasoning.

I intended to write that the test-container tests would *not* run in
make fastcheck.

How much does that improve the timing by?

> It also doesn't help with build-many-glibcs.py because that doesn't run
> any tests which could time out.

That's the second part of this problem. I agree it's a problem.

However, testing, even if we fix the build infrastructure, will always
grow in duration, and we need a way to handle this with a "profile" which
allows developers to smoke test their build.

--
Cheers,
Carlos.
Reply | Threaded
Open this post in threaded view
|

Re: RFC: make fastcheck

Florian Weimer-5
* Carlos O'Donell:

> I intended to write that the test-container tests would *not* run in
> make fastcheck.
>
> How much does that improve the timing by?

I don't know how to simulate that easily.

It's also the complete opposite of what I need.  Just running the subset
of non-container tests I'm working on is already fast, using “make -j8
subdirs=…” (really fast if I don't run “make -j8” before for safety).

Thanks,
Florian
Reply | Threaded
Open this post in threaded view
|

Re: RFC: make fastcheck

Szabolcs Nagy-2
In reply to this post by Carlos O'Donell-6
On 18/07/2019 12:01, Carlos O'Donell wrote:
> However, testing, even if we fix the build infrastructure, will always
> grow in duration, and we need a way to handle this with a "profile" which
> allows developers to smoke test their build.

$ grep -r 'sleep (' |grep /tst-

cleaning up some of those tests may help too.
Reply | Threaded
Open this post in threaded view
|

Re: RFC: make fastcheck

Cyril Hrubis
In reply to this post by Carlos O'Donell-6
Hi!
One of the critical things in making the tests faster is actually paying
attention to the tests themselves. In the Linux Test Project I saved a
minutes of runtime just by sorting the tests by runtime and reviewing the
outliners. In half of the cases the time could be reduced significantly
just by making a small bit of an effort.

One of the most common problems in our tests was sleep() being used as a
poor man's synchronization primitive. Greping libc sources there are 45
tests that call sleep() and more than half of that is in the nptl/
directory so there may be a room for improvement.

We had similar situation in LTP a few years ago and I've fixed that by
writing a test library that provides several replacements for the
sleep().

One of them is really simple checkpoint synchronization library based on
futexes that allows to suspend processes until the other process is
ready and wakes it up.

Another one is basically a loop that polls /proc/$pid/stat which is used
when process needs to wait for another process that suspends itself in a
sleep state.

And we also have a macro that implements bussy loop wait with
exponential increase in sleep time that replaces the
while (not_ready) sleep(1); loops which almost always wastes 1 second of
runtime.

--
Cyril Hrubis
[hidden email]
Reply | Threaded
Open this post in threaded view
|

Re: RFC: make fastcheck

Joseph Myers
On Thu, 18 Jul 2019, Cyril Hrubis wrote:

> One of the most common problems in our tests was sleep() being used as a
> poor man's synchronization primitive. Greping libc sources there are 45
> tests that call sleep() and more than half of that is in the nptl/
> directory so there may be a room for improvement.

A separate problem in nptl/ is that it disables parallel build for
testing.

# The tests here better do not run in parallel
ifneq ($(filter %tests,$(MAKECMDGOALS)),)
.NOTPARALLEL:
endif

Even if the tests can't safely *run* in parallel, it should be possible to
*compile* them in parallel.

--
Joseph S. Myers
[hidden email]
Reply | Threaded
Open this post in threaded view
|

Re: RFC: make fastcheck

Yann Droneaud
Hi,

Le jeudi 25 juillet 2019 à 20:24 +0000, Joseph Myers a écrit :

>
> A separate problem in nptl/ is that it disables parallel build for
> testing.
>
> # The tests here better do not run in parallel
> ifneq ($(filter %tests,$(MAKECMDGOALS)),)
> .NOTPARALLEL:
> endif
>
> Even if the tests can't safely *run* in parallel, it should be
> possible to
> *compile* them in parallel.
>

I was hoping there was a dedicated make target to build all the tests,
so that they could be built safely with make -j ..., another target to
build and setup the tests, and make check to build,setup and run the
tests.

Sadly I don't have time and or will to implement such behavior.

Regards.

--
Yann Droneaud
OPTEYA


Reply | Threaded
Open this post in threaded view
|

Re: RFC: make fastcheck

DJ Delorie-2
In reply to this post by Joseph Myers

Perhaps something new in support/* that sets us serialize such tests
internally to the test?

Or some makefile hacks to serialize those tests without needing
.NOTPARALLEL?  Like, using tst-a.out:|tst-b.out and $(eval) ?
Reply | Threaded
Open this post in threaded view
|

Re: RFC: make fastcheck

Carlos O'Donell-6
On 7/25/19 5:07 PM, DJ Delorie wrote:
>
> Perhaps something new in support/* that sets us serialize such tests
> internally to the test?
>
> Or some makefile hacks to serialize those tests without needing
> .NOTPARALLEL?  Like, using tst-a.out:|tst-b.out and $(eval) ?
>

This is a great idea.

I think that the support infrastructure could be told it's running
a threaded test, and then allow them to be parallelized up to a
point, perhaps a load factor, or something else.

The first step would obviously be:

* For all nptl tests support serializes them with a lock file.
* Remove .NOTPARALLEL.

--
Cheers,
Carlos.
Reply | Threaded
Open this post in threaded view
|

Re: RFC: make fastcheck

DJ Delorie-2
"Carlos O'Donell" <[hidden email]> writes:
> I think that the support infrastructure could be told it's running
> a threaded test, and then allow them to be parallelized up to a
> point, perhaps a load factor, or something else.
>
> The first step would obviously be:
>
> * For all nptl tests support serializes them with a lock file.
> * Remove .NOTPARALLEL.

Given the huge number of tests in nptl, I'd try a makefile-based
serialization first, since the list could be automated.  I'm testing
this (requires make 3.80) and it seems to work, but not perfectly (maybe
it's just remnants of tests still running where ps can see them, but
there are two real FAILs for some reason).  Saves about 1.5 minutes (out
of 15) off a "rm -rf nptl; make; make check"

(and yes, I know, the tests are run in reverse order ;)

diff --git a/nptl/Makefile b/nptl/Makefile
index 0567e77a79..1eb7fd4286 100644
--- a/nptl/Makefile
+++ b/nptl/Makefile
@@ -666,7 +666,7 @@ $(objpfx)tst-cleanup0.out: /dev/null $(objpfx)tst-cleanup0
  $(evaluate-test)
 
 $(objpfx)tst-cleanup0-cmp.out: tst-cleanup0.expect $(objpfx)tst-cleanup0.out
- cmp $^ > $@; \
+ cmp tst-cleanup0.expect $(objpfx)tst-cleanup0.out > $@; \
  $(evaluate-test)
 
 $(objpfx)crti.o: $(objpfx)pt-crti.o
@@ -723,7 +723,14 @@ tst-audit-threads-ENV = LD_AUDIT=$(objpfx)tst-audit-threads-mod1.so
 
 CFLAGS-tst-unwind-thread.c += -funwind-tables
 
+ALLTESTS = $(tests:%=$(objpfx)%.out) $(tests-internal:%=$(objpfx)%.out) \
+ $(tests-container:%=$(objpfx)%.out) \
+ $(tests-special) $(tests-printers-out) \
+ $(xtests:%=$(objpfx)%.out) $(xtests-special)
+
+ALLBUTFIRSTTEST = $(filter-out $(firstword $(ALLTESTS)), $(ALLTESTS))
+ALLBUTLASTTEST = $(filter-out $(lastword $(ALLTESTS)), $(ALLTESTS))
+TESTPAIRS = $(join $(ALLBUTLASTTEST),$(addprefix :|,$(ALLBUTFIRSTTEST)))
+
 # The tests here better do not run in parallel
-ifneq ($(filter %tests,$(MAKECMDGOALS)),)
-.NOTPARALLEL:
-endif
+$(foreach pair,$(TESTPAIRS),$(eval $(pair)))
Reply | Threaded
Open this post in threaded view
|

Re: RFC: make fastcheck

DJ Delorie-2
In reply to this post by Carlos O'Donell-6

> but there are two real FAILs for some reason
> (and yes, I know, the tests are run in reverse order ;)

Reversing the order fixed the two fails.  Fragile :-P
Reply | Threaded
Open this post in threaded view
|

Re: RFC: make fastcheck

Carlos O'Donell-6
On 7/25/19 10:39 PM, DJ Delorie wrote:
>
>> but there are two real FAILs for some reason
>> (and yes, I know, the tests are run in reverse order ;)
>
> Reversing the order fixed the two fails.  Fragile :-P
>

Wait. What? Why would it fail? That needs investigation...

--
Cheers,
Carlos.
Reply | Threaded
Open this post in threaded view
|

Re: RFC: make fastcheck

DJ Delorie-2

"Carlos O'Donell" <[hidden email]> writes:
> Wait. What? Why would it fail? That needs investigation...

It looks like there are sometimes pairs of tests like "run test" and
"compare output of previous test to expected pattern"

If you do the compare first, there's no output to compare to.

Yes, there should be dependencies, but the .NOTPARALLEL meant they
were not needed.
Reply | Threaded
Open this post in threaded view
|

Re: RFC: make fastcheck

Carlos O'Donell-6
On 7/26/19 1:19 PM, DJ Delorie wrote:

>
> "Carlos O'Donell" <[hidden email]> writes:
>> Wait. What? Why would it fail? That needs investigation...
>
> It looks like there are sometimes pairs of tests like "run test" and
> "compare output of previous test to expected pattern"
>
> If you do the compare first, there's no output to compare to.
>
> Yes, there should be dependencies, but the .NOTPARALLEL meant they
> were not needed.

OK, thanks, these are the kinds of things we'd need to fix if we wanted
to entirely remove the parallel running.

But your fix works though and you shaved some time off the build right?

Please post that valuable patch for review for 2.31 :-)

--
Cheers,
Carlos.
Reply | Threaded
Open this post in threaded view
|

Re: RFC: make fastcheck

DJ Delorie-2

"Carlos O'Donell" <[hidden email]> writes:
> Please post that valuable patch for review for 2.31 :-)

Formal patch to follow once the testing and bikeshedding is done ;-)

Timing... on my 4/8-core i7 (i7-4790K, 4GHz), saves about 37 seconds of
build-the-tests time. (make -j8 nptl/tests run-built-tests=no, 48s->11s)

diff --git a/nptl/Makefile b/nptl/Makefile
index 0567e77a79..d7a3857c95 100644
--- a/nptl/Makefile
+++ b/nptl/Makefile
@@ -666,7 +666,7 @@ $(objpfx)tst-cleanup0.out: /dev/null $(objpfx)tst-cleanup0
  $(evaluate-test)
 
 $(objpfx)tst-cleanup0-cmp.out: tst-cleanup0.expect $(objpfx)tst-cleanup0.out
- cmp $^ > $@; \
+ cmp tst-cleanup0.expect $(objpfx)tst-cleanup0.out > $@; \
  $(evaluate-test)
 
 $(objpfx)crti.o: $(objpfx)pt-crti.o
@@ -723,7 +723,23 @@ tst-audit-threads-ENV = LD_AUDIT=$(objpfx)tst-audit-threads-mod1.so
 
 CFLAGS-tst-unwind-thread.c += -funwind-tables
 
-# The tests here better do not run in parallel
-ifneq ($(filter %tests,$(MAKECMDGOALS)),)
-.NOTPARALLEL:
+ifeq ($(run-built-tests),yes)
+# The tests in this subdir should not be run in parallel.
+#
+# The following will create rules like "foo2.out :| foo1.out" for all
+# tests, which forces the tests to be run serially, but does not force
+# a test to be run just because some other test was run.
+#
+# Caveat: the :|-style dependencies won't be listed in $^, so avoid
+# using $^ to depend on test result files.
+
+ALLTESTS = $(tests:%=$(objpfx)%.out) $(tests-internal:%=$(objpfx)%.out) \
+ $(tests-container:%=$(objpfx)%.out) \
+ $(tests-special) $(tests-printers-out) \
+ $(xtests:%=$(objpfx)%.out) $(xtests-special)
+
+ALLBUTFIRSTTEST = $(filter-out $(firstword $(ALLTESTS)), $(ALLTESTS))
+ALLBUTLASTTEST = $(filter-out $(lastword $(ALLTESTS)), $(ALLTESTS))
+TESTPAIRS = $(join $(ALLBUTFIRSTTEST),$(addprefix :|,$(ALLBUTLASTTEST)))
+$(foreach pair,$(TESTPAIRS),$(eval $(pair)))
 endif