How to handle long running tests in nptl.


How to handle long running tests in nptl.

Stefan Liebler-2
Hi,

the nptl tests currently run in sequence, and some of them run for a
very long time:
-tst-cond16: 20s
-tst-cond17: 20s
-tst-cond18: 20s
-tst-rwlock-tryrdlock-stall: 20s
-tst-mutex10: 16s
-tst-rwlock20: 10s
-tst-rwlock-trywrlock-stall: 10s
-tst-rwlock-pwn: 10s

The listed tests account for over two minutes of the runtime of
"make check". They all run a test in a loop for a large number of
iterations or seconds in order to trigger, e.g., a race condition.

How should we handle those long-running tests with respect to "make check" runtime?
- reduce runtime by reducing number of iterations or seconds
- move those tests to "make xcheck"
- reduce runtime while running "make check" and rerun with unchanged
runtime in "make xcheck"
- change nothing
- other ideas?

Bye,
Stefan


Re: How to handle long running tests in nptl.

Carlos O'Donell-6
On 9/12/19 9:21 AM, Stefan Liebler wrote:

> Hi,
>
> the nptl tests are currently running in sequence. Some of them are running very long:
> -tst-cond16: 20s
> -tst-cond17: 20s
> -tst-cond18: 20s
> -tst-rwlock-tryrdlock-stall: 20s
> -tst-mutex10: 16s
> -tst-rwlock20: 10s
> -tst-rwlock-trywrlock-stall: 10s
> -tst-rwlock-pwn: 10s
>
> The listed tests are responsible for over two minutes of runtime of "make check". They all are running a test in a loop for a large amount of iterations or seconds in order to trigger e.g. a race condition.
>
> How to handle those long running tests with respect of "make check" runtime?
> - reduce runtime by reducing number of iterations or seconds
> - move those tests to "make xcheck"
> - reduce runtime while running "make check" and rerun with unchanged runtime in "make xcheck"
> - change nothing
> - other ideas?

The use of 'make xcheck' is for tests that need specific permissions
to run, such as root.

This issue has already come up with regard to the test-in-container
tests: installed-tree testing takes time because you have to update
the installed tree first.

I would like to see a discussion around:

- What do developers want from day-to-day 'make check'?
  - Do you want 'make check' to take a constant amount of time?
  - Do you want 'make check' to give you good coverage?

- What do developers want to run before checkin?
  - Is "before checkin" different from day-to-day testing?

This issue is a topic for GNU Cauldron 2019:
https://sourceware.org/glibc/wiki/Cauldron2019
~~~
Update the glibc build infrastructure

    Discuss getting accurate dependency information into the build system so we can do incremental build and test in as fast a time as possible.
    Breaking up tests? Fast. Short. Fixed time.
~~~

--
Cheers,
Carlos.

Re: How to handle long running tests in nptl.

Cyril Hrubis
In reply to this post by Stefan Liebler-2
Hi!
> How to handle those long running tests with respect of "make check" runtime?
> - reduce runtime by reducing number of iterations or seconds
> - move those tests to "make xcheck"
> - reduce runtime while running "make check" and rerun with unchanged
> runtime in "make xcheck"
> - change nothing
> - other ideas?

We have a nice fuzzy sync library in LTP that is designed specifically
to trigger race conditions; it was developed to speed up the LTP
CVE-related tests. With this library, the runtime of a race test can
usually be reduced from minutes to a couple of seconds.

See:
https://github.com/linux-test-project/ltp/blob/master/include/tst_fuzzy_sync.h

We also plan to prepare a talk about this library for the next FOSDEM,
to spare people from reinventing the wheel again and again.

--
Cyril Hrubis
[hidden email]

Re: How to handle long running tests in nptl.

Florian Weimer-5
In reply to this post by Carlos O'Donell-6
* Carlos O'Donell:

> On 9/12/19 9:21 AM, Stefan Liebler wrote:
>> Hi,
>>
>> the nptl tests are currently running in sequence. Some of them are running very long:
>> -tst-cond16: 20s
>> -tst-cond17: 20s
>> -tst-cond18: 20s
>> -tst-rwlock-tryrdlock-stall: 20s
>> -tst-mutex10: 16s
>> -tst-rwlock20: 10s
>> -tst-rwlock-trywrlock-stall: 10s
>> -tst-rwlock-pwn: 10s
>>
>> The listed tests are responsible for over two minutes of runtime of "make check". They all are running a test in a loop for a large amount of iterations or seconds in order to trigger e.g. a race condition.
>>
>> How to handle those long running tests with respect of "make check" runtime?
>> - reduce runtime by reducing number of iterations or seconds
>> - move those tests to "make xcheck"
>> - reduce runtime while running "make check" and rerun with unchanged runtime in "make xcheck"
>> - change nothing
>> - other ideas?
>
> The use of 'make xcheck' is for tests that need specific permissions
> to run, like root.

It's also used for tests that require many minutes to run or require
special firewall settings, e.g., resolv/tst-resolv-qtypes.

Unfortunately, we do not capture data on legitimate test failures, so
it is hard to tell how valuable such tests are.

Many of the timeout-heavy nptl tests do not actually need a quiet
system, though, so they could be moved to a separate subdirectory that
contains only such tests.  Or we could add some markup to the tests and
add a more intelligent test scheduler.

Thanks,
Florian


Re: How to handle long running tests in nptl.

Carlos O'Donell-6
On 12/13/19 6:15 AM, Florian Weimer wrote:

> * Carlos O'Donell:
>
>> On 9/12/19 9:21 AM, Stefan Liebler wrote:
>>> Hi,
>>>
>>> the nptl tests are currently running in sequence. Some of them are running very long:
>>> -tst-cond16: 20s
>>> -tst-cond17: 20s
>>> -tst-cond18: 20s
>>> -tst-rwlock-tryrdlock-stall: 20s
>>> -tst-mutex10: 16s
>>> -tst-rwlock20: 10s
>>> -tst-rwlock-trywrlock-stall: 10s
>>> -tst-rwlock-pwn: 10s
>>>
>>> The listed tests are responsible for over two minutes of runtime of "make check". They all are running a test in a loop for a large amount of iterations or seconds in order to trigger e.g. a race condition.
>>>
>>> How to handle those long running tests with respect of "make check" runtime?
>>> - reduce runtime by reducing number of iterations or seconds
>>> - move those tests to "make xcheck"
>>> - reduce runtime while running "make check" and rerun with unchanged runtime in "make xcheck"
>>> - change nothing
>>> - other ideas?
>>
>> The use of 'make xcheck' is for tests that need specific permissions
>> to run, like root.
>
> It's also used for tests that require many minutes to run or require
> special firewall settings, e.g., resolv/tst-resolv-qtypes.
>
> Unfortunately, we do not capture data of legitimate test failures, so
> it's hard to tell how valuable such tests are.
>
> Many of the timeout-heavy nptl tests do not actually need a quiet
> system, though, so they could be moved to a separate subdirectory that
> contains only such tests.  Or we could add some markup to the tests and
> add a more intelligent test scheduler.

Let me set out what I see as in scope or out of scope. Feel free to argue either way.

The quality of the tests is out of scope for this discussion. Assume
the project grows its number of regression tests without bound, and
consider what a process for handling that would look like.

Recording the PASS/FAIL metrics is out of scope for this discussion.
I consider it a distinct discussion to make a database and start recording
the results of our test runs (which can grow out of CI infrastructure).

In scope are the tests themselves, how we group them (profiles), and
which tests are run by default and when.

We need to define a reasonable amount of time for a developer to
wait for 'make check' to complete.

How long is this? What does it cover?

We need to define a broader testing target that includes longer running tests
that do not require root or special privileges. Some of these tests can also
be moved into test-container tests. DJ and I were talking about trying to
simulate setuid root in the container for DT_RUNPATH/DT_RPATH DST testing
(I have *hundreds* of permutation tests to verify correct operation of the
ELF gABI DST handling rewrite).

I would like to see 'make xcheck' deprecated, with all of its tests either
moved or rewritten as test-container tests, or removed. We should be able
to do everything in a container.

I think make check could take an argument.

make check (Run default test profile "p=dev")
make xcheck (Deprecated, force users to consciously pick a profile)
make p=dev check (Run developer test profile, run iteratively by developer)
make p=rel check (Run release test profile, run minimally at release boundaries or by distro builds)
make p=torture check (Run torture test profile, run minimally at release boundaries)
make p=ci check (Run CI test profile adequate for CI testing, run per commit, limit to control CI costs/time)

Where 'p' is the profile of the test you want and the profile can be assembled
based on the requirements at hand for the person running the test.
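As a rough sketch of how the profile switch could look in the nptl Makefile; the tests-* variable names are hypothetical, and only the profile partitioning is shown:

```make
# Hypothetical sketch: partition the existing $(tests) list into
# profiles and let 'p' select one, defaulting to the developer profile.
p ?= dev

tests-long := tst-cond16 tst-cond17 tst-cond18 tst-rwlock-tryrdlock-stall \
	      tst-mutex10 tst-rwlock20 tst-rwlock-trywrlock-stall tst-rwlock-pwn
tests-dev := $(filter-out $(tests-long),$(tests))
tests-rel := $(tests)
tests-ci := $(tests-dev)

check: $(patsubst %,$(objpfx)%.out,$(tests-$(p)))
```

A torture profile would then add the long-running variants (or higher iteration counts) on top of tests-rel.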

Next steps:
- Define a developer test profile.
- Have 'make check' default to the developer test profile.
- Add the capability for other profiles to exist.
- Deprecate xcheck.
- Migrate xcheck tests to profiles, and in the process either rewrite or remove them.
- Continue active work to reshuffle the profiles as the project evolves.

Thoughts?

--
Cheers,
Carlos.