New glibc-testfail io/tst-copy_file_range with kernel-next.

classic Classic list List threaded Threaded
5 messages Options
Reply | Threaded
Open this post in threaded view
|

New glibc-testfail io/tst-copy_file_range with kernel-next.

Stefan Liebler-2
Hi,

as information, I had the chance to run the glibc-testsuite on a
kernel-next from today on s390x and recognized a new failing test:
io/tst-copy_file_range

It seems as the patches from Amir Golstein are changing the behaviour of
copy_file_range. See two of the series:
-"vfs: allow copy_file_range to copy across devices"
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=5dae222a5ff0c269730393018a5539cc970a4726
-"vfs: add missing checks to copy_file_range"
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=96e6e8f4a68df2d94800311163faa67124df24e5

A quick look into the verbose output (see attached file) shows at least
the following changes:

- writing at a position past the maximum allowed offset with "write"
fails with EINVAL (22) whereas "copy_file_range" is now returning EFBIG
(27). The test assumes that the same errno is set for "write" and
"copy_file_range". (See <glibc>/io/tst-copy_file_range.c in
delayed_write_failure_beginning() with current_size=1 or the second copy
in delayed_write_failure_end())
According to http://man7.org/linux/man-pages/man2/copy_file_range.2.html 
and http://man7.org/linux/man-pages/man2/write.2.html EFBIG seems to be
the correct error.
Should "write" also return EFBIG in those cases?

- For delayed_write_failure_beginning() with current_size>=2
copy_file_range is started at a valid offset, but the length is beyond a
valid range. copy_file_range is now copying the "valid" bytes and
returning the number of copied bytes. The old behaviour was to return
EINVAL without copying anything.
In find_maximum_offset() a test sets maximum_offset_hard_limit to
true/false depending on the behaviour of "write" in such a situation
and the tests in delayed_write_failure_beginning() are assuming that
"copy_file_range" behaves like "write".
Should "write" perform the same partial copies as "copy_file_range" or
how to determine the setting of maximum_offset_hard_limit?

- In cross_device_failure it is assumed that copy_file_range always
fails with EXDEV. Amirs patches are now allowing this case if possible.
How could the testcase determine if the kernel supports cross device
copies or not?


How do we usually handle those kernel-next changes?
As soon as there is an official released kernel or before?

Bye
Stefan

tst-copy_file_range.out.verbose (150K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: New glibc-testfail io/tst-copy_file_range with kernel-next.

Florian Weimer-5
* Stefan Liebler:

> as information, I had the chance to run the glibc-testsuite on a
> kernel-next from today on s390x and recognized a new failing test:
> io/tst-copy_file_range

Thanks for looking at this and summarizing the changes.

There's been a previous thread here:

  <https://sourceware.org/ml/libc-alpha/2019-06/msg00039.html>

I can back out the emulation, as discussed, then there wouldn't be
anything left to test in theory (although I think our tests caught one
or two bugs in XFS backports downstream, that as before fstests covered
copy_file_range).

It's not easy for me to test linux-next, but I can try to tweak the test
so that it copes with the different error codes.

The cross-device test we should just drop.

Thanks,
Florian
Reply | Threaded
Open this post in threaded view
|

Re: New glibc-testfail io/tst-copy_file_range with kernel-next.

Amir Goldstein
In reply to this post by Stefan Liebler-2
On Wed, Jun 26, 2019 at 4:57 PM Stefan Liebler <[hidden email]> wrote:
>
> Hi,
>
> as information, I had the chance to run the glibc-testsuite on a
> kernel-next from today on s390x and recognized a new failing test:
> io/tst-copy_file_range

Thanks for the detailed report!

[cc: linux-api and linux-fsdevel]

>
> It seems as the patches from Amir Golstein are changing the behaviour of
> copy_file_range. See two of the series:
> -"vfs: allow copy_file_range to copy across devices"
> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=5dae222a5ff0c269730393018a5539cc970a4726
> -"vfs: add missing checks to copy_file_range"
> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=96e6e8f4a68df2d94800311163faa67124df24e5
>
> A quick look into the verbose output (see attached file) shows at least
> the following changes:
>
> - writing at a position past the maximum allowed offset with "write"
> fails with EINVAL (22) whereas "copy_file_range" is now returning EFBIG
> (27). The test assumes that the same errno is set for "write" and
> "copy_file_range". (See <glibc>/io/tst-copy_file_range.c in
> delayed_write_failure_beginning() with current_size=1 or the second copy
> in delayed_write_failure_end())
> According to http://man7.org/linux/man-pages/man2/copy_file_range.2.html
> and http://man7.org/linux/man-pages/man2/write.2.html EFBIG seems to be
> the correct error.
> Should "write" also return EFBIG in those cases?

I'm not sure.
I think it makes sense for copy_file_range()  to behave very similarly to
"read"+"write" that is what I was aiming for. However, copy_file_range()
can be called with or without pos/offset. When called with offset, it would
be better to try and match its behavior with pread/pwrite. Note the EINVAL
case in http://man7.org/linux/man-pages/man2/pwritev.2.html
when offset+len overflows ssize_t.

Also, please see planned changes to copy_file_range() man page:
https://github.com/amir73il/man-pages/commit/ef24cb90797552eb0a2c55a1fb7f2c275e3b1bdb

>
> - For delayed_write_failure_beginning() with current_size>=2
> copy_file_range is started at a valid offset, but the length is beyond a
> valid range. copy_file_range is now copying the "valid" bytes and
> returning the number of copied bytes. The old behaviour was to return
> EINVAL without copying anything.
> In find_maximum_offset() a test sets maximum_offset_hard_limit to
> true/false depending on the behaviour of "write" in such a situation
> and the tests in delayed_write_failure_beginning() are assuming that
> "copy_file_range" behaves like "write".
> Should "write" perform the same partial copies as "copy_file_range" or
> how to determine the setting of maximum_offset_hard_limit?
>
> - In cross_device_failure it is assumed that copy_file_range always
> fails with EXDEV. Amirs patches are now allowing this case if possible.
> How could the testcase determine if the kernel supports cross device
> copies or not?
>
>

Florian's response:

> There's been a previous thread here:
>
>  <https://sourceware.org/ml/libc-alpha/2019-06/msg00039.html>
>
> I can back out the emulation, as discussed, then there wouldn't be
> anything left to test in theory (although I think our tests caught one
> or two bugs in XFS backports downstream, that as before fstests covered
> copy_file_range).

I agree that sounds like the best option.

Thanks,
Amir.
Reply | Threaded
Open this post in threaded view
|

Re: New glibc-testfail io/tst-copy_file_range with kernel-next.

Amir Goldstein
In reply to this post by Florian Weimer-5
On Wed, Jun 26, 2019 at 5:18 PM Florian Weimer <[hidden email]> wrote:

>
> * Stefan Liebler:
>
> > as information, I had the chance to run the glibc-testsuite on a
> > kernel-next from today on s390x and recognized a new failing test:
> > io/tst-copy_file_range
>
> Thanks for looking at this and summarizing the changes.
>
> There's been a previous thread here:
>
>   <https://sourceware.org/ml/libc-alpha/2019-06/msg00039.html>
>
> I can back out the emulation, as discussed, then there wouldn't be
> anything left to test in theory (although I think our tests caught one
> or two bugs in XFS backports downstream, that as before fstests covered
> copy_file_range).
>
> It's not easy for me to test linux-next, but I can try to tweak the test
> so that it copes with the different error codes.
>
> The cross-device test we should just drop.
>

That sounds good to me.
Sorry, Florian, I included your response in my reply to linux-api,
but forgot to CC you.

Thanks,
Amir.
Reply | Threaded
Open this post in threaded view
|

Re: New glibc-testfail io/tst-copy_file_range with kernel-next.

Stefan Liebler-2
In reply to this post by Florian Weimer-5
On 6/26/19 4:18 PM, Florian Weimer wrote:

> * Stefan Liebler:
>
>> as information, I had the chance to run the glibc-testsuite on a
>> kernel-next from today on s390x and recognized a new failing test:
>> io/tst-copy_file_range
>
> Thanks for looking at this and summarizing the changes.
>
> There's been a previous thread here:
>
>    <https://sourceware.org/ml/libc-alpha/2019-06/msg00039.html>
>
> I can back out the emulation, as discussed, then there wouldn't be
> anything left to test in theory (although I think our tests caught one
> or two bugs in XFS backports downstream, that as before fstests covered
> copy_file_range).
>
> It's not easy for me to test linux-next, but I can try to tweak the test
> so that it copes with the different error codes.
>
> The cross-device test we should just drop.
>
> Thanks,
> Florian
>

Just for reference:
here are the links to Florians patch / commit which removes part of the
tests. Now the test is passing on the mentioned kernel-next:

"[PATCH] io: Remove copy_file_range emulation"
https://www.sourceware.org/ml/libc-alpha/2019-06/msg00873.html

"io: Remove copy_file_range emulation [BZ #24744]"
https://sourceware.org/git/?p=glibc.git;a=commit;h=5a659ccc0ec217ab02a4c273a1f6d346a359560a

Thanks,
Stefan