XFS reports lchmod failure, but changes file system contents

classic Classic list List threaded Threaded
22 messages Options
12
Reply | Threaded
Open this post in threaded view
|

XFS reports lchmod failure, but changes file system contents

Florian Weimer
In principle, Linux supports lchmod via O_PATH descriptors and chmod
on /proc/self/fd.  (lchmod is the non-symbolic-link-following variant
of chmod.)

This helper program can be used to do this:

#define _GNU_SOURCE
#include <err.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

int
main (int argc, char **argv)
{
  if (argc != 3)
    {
      fprintf (stderr, "usage: %s MODE FILE\n", argv[0]);
      return 2;
    }

  unsigned int mode;
  if (sscanf (argv[1], "%o", &mode) != 1
      || mode != (mode_t) mode)
    errx (1, "invalid mode: %s", argv[1]);

  int fd = open (argv[2], O_PATH | O_NOFOLLOW);
  if (fd < 0)
    err (1, "open");

  char *fd_path;
  if (asprintf (&fd_path, "/proc/self/fd/%d", fd) < 0)
    err (1, "asprintf");

  if (chmod (fd_path, mode) != 0)
    err (1, "chmod");

  free (fd_path);
  if (close (fd) != 0)
    err (1, "close");

  return 0;
}

When changing the permissions of on XFS in this way, the chmod
operation fails:

$ ln -s does-not-exist /var/tmp/symlink
$ ls -l /var/tmp/symlink
lrwxrwxrwx. 1 fweimer fweimer 14 Feb 12 12:41 /var/tmp/symlink -> does-not-exist
$ strace ./lchmod 0 /var/tmp/symlink
[…]
openat(AT_FDCWD, "/var/tmp/symlink", O_RDONLY|O_NOFOLLOW|O_PATH) = 3
[…]
chmod("/proc/self/fd/3", 000)           = -1 EOPNOTSUPP (Operation not supported)
write(2, "lchmod: ", 8lchmod: )                 = 8
write(2, "chmod", 5chmod)                    = 5
write(2, ": Operation not supported\n", 26: Operation not supported
) = 26
exit_group(1)                           = ?

But the file system contents has changed nevertheless:

$ ls -l /var/tmp/symlink
l---------. 1 fweimer fweimer 14 Feb 12 12:41 /var/tmp/symlink -> does-not-exist
$ echo 3 | sudo tee /proc/sys/vm/drop_caches
$ ls -l /var/tmp/symlink
l---------. 1 fweimer fweimer 14 Feb 12 12:41 /var/tmp/symlink -> does-not-exist

This looks like an XFS bug to me.  With tmpfs, the chmod succeeds and
is reflected in the file system.

This bug also affects regular files, not just symbolic links.

It causes the io/tst-lchmod glibc test to fail (after it has been
fixed, the in-tree version has another bug).
Reply | Threaded
Open this post in threaded view
|

Re: XFS reports lchmod failure, but changes file system contents

Florian Weimer
* Florian Weimer:

> In principle, Linux supports lchmod via O_PATH descriptors and chmod
> on /proc/self/fd.  (lchmod is the non-symbolic-link-following variant
> of chmod.)
>
> This helper program can be used to do this:
>
> #define _GNU_SOURCE
> #include <err.h>
> #include <fcntl.h>
> #include <stdio.h>
> #include <stdlib.h>
> #include <sys/stat.h>
> #include <unistd.h>
>
> int
> main (int argc, char **argv)
> {
>   if (argc != 3)
>     {
>       fprintf (stderr, "usage: %s MODE FILE\n", argv[0]);
>       return 2;
>     }
>
>   unsigned int mode;
>   if (sscanf (argv[1], "%o", &mode) != 1
>       || mode != (mode_t) mode)
>     errx (1, "invalid mode: %s", argv[1]);
>
>   int fd = open (argv[2], O_PATH | O_NOFOLLOW);
>   if (fd < 0)
>     err (1, "open");
>
>   char *fd_path;
>   if (asprintf (&fd_path, "/proc/self/fd/%d", fd) < 0)
>     err (1, "asprintf");
>
>   if (chmod (fd_path, mode) != 0)
>     err (1, "chmod");
>
>   free (fd_path);
>   if (close (fd) != 0)
>     err (1, "close");
>
>   return 0;
> }
>
> When changing the permissions of on XFS in this way, the chmod
> operation fails:
>
> $ ln -s does-not-exist /var/tmp/symlink
> $ ls -l /var/tmp/symlink
> lrwxrwxrwx. 1 fweimer fweimer 14 Feb 12 12:41 /var/tmp/symlink -> does-not-exist
> $ strace ./lchmod 0 /var/tmp/symlink
> […]
> openat(AT_FDCWD, "/var/tmp/symlink", O_RDONLY|O_NOFOLLOW|O_PATH) = 3
> […]
> chmod("/proc/self/fd/3", 000)           = -1 EOPNOTSUPP (Operation not supported)
> write(2, "lchmod: ", 8lchmod: )                 = 8
> write(2, "chmod", 5chmod)                    = 5
> write(2, ": Operation not supported\n", 26: Operation not supported
> ) = 26
> exit_group(1)                           = ?
>
> But the file system contents has changed nevertheless:
>
> $ ls -l /var/tmp/symlink
> l---------. 1 fweimer fweimer 14 Feb 12 12:41 /var/tmp/symlink -> does-not-exist
> $ echo 3 | sudo tee /proc/sys/vm/drop_caches
> $ ls -l /var/tmp/symlink
> l---------. 1 fweimer fweimer 14 Feb 12 12:41 /var/tmp/symlink -> does-not-exist
>
> This looks like an XFS bug to me.  With tmpfs, the chmod succeeds and
> is reflected in the file system.
>
> This bug also affects regular files, not just symbolic links.
>
> It causes the io/tst-lchmod glibc test to fail (after it has been
> fixed, the in-tree version has another bug).

Sorry, I forgot to mention that I see this with Debian's
4.19.67-2+deb10u2 kernel and Fedora's 5.4.17-200.fc31.x86_64 kernel.
Reply | Threaded
Open this post in threaded view
|

Re: XFS reports lchmod failure, but changes file system contents

Darrick J. Wong
In reply to this post by Florian Weimer
On Wed, Feb 12, 2020 at 12:48:49PM +0100, Florian Weimer wrote:

> In principle, Linux supports lchmod via O_PATH descriptors and chmod
> on /proc/self/fd.  (lchmod is the non-symbolic-link-following variant
> of chmod.)
>
> This helper program can be used to do this:
>
> #define _GNU_SOURCE
> #include <err.h>
> #include <fcntl.h>
> #include <stdio.h>
> #include <stdlib.h>
> #include <sys/stat.h>
> #include <unistd.h>
>
> int
> main (int argc, char **argv)
> {
>   if (argc != 3)
>     {
>       fprintf (stderr, "usage: %s MODE FILE\n", argv[0]);
>       return 2;
>     }
>
>   unsigned int mode;
>   if (sscanf (argv[1], "%o", &mode) != 1
>       || mode != (mode_t) mode)
>     errx (1, "invalid mode: %s", argv[1]);
>
>   int fd = open (argv[2], O_PATH | O_NOFOLLOW);
>   if (fd < 0)
>     err (1, "open");
>
>   char *fd_path;
>   if (asprintf (&fd_path, "/proc/self/fd/%d", fd) < 0)
>     err (1, "asprintf");
>
>   if (chmod (fd_path, mode) != 0)
>     err (1, "chmod");
>
>   free (fd_path);
>   if (close (fd) != 0)
>     err (1, "close");
>
>   return 0;
> }
>
> When changing the permissions of on XFS in this way, the chmod
> operation fails:
>
> $ ln -s does-not-exist /var/tmp/symlink
> $ ls -l /var/tmp/symlink
> lrwxrwxrwx. 1 fweimer fweimer 14 Feb 12 12:41 /var/tmp/symlink -> does-not-exist
> $ strace ./lchmod 0 /var/tmp/symlink
> […]
> openat(AT_FDCWD, "/var/tmp/symlink", O_RDONLY|O_NOFOLLOW|O_PATH) = 3
> […]
> chmod("/proc/self/fd/3", 000)           = -1 EOPNOTSUPP (Operation not supported)
> write(2, "lchmod: ", 8lchmod: )                 = 8
> write(2, "chmod", 5chmod)                    = 5
> write(2, ": Operation not supported\n", 26: Operation not supported
> ) = 26
> exit_group(1)                           = ?
>
> But the file system contents has changed nevertheless:
>
> $ ls -l /var/tmp/symlink
> l---------. 1 fweimer fweimer 14 Feb 12 12:41 /var/tmp/symlink -> does-not-exist
> $ echo 3 | sudo tee /proc/sys/vm/drop_caches
> $ ls -l /var/tmp/symlink
> l---------. 1 fweimer fweimer 14 Feb 12 12:41 /var/tmp/symlink -> does-not-exist
>
> This looks like an XFS bug to me.  With tmpfs, the chmod succeeds and
> is reflected in the file system.
>
> This bug also affects regular files, not just symbolic links.
>
> It causes the io/tst-lchmod glibc test to fail (after it has been
> fixed, the in-tree version has another bug).

xfs_setattr_nonsize calls posix_acl_chmod which returns EOPNOTSUPP
because the xfs symlink inode_operations do not include a ->set_acl
pointer.

I /think/ that posix_acl_chmod code exists to enforce that the file mode
reflects any acl that might be set on the inode, but in this case the
inode is a symbolic link.

I don't remember off the top of my head if ACLs are supposed to apply to
symlinks, but what do you think about adding get_acl/set_acl pointers to
xfs_symlink_inode_operations and xfs_inline_symlink_inode_operations ?

--D
Reply | Threaded
Open this post in threaded view
|

Re: XFS reports lchmod failure, but changes file system contents

Christoph Hellwig
On Wed, Feb 12, 2020 at 08:16:04AM -0800, Darrick J. Wong wrote:

> xfs_setattr_nonsize calls posix_acl_chmod which returns EOPNOTSUPP
> because the xfs symlink inode_operations do not include a ->set_acl
> pointer.
>
> I /think/ that posix_acl_chmod code exists to enforce that the file mode
> reflects any acl that might be set on the inode, but in this case the
> inode is a symbolic link.
>
> I don't remember off the top of my head if ACLs are supposed to apply to
> symlinks, but what do you think about adding get_acl/set_acl pointers to
> xfs_symlink_inode_operations and xfs_inline_symlink_inode_operations ?

Symlinks don't have permissions or ACLs, so adding them makes no
sense.

xfs doesn't seem all that different from the other file systems,
so I suspect you'll also see it with other on-disk file systems.
We probably need a check high up in the chmod and co code to reject
the operation early for O_PATH file descriptors pointing to symlinks.
Reply | Threaded
Open this post in threaded view
|

Re: XFS reports lchmod failure, but changes file system contents

Darrick J. Wong
On Wed, Feb 12, 2020 at 10:11:28AM -0800, Christoph Hellwig wrote:

> On Wed, Feb 12, 2020 at 08:16:04AM -0800, Darrick J. Wong wrote:
> > xfs_setattr_nonsize calls posix_acl_chmod which returns EOPNOTSUPP
> > because the xfs symlink inode_operations do not include a ->set_acl
> > pointer.
> >
> > I /think/ that posix_acl_chmod code exists to enforce that the file mode
> > reflects any acl that might be set on the inode, but in this case the
> > inode is a symbolic link.
> >
> > I don't remember off the top of my head if ACLs are supposed to apply to
> > symlinks, but what do you think about adding get_acl/set_acl pointers to
> > xfs_symlink_inode_operations and xfs_inline_symlink_inode_operations ?
>
> Symlinks don't have permissions or ACLs, so adding them makes no
> sense.

Ahh, I thought so!

> xfs doesn't seem all that different from the other file systems,
> so I suspect you'll also see it with other on-disk file systems.

Yeah, I noticed that btrfs seems to exhibit the same behavior.

I also noticed that ext4 actually /does/ implement [gs]et_acl for
symlinks.

> We probably need a check high up in the chmod and co code to reject
> the operation early for O_PATH file descriptors pointing to symlinks.

<nod>

--D
Reply | Threaded
Open this post in threaded view
|

Re: XFS reports lchmod failure, but changes file system contents

Florian Weimer
In reply to this post by Christoph Hellwig
* Christoph Hellwig:

> xfs doesn't seem all that different from the other file systems,
> so I suspect you'll also see it with other on-disk file systems.
> We probably need a check high up in the chmod and co code to reject
> the operation early for O_PATH file descriptors pointing to symlinks.

We will change the glibc emulation to avoid trying to lchmod symbolic
links in this way.  This will avoid triggering the kernel bug.

(We'd really like to get a proper fchmodat system call with a flags
argument, though, for AT_EMPTY_PATH and AT_SYMLINK_NOFOLLOW.)

And part of my testing was wrong, this is a symbolic-link-only issue.
Reply | Threaded
Open this post in threaded view
|

Re: XFS reports lchmod failure, but changes file system contents

Christoph Hellwig
On Wed, Feb 12, 2020 at 07:50:01PM +0100, Florian Weimer wrote:
> * Christoph Hellwig:
>
> > xfs doesn't seem all that different from the other file systems,
> > so I suspect you'll also see it with other on-disk file systems.
> > We probably need a check high up in the chmod and co code to reject
> > the operation early for O_PATH file descriptors pointing to symlinks.
>
> We will change the glibc emulation to avoid trying to lchmod symbolic
> links in this way.  This will avoid triggering the kernel bug.

We'll still need to fix it.

> (We'd really like to get a proper fchmodat system call with a flags
> argument, though, for AT_EMPTY_PATH and AT_SYMLINK_NOFOLLOW.)

Send a patch (o find a minion to do so).
Reply | Threaded
Open this post in threaded view
|

Re: XFS reports lchmod failure, but changes file system contents

Florian Weimer
In reply to this post by Darrick J. Wong
* Darrick J. Wong:

> On Wed, Feb 12, 2020 at 10:11:28AM -0800, Christoph Hellwig wrote:
>> On Wed, Feb 12, 2020 at 08:16:04AM -0800, Darrick J. Wong wrote:
>> > xfs_setattr_nonsize calls posix_acl_chmod which returns EOPNOTSUPP
>> > because the xfs symlink inode_operations do not include a ->set_acl
>> > pointer.
>> >
>> > I /think/ that posix_acl_chmod code exists to enforce that the file mode
>> > reflects any acl that might be set on the inode, but in this case the
>> > inode is a symbolic link.
>> >
>> > I don't remember off the top of my head if ACLs are supposed to apply to
>> > symlinks, but what do you think about adding get_acl/set_acl pointers to
>> > xfs_symlink_inode_operations and xfs_inline_symlink_inode_operations ?
>>
>> Symlinks don't have permissions or ACLs, so adding them makes no
>> sense.
>
> Ahh, I thought so!
>
>> xfs doesn't seem all that different from the other file systems,
>> so I suspect you'll also see it with other on-disk file systems.
>
> Yeah, I noticed that btrfs seems to exhibit the same behavior.
>
> I also noticed that ext4 actually /does/ implement [gs]et_acl for
> symlinks.

Rich Felker noticed this, which may be related:

| Further, I've found some inconsistent behavior with ext4: chmod on the
| magic symlink fails with EOPNOTSUPP as in Florian's test, but fchmod
| on the O_PATH fd succeeds and changes the symlink mode. This is with
| 5.4. Cany anyone else confirm this? Is it a problem?

It looks broken to me because fchmod (as an inode-changing operation)
is not supposed to work on O_PATH descriptors.
Reply | Threaded
Open this post in threaded view
|

Re: XFS reports lchmod failure, but changes file system contents

Al Viro-4
On Wed, Feb 12, 2020 at 08:15:08PM +0100, Florian Weimer wrote:

> | Further, I've found some inconsistent behavior with ext4: chmod on the
> | magic symlink fails with EOPNOTSUPP as in Florian's test, but fchmod
> | on the O_PATH fd succeeds and changes the symlink mode. This is with
> | 5.4. Cany anyone else confirm this? Is it a problem?
>
> It looks broken to me because fchmod (as an inode-changing operation)
> is not supposed to work on O_PATH descriptors.

Why?  O_PATH does have an associated inode just fine; where does
that "not supposed to" come from?
Reply | Threaded
Open this post in threaded view
|

Re: XFS reports lchmod failure, but changes file system contents

Rich Felker-2
On Wed, Feb 12, 2020 at 07:51:18PM +0000, Al Viro wrote:

> On Wed, Feb 12, 2020 at 08:15:08PM +0100, Florian Weimer wrote:
>
> > | Further, I've found some inconsistent behavior with ext4: chmod on the
> > | magic symlink fails with EOPNOTSUPP as in Florian's test, but fchmod
> > | on the O_PATH fd succeeds and changes the symlink mode. This is with
> > | 5.4. Cany anyone else confirm this? Is it a problem?
> >
> > It looks broken to me because fchmod (as an inode-changing operation)
> > is not supposed to work on O_PATH descriptors.
>
> Why?  O_PATH does have an associated inode just fine; where does
> that "not supposed to" come from?

Indeed, my expectation was that both fchmod and chmod via the magic
symlink succeed, given a new enough kernel for operations on O_PATH
fds to be supported. I'd like to switch to using fstat+fchmod to do
this, and only fallback to /proc on older kernels where O_PATH fds
cause EBADF. But I'm somewhat concerned by the inconsistency between
behavior of the two approaches.

Rich
Reply | Threaded
Open this post in threaded view
|

Re: XFS reports lchmod failure, but changes file system contents

Florian Weimer
In reply to this post by Al Viro-4
* Al Viro:

> On Wed, Feb 12, 2020 at 08:15:08PM +0100, Florian Weimer wrote:
>
>> | Further, I've found some inconsistent behavior with ext4: chmod on the
>> | magic symlink fails with EOPNOTSUPP as in Florian's test, but fchmod
>> | on the O_PATH fd succeeds and changes the symlink mode. This is with
>> | 5.4. Cany anyone else confirm this? Is it a problem?
>>
>> It looks broken to me because fchmod (as an inode-changing operation)
>> is not supposed to work on O_PATH descriptors.
>
> Why?  O_PATH does have an associated inode just fine; where does
> that "not supposed to" come from?

It fails on most file systems right now.  I thought that was expected.
Other system calls (fsetxattr IIRC) do not work on O_PATH descriptors,
either.  I assumed that an O_PATH descriptor was not intending to
confer that capability.  Even openat fails.

Although fchmod does succeed on read-only descriptors, which is a bit
strange.
Reply | Threaded
Open this post in threaded view
|

Re: XFS reports lchmod failure, but changes file system contents

Andreas Schwab-2
On Feb 12 2020, Florian Weimer wrote:

> * Al Viro:
>
>> On Wed, Feb 12, 2020 at 08:15:08PM +0100, Florian Weimer wrote:
>>
>>> | Further, I've found some inconsistent behavior with ext4: chmod on the
>>> | magic symlink fails with EOPNOTSUPP as in Florian's test, but fchmod
>>> | on the O_PATH fd succeeds and changes the symlink mode. This is with
>>> | 5.4. Cany anyone else confirm this? Is it a problem?
>>>
>>> It looks broken to me because fchmod (as an inode-changing operation)
>>> is not supposed to work on O_PATH descriptors.
>>
>> Why?  O_PATH does have an associated inode just fine; where does
>> that "not supposed to" come from?
>
> It fails on most file systems right now.  I thought that was expected.
> Other system calls (fsetxattr IIRC) do not work on O_PATH descriptors,
> either.  I assumed that an O_PATH descriptor was not intending to
> confer that capability.  Even openat fails.

According to open(2), this is expected:

       O_PATH (since Linux 2.6.39)
              Obtain a file descriptor that can be used for two  purposes:  to
              indicate a location in the filesystem tree and to perform opera-
              tions that act purely at the file descriptor  level.   The  file
              itself  is not opened, and other file operations (e.g., read(2),
              write(2), fchmod(2), fchown(2), fgetxattr(2), ioctl(2), mmap(2))
              fail with the error EBADF.

Andreas.

--
Andreas Schwab, [hidden email]
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."
Reply | Threaded
Open this post in threaded view
|

Re: XFS reports lchmod failure, but changes file system contents

Rich Felker-2
In reply to this post by Florian Weimer
On Wed, Feb 12, 2020 at 09:01:22PM +0100, Florian Weimer wrote:

> * Al Viro:
>
> > On Wed, Feb 12, 2020 at 08:15:08PM +0100, Florian Weimer wrote:
> >
> >> | Further, I've found some inconsistent behavior with ext4: chmod on the
> >> | magic symlink fails with EOPNOTSUPP as in Florian's test, but fchmod
> >> | on the O_PATH fd succeeds and changes the symlink mode. This is with
> >> | 5.4. Cany anyone else confirm this? Is it a problem?
> >>
> >> It looks broken to me because fchmod (as an inode-changing operation)
> >> is not supposed to work on O_PATH descriptors.
> >
> > Why?  O_PATH does have an associated inode just fine; where does
> > that "not supposed to" come from?
>
> It fails on most file systems right now.  I thought that was expected.
> Other system calls (fsetxattr IIRC) do not work on O_PATH descriptors,
> either.  I assumed that an O_PATH descriptor was not intending to
> confer that capability.  Even openat fails.
>
> Although fchmod does succeed on read-only descriptors, which is a bit
> strange.

That's intentional and as-specified (and matches all historical
practice):

  "The fchmod() function shall be equivalent to chmod() except that
  the file whose permissions are changed is specified by the file
  descriptor fildes."

And chmod is specified as:

  "The application shall ensure that the effective user ID of the
  process matches the owner of the file or the process has appropriate
  privileges in order to do this."

No alternate behavior regarding permissions is specified; ability to
operate on the file does not depend on the open file mode.

Rich
Reply | Threaded
Open this post in threaded view
|

Re: XFS reports lchmod failure, but changes file system contents

Rich Felker-2
In reply to this post by Andreas Schwab-2
On Wed, Feb 12, 2020 at 09:17:41PM +0100, Andreas Schwab wrote:

> On Feb 12 2020, Florian Weimer wrote:
>
> > * Al Viro:
> >
> >> On Wed, Feb 12, 2020 at 08:15:08PM +0100, Florian Weimer wrote:
> >>
> >>> | Further, I've found some inconsistent behavior with ext4: chmod on the
> >>> | magic symlink fails with EOPNOTSUPP as in Florian's test, but fchmod
> >>> | on the O_PATH fd succeeds and changes the symlink mode. This is with
> >>> | 5.4. Cany anyone else confirm this? Is it a problem?
> >>>
> >>> It looks broken to me because fchmod (as an inode-changing operation)
> >>> is not supposed to work on O_PATH descriptors.
> >>
> >> Why?  O_PATH does have an associated inode just fine; where does
> >> that "not supposed to" come from?
> >
> > It fails on most file systems right now.  I thought that was expected.
> > Other system calls (fsetxattr IIRC) do not work on O_PATH descriptors,
> > either.  I assumed that an O_PATH descriptor was not intending to
> > confer that capability.  Even openat fails.
>
> According to open(2), this is expected:
>
>        O_PATH (since Linux 2.6.39)
>               Obtain a file descriptor that can be used for two  purposes:  to
>               indicate a location in the filesystem tree and to perform opera-
>               tions that act purely at the file descriptor  level.   The  file
>               itself  is not opened, and other file operations (e.g., read(2),
>               write(2), fchmod(2), fchown(2), fgetxattr(2), ioctl(2), mmap(2))
>               fail with the error EBADF.

That text is outdated and should be corrected. Fixing fchmod fchown,
fstat, etc. to operate on O_PATH file descriptors was a very
intentional change in the kernel.

Rich
Reply | Threaded
Open this post in threaded view
|

Re: XFS reports lchmod failure, but changes file system contents

Florian Weimer
* Rich Felker:

> On Wed, Feb 12, 2020 at 09:17:41PM +0100, Andreas Schwab wrote:
>> On Feb 12 2020, Florian Weimer wrote:
>>
>> > * Al Viro:
>> >
>> >> On Wed, Feb 12, 2020 at 08:15:08PM +0100, Florian Weimer wrote:
>> >>
>> >>> | Further, I've found some inconsistent behavior with ext4: chmod on the
>> >>> | magic symlink fails with EOPNOTSUPP as in Florian's test, but fchmod
>> >>> | on the O_PATH fd succeeds and changes the symlink mode. This is with
>> >>> | 5.4. Cany anyone else confirm this? Is it a problem?
>> >>>
>> >>> It looks broken to me because fchmod (as an inode-changing operation)
>> >>> is not supposed to work on O_PATH descriptors.
>> >>
>> >> Why?  O_PATH does have an associated inode just fine; where does
>> >> that "not supposed to" come from?
>> >
>> > It fails on most file systems right now.  I thought that was expected.
>> > Other system calls (fsetxattr IIRC) do not work on O_PATH descriptors,
>> > either.  I assumed that an O_PATH descriptor was not intending to
>> > confer that capability.  Even openat fails.
>>
>> According to open(2), this is expected:
>>
>>        O_PATH (since Linux 2.6.39)
>>               Obtain a file descriptor that can be used for two  purposes:  to
>>               indicate a location in the filesystem tree and to perform opera-
>>               tions that act purely at the file descriptor  level.   The  file
>>               itself  is not opened, and other file operations (e.g., read(2),
>>               write(2), fchmod(2), fchown(2), fgetxattr(2), ioctl(2), mmap(2))
>>               fail with the error EBADF.
>
> That text is outdated and should be corrected. Fixing fchmod fchown,
> fstat, etc. to operate on O_PATH file descriptors was a very
> intentional change in the kernel.

I suppose we could do the S_ISLNK check, try fchmod, and if that
fails, go via /proc.  Is this the direction you want to go in?
Reply | Threaded
Open this post in threaded view
|

Re: XFS reports lchmod failure, but changes file system contents

Al Viro-4
In reply to this post by Rich Felker-2
On Wed, Feb 12, 2020 at 03:19:51PM -0500, Rich Felker wrote:

> On Wed, Feb 12, 2020 at 09:17:41PM +0100, Andreas Schwab wrote:
> > On Feb 12 2020, Florian Weimer wrote:
> >
> > > * Al Viro:
> > >
> > >> On Wed, Feb 12, 2020 at 08:15:08PM +0100, Florian Weimer wrote:
> > >>
> > >>> | Further, I've found some inconsistent behavior with ext4: chmod on the
> > >>> | magic symlink fails with EOPNOTSUPP as in Florian's test, but fchmod
> > >>> | on the O_PATH fd succeeds and changes the symlink mode. This is with
> > >>> | 5.4. Cany anyone else confirm this? Is it a problem?
> > >>>
> > >>> It looks broken to me because fchmod (as an inode-changing operation)
> > >>> is not supposed to work on O_PATH descriptors.
> > >>
> > >> Why?  O_PATH does have an associated inode just fine; where does
> > >> that "not supposed to" come from?
> > >
> > > It fails on most file systems right now.  I thought that was expected.
> > > Other system calls (fsetxattr IIRC) do not work on O_PATH descriptors,
> > > either.  I assumed that an O_PATH descriptor was not intending to
> > > confer that capability.  Even openat fails.
> >
> > According to open(2), this is expected:
> >
> >        O_PATH (since Linux 2.6.39)
> >               Obtain a file descriptor that can be used for two  purposes:  to
> >               indicate a location in the filesystem tree and to perform opera-
> >               tions that act purely at the file descriptor  level.   The  file
> >               itself  is not opened, and other file operations (e.g., read(2),
> >               write(2), fchmod(2), fchown(2), fgetxattr(2), ioctl(2), mmap(2))
> >               fail with the error EBADF.
>
> That text is outdated and should be corrected. Fixing fchmod fchown,
> fstat, etc. to operate on O_PATH file descriptors was a very
> intentional change in the kernel.

Wait.  First of all, in the testcase it's chmod(2) applied to /proc/*/fd/*; that's
no different for O_PATH descriptors.  Location in the tree *is* associated with
O_PATH fd; that's the only thing they exist for.

fchmod(2) will certainly fail for those, as it always had:
int ksys_fchmod(unsigned int fd, umode_t mode)
{
        struct fd f = fdget(fd);
        int err = -EBADF;

        if (f.file) {
                audit_file(f.file);
                err = chmod_common(&f.file->f_path, mode);
                fdput(f);  
        }
        return err;
}

SYSCALL_DEFINE2(fchmod, unsigned int, fd, umode_t, mode)
{
        return ksys_fchmod(fd, mode);
}

... and that fdget() will give you -EBADF.  If you've managed to get
fchmod(2) the syscall to give you anything other than that, I want
to see details.
Reply | Threaded
Open this post in threaded view
|

Re: XFS reports lchmod failure, but changes file system contents

Rich Felker-2
On Wed, Feb 12, 2020 at 08:27:24PM +0000, Al Viro wrote:

> On Wed, Feb 12, 2020 at 03:19:51PM -0500, Rich Felker wrote:
> > On Wed, Feb 12, 2020 at 09:17:41PM +0100, Andreas Schwab wrote:
> > > On Feb 12 2020, Florian Weimer wrote:
> > >
> > > > * Al Viro:
> > > >
> > > >> On Wed, Feb 12, 2020 at 08:15:08PM +0100, Florian Weimer wrote:
> > > >>
> > > >>> | Further, I've found some inconsistent behavior with ext4: chmod on the
> > > >>> | magic symlink fails with EOPNOTSUPP as in Florian's test, but fchmod
> > > >>> | on the O_PATH fd succeeds and changes the symlink mode. This is with
> > > >>> | 5.4. Cany anyone else confirm this? Is it a problem?
> > > >>>
> > > >>> It looks broken to me because fchmod (as an inode-changing operation)
> > > >>> is not supposed to work on O_PATH descriptors.
> > > >>
> > > >> Why?  O_PATH does have an associated inode just fine; where does
> > > >> that "not supposed to" come from?
> > > >
> > > > It fails on most file systems right now.  I thought that was expected.
> > > > Other system calls (fsetxattr IIRC) do not work on O_PATH descriptors,
> > > > either.  I assumed that an O_PATH descriptor was not intending to
> > > > confer that capability.  Even openat fails.
> > >
> > > According to open(2), this is expected:
> > >
> > >        O_PATH (since Linux 2.6.39)
> > >               Obtain a file descriptor that can be used for two  purposes:  to
> > >               indicate a location in the filesystem tree and to perform opera-
> > >               tions that act purely at the file descriptor  level.   The  file
> > >               itself  is not opened, and other file operations (e.g., read(2),
> > >               write(2), fchmod(2), fchown(2), fgetxattr(2), ioctl(2), mmap(2))
> > >               fail with the error EBADF.
> >
> > That text is outdated and should be corrected. Fixing fchmod fchown,
> > fstat, etc. to operate on O_PATH file descriptors was a very
> > intentional change in the kernel.
>
> Wait.  First of all, in the testcase it's chmod(2) applied to /proc/*/fd/*; that's
> no different for O_PATH descriptors.  Location in the tree *is* associated with
> O_PATH fd; that's the only thing they exist for.
>
> fchmod(2) will certainly fail for those, as it always had:
> int ksys_fchmod(unsigned int fd, umode_t mode)
> {
>         struct fd f = fdget(fd);
>         int err = -EBADF;
>
>         if (f.file) {
>                 audit_file(f.file);
>                 err = chmod_common(&f.file->f_path, mode);
>                 fdput(f);  
>         }
>         return err;
> }
>
> SYSCALL_DEFINE2(fchmod, unsigned int, fd, umode_t, mode)
> {
>         return ksys_fchmod(fd, mode);
> }
>
> .... and that fdget() will give you -EBADF.  If you've managed to get
> fchmod(2) the syscall to give you anything other than that, I want
> to see details.

Sorry, it's my fault -- that's not the raw fchmod syscall but the
fchmod function, which falls back to using /proc on failure with EBADF
because this is necessary to support O_SEARCH/O_EXEC functionality
implemented through O_PATH file descriptors.

So the same thing is happening regardless of whether /proc is used
because /proc is the backend either way.

However, what I have found is that the same bug present on XFS is also
present on ext4. After:

chmod("/proc/self/fd/3", 0777)          = -1 EOPNOTSUPP (Not supported)

$ ls -l symlink
lrwxrwxrwx    1 dalias   users            3 Feb 12 13:48 symlink -> DNE

and after:

chmod("/proc/self/fd/3", 000)           = -1 EOPNOTSUPP (Not supported)

l---------    1 dalias   users            3 Feb 12 13:48 symlink -> DNE

So perhaps this is happening at a higher level in the kernel.

Apologies for the noise from confusing function call with syscall.

Rich
Reply | Threaded
Open this post in threaded view
|

Re: XFS reports lchmod failure, but changes file system contents

Paul Eggert
In reply to this post by Florian Weimer
On 2/12/20 12:01 PM, Florian Weimer wrote:
> I assumed that an O_PATH descriptor was not intending to
> confer that capability.

I originally assumed the other way, as I don't see any security reason
why fchmod should not work on O_PATH-opened descriptors. I see that the
Linux man page says open+O_PATH doesn't work with fchmod, but that's
just a bug in the spec.

In Android, the bionic C library has worked around this problem since
2015 by wrapping fchmod so that it works even when the fd was
O_PATH-opened. Bionic then uses O_PATH + fchmod to work around the
fchmodat+AT_SYMLINK_NOFOLLOW problem[1]. glibc (and Gnulib, etc.) could
do the same. It's the most sane way out of this mess.

[1]
https://android.googlesource.com/platform/bionic/+/3cbc6c627fe57c9a9783c52d148078f8d52f7b96
Reply | Threaded
Open this post in threaded view
|

Re: XFS reports lchmod failure, but changes file system contents

Rich Felker-2
In reply to this post by Florian Weimer
On Wed, Feb 12, 2020 at 09:26:11PM +0100, Florian Weimer wrote:

> * Rich Felker:
>
> > On Wed, Feb 12, 2020 at 09:17:41PM +0100, Andreas Schwab wrote:
> >> On Feb 12 2020, Florian Weimer wrote:
> >>
> >> > * Al Viro:
> >> >
> >> >> On Wed, Feb 12, 2020 at 08:15:08PM +0100, Florian Weimer wrote:
> >> >>
> >> >>> | Further, I've found some inconsistent behavior with ext4: chmod on the
> >> >>> | magic symlink fails with EOPNOTSUPP as in Florian's test, but fchmod
> >> >>> | on the O_PATH fd succeeds and changes the symlink mode. This is with
> >> >>> | 5.4. Cany anyone else confirm this? Is it a problem?
> >> >>>
> >> >>> It looks broken to me because fchmod (as an inode-changing operation)
> >> >>> is not supposed to work on O_PATH descriptors.
> >> >>
> >> >> Why?  O_PATH does have an associated inode just fine; where does
> >> >> that "not supposed to" come from?
> >> >
> >> > It fails on most file systems right now.  I thought that was expected.
> >> > Other system calls (fsetxattr IIRC) do not work on O_PATH descriptors,
> >> > either.  I assumed that an O_PATH descriptor was not intending to
> >> > confer that capability.  Even openat fails.
> >>
> >> According to open(2), this is expected:
> >>
> >>        O_PATH (since Linux 2.6.39)
> >>               Obtain a file descriptor that can be used for two  purposes:  to
> >>               indicate a location in the filesystem tree and to perform opera-
> >>               tions that act purely at the file descriptor  level.   The  file
> >>               itself  is not opened, and other file operations (e.g., read(2),
> >>               write(2), fchmod(2), fchown(2), fgetxattr(2), ioctl(2), mmap(2))
> >>               fail with the error EBADF.
> >
> > That text is outdated and should be corrected. Fixing fchmod fchown,
> > fstat, etc. to operate on O_PATH file descriptors was a very
> > intentional change in the kernel.
>
> I suppose we could do the S_ISLNK check, try fchmod, and if that
> fails, go via /proc.  Is this the direction you want to go in?

It was, but Al Viro just pointed out to me that I was wrong. I think
we could use fstat (which AIUI now works) to do the S_ISLNK check, so
that it doesn't depend on /proc, but I don't see a way to do the chmod
operation without /proc at this time.

Rich
Reply | Threaded
Open this post in threaded view
|

Re: XFS reports lchmod failure, but changes file system contents

Aleksa Sarai
In reply to this post by Florian Weimer
On 2020-02-12, Florian Weimer <[hidden email]> wrote:

> * Al Viro:
>
> > On Wed, Feb 12, 2020 at 08:15:08PM +0100, Florian Weimer wrote:
> >
> >> | Further, I've found some inconsistent behavior with ext4: chmod on the
> >> | magic symlink fails with EOPNOTSUPP as in Florian's test, but fchmod
> >> | on the O_PATH fd succeeds and changes the symlink mode. This is with
> >> | 5.4. Cany anyone else confirm this? Is it a problem?
> >>
> >> It looks broken to me because fchmod (as an inode-changing operation)
> >> is not supposed to work on O_PATH descriptors.
> >
> > Why?  O_PATH does have an associated inode just fine; where does
> > that "not supposed to" come from?
>
> It fails on most file systems right now.  I thought that was expected.
> Other system calls (fsetxattr IIRC) do not work on O_PATH descriptors,
> either.  I assumed that an O_PATH descriptor was not intending to
> confer that capability.  Even openat fails.
openat(2) failing on an O_PATH for a symlink is separate to fchmod(2)
failing:

 * fchmod(2) fails with EBADF because O_PATH file descriptors have
   f->f_ops set to empty_fops -- this is why ioctl(2)s also fail on
   O_PATH file descriptors. This is *intentional* behaviour.

   My understanding of the original idea of O_PATH file descriptors is
   that they are meant to have restricted capabilities -- it's
   effectively a "half-open" file handle. The fact that some folks (like
   myself) figured out you can do all sorts of crazy stuff with them is
   mostly an accident.

 * openat(n, ...) fails with ENOTDIR because openat(2) requires the
   argument to be a directory, and O_PATH-of-a-symlink-to-a-directory
   doesn't count (path_init doesn't do resolution of the dirfd
   argument -- nor should it IMHO).

 * open(/proc/self/fd/$n) failing with ELOOP might actually be a bug
   (the error is coming from may_open as though the lookup was done with
   O_NOFOLLOW) -- the nd_jump_link() jump takes the namei lookup to a
   the symlink but it looks like the normal link_path_walk et al
   handling doesn't actually try to continue resolving it. I'll look
   into this a bit more.

--
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
<https://www.cyphar.com/>

signature.asc (235 bytes) Download Attachment
12