[patch] Avoid the use of mmap as buffer for stdio streams

classic Classic list List threaded Threaded
5 messages Options
Reply | Threaded
Open this post in threaded view
|

[patch] Avoid the use of mmap as buffer for stdio streams

Arjan van de Ven
Hi,

currently glibc uses mmap() to allocate the (anonymous) buffers that
back a stdio stream. This is non-optimal in several ways:
* mmap and munmap are very expensive operations in a threaded/multi
   cpu envinvironment: the kernel needs to take "big" locks and also
   needs to do a cross-cpu TLB invalidate on every munmap.
* even though the standard buffer is 1Kb, mmap() causes this to take
   4Kb of memory and TLB space; this is wasting 300% !
* there is no recycling of the memory going on on frequent
   fopen/fclose cycles, each cycle leads to the expensive system calls
   without any sharing; sharing is good for avoiding work,
   tlb pressure, cpu cache utilization etc.
* unlike malloc()/free(), mmap() and munmap() have very little sanity
   checking and bugs in the implementation or use get hidden most of
   the time but can lead to sporadic and hard to debug corruption or
   other weird effects [*]

This is caused by ALLOC_BUF and FREE_BUF to be hardcoded to
mmap/munmap in libio/libioP.h in case the operating system has mmap;
in the same file an malloc/free based implementation is available as
well for the non-mmap case. I suspect this came from a time where mmap
was thought to be much faster than malloc.... which isn't the case on
nowadays glibc/Linux.

The attached patch just removes the mmap based implementation, and
causes glibc to always use the malloc/free based implementation.

This has the advantage of solving all the downsides of mmap/munmap,
but has 2 potential disadvantages of it's own:
1) it's now no longer a good idea to use fopen and friends inside the
malloc() implementation (that doesn't seem to happen anyway)

[*] 2) stricter error checking in free() compared to malloc will
uncover bugs. For example setvbuf() is known to cause a "junk" pointer
to be passed to FREE_BUF(), with this patch this will cause a program
abort, rather than the existing "call munmap() on a junk pointer and
watch the kernel give -EINVAL" behavior.
( https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=217064 )

I feel that these potential disadvantages outweigh the advantages and
so I'd like to ask for the patch to be applied (possibly after the
setvbuf bug is fixed first)

Greetings,
    Arjan van de Ven

Index: glibc-20061120T1000/libio/libioP.h
===================================================================
--- glibc-20061120T1000.orig/libio/libioP.h
+++ glibc-20061120T1000/libio/libioP.h
@@ -825,26 +825,7 @@ extern void _IO_un_link_internal (struct
 # define ROUND_TO_PAGE(_S) \
        (((_S) + EXEC_PAGESIZE - 1) & ~(EXEC_PAGESIZE - 1))
 
-# define FREE_BUF(_B, _S) \
-       munmap ((_B), ROUND_TO_PAGE (_S))
-# define ALLOC_BUF(_B, _S, _R) \
-       do {      \
-  (_B) = (char *) mmap (0, ROUND_TO_PAGE (_S),      \
- PROT_READ | PROT_WRITE,      \
- MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);      \
-  if ((_B) == (char *) MAP_FAILED)      \
-    return (_R);      \
-       } while (0)
-# define ALLOC_WBUF(_B, _S, _R) \
-       do {      \
-  (_B) = (wchar_t *) mmap (0, ROUND_TO_PAGE (_S),      \
-   PROT_READ | PROT_WRITE,      \
-   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);      \
-  if ((_B) == (wchar_t *) MAP_FAILED)      \
-    return (_R);      \
-       } while (0)
-
-#else /* _G_HAVE_MMAP */
+#endif /* _G_HAVE_MMAP */
 
 # define FREE_BUF(_B, _S) \
        free(_B)
@@ -861,7 +842,6 @@ extern void _IO_un_link_internal (struct
     return (_R);      \
        } while (0)
 
-#endif /* _G_HAVE_MMAP */
 
 #ifndef OS_FSTAT
 # define OS_FSTAT fstat
Reply | Threaded
Open this post in threaded view
|

Re: [patch] Avoid the use of mmap as buffer for stdio streams

Wolfram Gloger
Hi,

> 1) it's now no longer a good idea to use fopen and friends inside the
> malloc() implementation (that doesn't seem to happen anyway)

That cannot happen, not even in a custom malloc implementation,
because fopen already calls malloc for its FILE object..

However, I just re-read the discussion from December 1996 (wow,
we're approaching 10 years of glibc, where's the party BTW?).

The reason to switching to mmap() for file buffers was the fact that a
zeroed buffer is required.  Before the switch to mmap, calloc() was
used and _this_ was not present in all custom malloc implementations,
notably bash's at the time.

Although I think "custom mallocs not providing calloc()" is by now
probably not a very strong argument, the fact that zeroed memory is
required somewhat weakens the disadvantages of using mmap.

Regards,
Wolfram.
Reply | Threaded
Open this post in threaded view
|

Re: [patch] Avoid the use of mmap as buffer for stdio streams

Arjan van de Ven
Wolfram Gloger wrote:
> Although I think "custom mallocs not providing calloc()" is by now
> probably not a very strong argument, the fact that zeroed memory is
> required somewhat weakens the disadvantages of using mmap.
>


actually... mmap means the kernel needs to zero 4Kb of memory (and it
does)... even though the buffer is only 1Kb by default... so that's
3Kb extra zeroed for no reason ...

(this is part of why mmap is so expensive)
Reply | Threaded
Open this post in threaded view
|

Re: [patch] Avoid the use of mmap as buffer for stdio streams

Ulrich Drepper
In reply to this post by Arjan van de Ven
What code is _really_ affected by this?  We are using mmap for a number
of reasons, one of which is that it reduces fragmentation on the heap.
It works well and so far I don't see any reason to change this.  I've
not seen programs which constantly and with high frequency fclose() files.

--
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖
Reply | Threaded
Open this post in threaded view
|

Re: [patch] Avoid the use of mmap as buffer for stdio streams

Arjan van de Ven
Ulrich Drepper wrote:
> What code is _really_ affected by this?

I noticed it while stracing the new version of irqbalance that I'm
developing. That needs to fopen/read/fclose 3 virtual (proc/sysfs)
files every iteration; it does the munmap 3 times. I know that a whole
lot of the desktop stack (hal, gnome-power-manager etc) do this as
well, adding up to a few dozen such things every second.

Many of the "traditional" unix tools like tar and grep don't use stdio
so those don't count; that's one point for your argument.

A case that I found that does count in my favor is nautilus; it uses
stdio for everything and does a lot of fopen .. read some .. fclose
sequences when going into a new directory
(I suspect for reading the first bits of each file to determine what
icon to use)


> We are using mmap for a number
> of reasons, one of which is that it reduces fragmentation on the heap.

... by trading virtual address space fragmentation some ;)

  > It works well and so far I don't see any reason to change this.  I've
> not seen programs which constantly and with high frequency fclose() files.
I bet if you look at java ...

anyway, one of the things against the mmap implementation is that it
allocates 4K for a buffer it uses only 1Kb of. Would you at least
consider changing the default buffer to be PAGE_SIZE if you keep
insisting on using mmap? To avoid the wasting of this 75% of the
memory that's entirely not used....