[Bug dynamic-link/15310] New: _dl_sort_fini is O(n^3) causing slow exit when many dsos

classic Classic list List threaded Threaded
40 messages Options
12
Reply | Threaded
Open this post in threaded view
|

[Bug dynamic-link/15310] New: _dl_sort_fini is O(n^3) causing slow exit when many dsos

giuliomoro at yahoo dot it
http://sourceware.org/bugzilla/show_bug.cgi?id=15310

             Bug #: 15310
           Summary: _dl_sort_fini is O(n^3) causing slow exit when many
                    dsos
           Product: glibc
           Version: unspecified
            Status: NEW
          Severity: critical
          Priority: P2
         Component: dynamic-link
        AssignedTo: [hidden email]
        ReportedBy: [hidden email]
    Classification: Unclassified


The fix for Bug 13882 ("Cycle detection in dynamic loader is broken ")
fixed premature termination of the inner loop of _dl_sort_fini,
making the function (closer to) correct,
but it also changes this function's runtime from O(n^2) to O(n^3)
where n is the number of items (resident DSOs) to be sorted.
(The same is true for the corresponding init sorts in dl-open.c and dl-deps.c.)

This can be readily seen by looking at Jeff Law's example
in the description of bug 13882 (a linear chain of dependencies
that _dl_sort_fini needs to completely reverse).
In this case, each of O(n) objects gets moved O(n) times;
furthermore the analysis leading to each such move
(as well as the move itself) takes O(n) time.
That's O(n)*O(n)*O(n) = O(n^3).

Another easy way to get O(n^3) behavior
is with cycles: any node that's part of a nontrivial cycle
is guaranteed to keep getting moved repeatedly until its moved-too-many-times
counter expires, which is O(n) times (for O(n) of the items anyway).
So for example, if the dependency graph consists
of mutually dependent pairs of DSOs:
    A<->B  C<->D  E<->F ...
that will result in O(n^3) run time as well.

We observed the O(n^3) behavior in real life, in our application that had 575
DSOs
upon exit-- in RHEL5.3 (glibc 2.5), it took less than 1 second to exit;
upon upgrading to RHEL6.3 (glibc 2.12), the same app took 15 seconds to exit.
Instrumenting _dl_sort_fini (i.e. putting a counter in it
and printing it at the end) revealed that the innermost loop body
was entered more than 1.7 billion times,
roughly confirming the O(n^3) claim in practice.

This is just a topsort, which can be done simply in O(n) time
with no fancy data structures.

--
Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
Reply | Threaded
Open this post in threaded view
|

[Bug dynamic-link/15310] _dl_sort_fini is O(n^3) causing slow exit when many dsos

giuliomoro at yahoo dot it
http://sourceware.org/bugzilla/show_bug.cgi?id=15310

--- Comment #1 from Don Hatch <dhatch at ilm dot com> 2013-03-27 08:12:18 UTC ---
Created attachment 6951
  --> http://sourceware.org/bugzilla/attachment.cgi?id=6951
Makefile that constructs the linear chain test case demonstrating the O(n^3)
behavior.

Attaching a Makefile that quickly constructs the linear dependency chain
test case, demonstrating the O(n^3) behavior.

See the comment at the top of this file for a description
of exactly what it does and how to use it.

--
Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
Reply | Threaded
Open this post in threaded view
|

[Bug dynamic-link/15310] _dl_sort_fini is O(n^3) causing slow exit when many dsos

giuliomoro at yahoo dot it
In reply to this post by giuliomoro at yahoo dot it
http://sourceware.org/bugzilla/show_bug.cgi?id=15310

Don Hatch <dhatch at ilm dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |law at redhat dot com

--
Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
Reply | Threaded
Open this post in threaded view
|

[Bug dynamic-link/15310] _dl_sort_fini is O(n^3) causing slow exit when many dsos

giuliomoro at yahoo dot it
In reply to this post by giuliomoro at yahoo dot it
http://sourceware.org/bugzilla/show_bug.cgi?id=15310

Carlos O'Donell <carlos at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |carlos at redhat dot com

--- Comment #2 from Carlos O'Donell <carlos at redhat dot com> 2013-03-27 12:57:37 UTC ---
Don,

I agree that the sorting could be made *far* faster.

Thanks for submitting this. We were well aware that the minimal fix for bug
13882 would cause some kind of performance regression, but it was a balance
between a minimal fix and low risk of breakage. I reviewed the patch for 13882
and even build a minimal framework for testing that dynamic loader function
outside of the build.

Do you have the time to investigate this and propose a patch (requires
copyright assignment)?

--
Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
Reply | Threaded
Open this post in threaded view
|

[Bug dynamic-link/15310] _dl_sort_fini is O(n^3) causing slow exit when many dsos

giuliomoro at yahoo dot it
In reply to this post by giuliomoro at yahoo dot it
http://sourceware.org/bugzilla/show_bug.cgi?id=15310

Paul Pluzhnikov <ppluzhnikov at google dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ppluzhnikov at google dot
                   |                            |com

--
Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
Reply | Threaded
Open this post in threaded view
|

[Bug dynamic-link/15310] _dl_sort_fini is O(n^3) causing slow exit when many dsos

giuliomoro at yahoo dot it
In reply to this post by giuliomoro at yahoo dot it
http://sourceware.org/bugzilla/show_bug.cgi?id=15310

--- Comment #3 from Don Hatch <dhatch at ilm dot com> 2013-03-27 20:33:23 UTC ---
(In reply to comment #2)

> Don,
>
> I agree that the sorting could be made *far* faster.
>
> Thanks for submitting this. We were well aware that the minimal fix for bug
> 13882 would cause some kind of performance regression, but it was a balance
> between a minimal fix and low risk of breakage. I reviewed the patch for 13882
> and even build a minimal framework for testing that dynamic loader function
> outside of the build.
>
> Do you have the time to investigate this and propose a patch (requires
> copyright assignment)?

I do. I am working on a patch that resolves both this and bug 15311,
and I'll submit it here in a day or two.

I am very interested in what you came up with in the way of a unit
testing scheme for this function... I could certainly use it.
I've found it frustrating that the existing tests run by "make check"
(the ones I saw anyway)
involve just creating/compiling/running a handful of real programs...
to really stress test an implementation of _dl_sort_fini properly,
I'd want to (at least) enumerate all possible graphs
of up to 3 or 4 nodes, and call it on each of them,
which would be millions of examples...
and a few million randomly generated larger examples as well.
It's *really* easy to get this stuff wrong otherwise.

Also I'd like to start by moving the init sorting code into a function.
It looks to me like this code is duplicated in two places (dl-open.c
and dl-deps.c), and (after the fix for bug 15309)
it's identical in both places except that
one of them starts at i0=0 and the other starts at i0=1.
So this could be expressed cleanly as a new function _dl_sort_init that takes
i0
as a parameter.
Should I start by submitting a patch that does that,
with no functional change, and go from there?  Or should I let you
or someone else do this refactoring (possibly in conjunction
with making these sorting functions unit testable)?
Let me know how to proceed.

--
Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
Reply | Threaded
Open this post in threaded view
|

Re: [Bug dynamic-link/15310] New: _dl_sort_fini is O(n^3) causing slow exit when many dsos

Ondřej Bílka
In reply to this post by giuliomoro at yahoo dot it
On Wed, Mar 27, 2013 at 07:47:46AM +0000, dhatch at ilm dot com wrote:
> http://sourceware.org/bugzilla/show_bug.cgi?id=15310
>
> This is just a topsort, which can be done simply in O(n) time
> with no fancy data structures.
>
I do not have domain knowledge to understand what is necessary to
consider.

However I could write topsort if I knew dependencies. From code I think
following pseudocode should work upto possibility that some arrows need
be reversed.

add_edge(g,x,y) // add edge from x to y
topsort(g)      // returns topologic ordering of g

graph *g;
for(k=0;k<nmaps;k++)
  {
      /* Add dependencies of the object.  */
      struct link_map **runp = maps[k]->l_initfini;
      if (runp != NULL)
        while (*runp != NULL)
          add_edge(g,maps[k],*runp);
               
  /* Add relocation dependencies.  */
      if (__builtin_expect (maps[k]->l_reldeps != NULL, 0))
        {
          unsigned int m = maps[k]->l_reldeps->act;
          struct link_map **relmaps = &maps[k]->l_reldeps->list[0];
          for (j=0;j<m;j++)
            add_edge(g,relmaps[k],*runp)
        }
  }

        memcpy(maps,topsort(g));
Reply | Threaded
Open this post in threaded view
|

[Bug dynamic-link/15310] _dl_sort_fini is O(n^3) causing slow exit when many dsos

giuliomoro at yahoo dot it
In reply to this post by giuliomoro at yahoo dot it
http://sourceware.org/bugzilla/show_bug.cgi?id=15310

--- Comment #4 from Ondrej Bilka <neleai at seznam dot cz> 2013-03-27 20:50:40 UTC ---
On Wed, Mar 27, 2013 at 07:47:46AM +0000, dhatch at ilm dot com wrote:
> http://sourceware.org/bugzilla/show_bug.cgi?id=15310
>
> This is just a topsort, which can be done simply in O(n) time
> with no fancy data structures.
>
I do not have domain knowledge to understand what is necessary to
consider.

However I could write topsort if I knew dependencies. From code I think
following pseudocode should work upto possibility that some arrows need
be reversed.

add_edge(g,x,y) // add edge from x to y
topsort(g)      // returns topologic ordering of g

graph *g;
for(k=0;k<nmaps;k++)
  {
      /* Add dependencies of the object.  */
      struct link_map **runp = maps[k]->l_initfini;
      if (runp != NULL)
        while (*runp != NULL)
          add_edge(g,maps[k],*runp);

          /* Add relocation dependencies.  */
      if (__builtin_expect (maps[k]->l_reldeps != NULL, 0))
        {
          unsigned int m = maps[k]->l_reldeps->act;
          struct link_map **relmaps = &maps[k]->l_reldeps->list[0];
          for (j=0;j<m;j++)
            add_edge(g,relmaps[k],*runp)
        }
  }

    memcpy(maps,topsort(g));

--
Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
Reply | Threaded
Open this post in threaded view
|

[Bug dynamic-link/15310] _dl_sort_fini is O(n^3) causing slow exit when many dsos

giuliomoro at yahoo dot it
In reply to this post by giuliomoro at yahoo dot it
http://sourceware.org/bugzilla/show_bug.cgi?id=15310

--- Comment #5 from Carlos O'Donell <carlos at redhat dot com> 2013-03-27 21:00:34 UTC ---
(In reply to comment #3)

> (In reply to comment #2)
> > Don,
> >
> > I agree that the sorting could be made *far* faster.
> >
> > Thanks for submitting this. We were well aware that the minimal fix for bug
> > 13882 would cause some kind of performance regression, but it was a balance
> > between a minimal fix and low risk of breakage. I reviewed the patch for 13882
> > and even build a minimal framework for testing that dynamic loader function
> > outside of the build.
> >
> > Do you have the time to investigate this and propose a patch (requires
> > copyright assignment)?
>
> I do. I am working on a patch that resolves both this and bug 15311,
> and I'll submit it here in a day or two.

I look forward to the patch.

> I am very interested in what you came up with in the way of a unit
> testing scheme for this function... I could certainly use it.

I aggrandized a bit here. I copied the relevant sorting code out of the loader
code, wrapped it up in a function, created some static arrays to simulate DSOs
loaded in a certain order, and then ran the sorting function against the the
simulated DSO list. The best description is that I ran a simulation. I looked
for the code I used to test bug 13882, but I've lost it (changed employers).

You can see my comments here:
http://sourceware.org/ml/libc-alpha/2012-06/msg00560.html

> I've found it frustrating that the existing tests run by "make check"
> (the ones I saw anyway)
> involve just creating/compiling/running a handful of real programs...
> to really stress test an implementation of _dl_sort_fini properly,
> I'd want to (at least) enumerate all possible graphs
> of up to 3 or 4 nodes, and call it on each of them,
> which would be millions of examples...
> and a few million randomly generated larger examples as well.
> It's *really* easy to get this stuff wrong otherwise.

I fully agree.

As a volunteer project we live and die by the companies and individuals that
choose to contribute to the project. I would be more than happy to see all
possible 3 or 4 node graphs tested. The random testing is more problematic as
you are probably well aware of; you can still auto-generate millions of test
cases just make it deterministic :-)

> Also I'd like to start by moving the init sorting code into a function.
> It looks to me like this code is duplicated in two places (dl-open.c
> and dl-deps.c), and (after the fix for bug 15309)
> it's identical in both places except that
> one of them starts at i0=0 and the other starts at i0=1.
> So this could be expressed cleanly as a new function _dl_sort_init that takes
> i0
> as a parameter.

That sounds like a great idea.

> Should I start by submitting a patch that does that,
> with no functional change, and go from there?  Or should I let you
> or someone else do this refactoring (possibly in conjunction
> with making these sorting functions unit testable)?
> Let me know how to proceed.

Always break the work into as small a piece as conceivably possible.

Doing just the refactoring is a great first step.

Once we have that in place we can talk about next steps.

The Contribution Checklist for this project is rather long, but we are a
conservative project and it helps to have everything documented and well
specified. You can see the checklist here:
http://sourceware.org/glibc/wiki/Contribution%20checklist

--
Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
Reply | Threaded
Open this post in threaded view
|

[Bug dynamic-link/15310] _dl_sort_fini is O(n^3) causing slow exit when many dsos

giuliomoro at yahoo dot it
In reply to this post by giuliomoro at yahoo dot it
http://sourceware.org/bugzilla/show_bug.cgi?id=15310

Carlos O'Donell <carlos at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |WAITING

--- Comment #6 from Carlos O'Donell <carlos at redhat dot com> 2013-03-27 21:07:00 UTC ---
Don,

Do you have copyright papers in place with the FSF for glibc?

--
Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
Reply | Threaded
Open this post in threaded view
|

[Bug dynamic-link/15310] _dl_sort_fini is O(n^3) causing slow exit when many dsos

giuliomoro at yahoo dot it
In reply to this post by giuliomoro at yahoo dot it
http://sourceware.org/bugzilla/show_bug.cgi?id=15310

--- Comment #7 from law at redhat dot com 2013-03-27 21:13:39 UTC ---
Carlos,

This is the tester QE uses for the correctness of the sort:  

Basically it takes as an input the number of DSOs you want to test.  Then
builds a random orderings and a pathological ordering of those DSOs on the link
line (the script knows the dependencies between DSOs).
deps.cpp:

#include <stdio.h>

#ifdef MAIN
int
main( int argc, char *argv[] )
{
  printf("main\n");
}
#else
static bool deps_init( void ) { puts( NAME ); }
bool x = deps_init();
#endif


driver:
#!/bin/bash
# based on reproducer from bug 11724

N=${1:-2}
TmpDir=$(mktemp -d)
cp deps.cpp $TmpDir
pushd $TmpDir > /dev/null 2>&1

for((all=0,k=2; k<=$N; ++all,++k)); do
    cmd="g++ -fPIC -shared -o libdeps${k}.so -DNAME=\\\"${k}\\\" deps.cpp"
    printf "# %s\n" "$cmd" > ${all}r.log
    eval "$cmd"
    for ((i=k-1; i>=1; --i)); do
        cmd="g++ -L. -fPIC -shared -o libdeps${i}.so -ldeps$((i+1))
-DNAME=\\\"${i}\\\" deps.cpp"
        printf "# %s\n" "$cmd" >> ${all}r.log
        eval "$cmd"
    done; unset i
    # random link order
    # it sucks that rhel5 coreutils don't have shuf or sort -R yet :)
    roptions=($(seq 1 $k | sed 's/^/ -ldeps/'))
    for ((i=(${#roptions[@]}-1); i >= 1; --i)); do
        random=$((RANDOM % i))
        m=${roptions[$random]}
        roptions[$random]=${roptions[$i]}
        roptions[$i]=$m
    done; unset i
    rcmd="g++ -Wl,-rpath -Wl,$(pwd) -L. -o rmain ${roptions[@]} -DMAIN
deps.cpp"
    # pathological link order
    poptions=($(seq $k -1 1 | sed 's/^/ -ldeps/'))
    pcmd="g++ -Wl,-rpath -Wl,$(pwd) -L. -o pmain ${poptions[@]} -DMAIN
deps.cpp"

    cp ${all}r.log ${all}p.log

    for i in r p; do
        cmd="${i}cmd"
        printf "# %s\n\n" "${!cmd}" >> ${all}${i}.log
        eval "${!cmd}"

        fail=0
        cp ${all}${i}.log ${i}out.golden
        ./${i}main >> ${all}${i}.log
        seq $k -1 1 >> ${i}out.golden
        echo main >> ${i}out.golden
        diff ${all}${i}.log ${i}out.golden > /dev/null 2>&1 || fail=1

        # it's ok, we don't care
        if [ 0 -eq $fail ]; then
            rm ${all}${i}.log
        fi
    done; unset i

    rm -rf *.so *main *out.golden *out.real
done; unset k

rm deps.cpp
test 0 -eq $(ls | wc -l) && { rm -rf $(pwd); echo PASS; } || { echo FAIL;
printf "results are in %s\n" $(pwd); }
popd > /dev/null 2>&1

--
Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
Reply | Threaded
Open this post in threaded view
|

[Bug dynamic-link/15310] _dl_sort_fini is O(n^3) causing slow exit when many dsos

giuliomoro at yahoo dot it
In reply to this post by giuliomoro at yahoo dot it
http://sourceware.org/bugzilla/show_bug.cgi?id=15310

--- Comment #8 from Don Hatch <dhatch at ilm dot com> 2013-03-27 23:44:45 UTC ---
(In reply to comment #6)
> Do you have copyright papers in place with the FSF for glibc?
Working on it with my legal department.
My employer (ILM) is supportive so I don't anticipate a problem.

--
Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
Reply | Threaded
Open this post in threaded view
|

[Bug dynamic-link/15310] _dl_sort_fini is O(n^3) causing slow exit when many dsos

giuliomoro at yahoo dot it
In reply to this post by giuliomoro at yahoo dot it
http://sourceware.org/bugzilla/show_bug.cgi?id=15310

--- Comment #9 from Don Hatch <dhatch at ilm dot com> 2013-03-28 00:31:40 UTC ---
(In reply to comment #4)

> On Wed, Mar 27, 2013 at 07:47:46AM +0000, dhatch at ilm dot com wrote:
> > http://sourceware.org/bugzilla/show_bug.cgi?id=15310
> >
> > This is just a topsort, which can be done simply in O(n) time
> > with no fancy data structures.
> >
> I do not have domain knowledge to understand what is necessary to
> consider.
>
> However I could write topsort if I knew dependencies. From code I think
> following pseudocode should work upto possibility that some arrows need
> be reversed.
>
> add_edge(g,x,y) // add edge from x to y
> topsort(g)      // returns topologic ordering of g
>
> graph *g;
> for(k=0;k<nmaps;k++)
>   {
>       /* Add dependencies of the object.  */
>       struct link_map **runp = maps[k]->l_initfini;
>       if (runp != NULL)
>         while (*runp != NULL)
>           add_edge(g,maps[k],*runp);
>
>           /* Add relocation dependencies.  */
>       if (__builtin_expect (maps[k]->l_reldeps != NULL, 0))
>         {
>           unsigned int m = maps[k]->l_reldeps->act;
>           struct link_map **relmaps = &maps[k]->l_reldeps->list[0];
>           for (j=0;j<m;j++)
>             add_edge(g,relmaps[k],*runp)
>         }
>   }
>
>     memcpy(maps,topsort(g));

That looks essentially right,
although there are a couple of nuances in the existing code.
    - under the comment "Do not handle ld.so in secondary namespace"
      (whatever that means), it ignores a dependency "B depends on A"
      if A!=A->l_real  (but doesn't check whether B!=B->l_real).
      I don't really know what this means, or whether this assymmetry
      is intentional...
      QUESTION: can I just omit all edges involving such items
      altogether, not caring where these items come out
      in the output order?
    - it also ignores "B depends on A" if A->l_idx == -1 (i.e.
IDX_STILL_USED)--
      this one I believe I understand--
      it definitely doesn't matter where such items
      come out in the final sort order.
    - static dependendies should override dynamic ones if there's a conflict.
      I'll do two passes in order to get this right, and more well-behaved
      than the existing code--
      see bug 15311 for details.

I hadn't thought about creating the graph structure explicitly as you
suggest...
arguably that would result in cleaner topsort code, but
I was just going to work directly with the data structure that's
already there instead.
Note that an explicit auxiliary graph representation would have size O(N+E)
(number of nodes plus number of edges)
whereas the size of needed auxiliary structures is currently only O(N)
(making it feasible to allocate them as local variables on the runtime stack)
and I wasn't planning on changing that.

The topsort algorithm will be essentially Tarjan's SCC algorithm:
http://en.wikipedia.org/wiki/Tarjan%27s_strongly_connected_components_algorithm
(actually two passes of it).  This is a bit more involved than
a simple reverse postorder, but keeping SCCs contiguous is desireable
(for one thing, the existing code does it-- furthermore,
it provides some nice well-behavedness between the two passes
that I wouldn't be able to prove otherwise).
Tarjan's algorithm is preferable to Kosaraju's since it doesn't require
explicitly creating the reverse graph.

One other issue/question-- I was assuming a recursive implementation
would be out of the question (due to increased runtime stack requirement);
but I'm finding that an iterative implementation of Tarjan's SCC algorithm
is... well, really unpleasant.
I can do it, but it's no fun, and it won't be any fun
to review or maintain it either.  A recursive implementation
would be MUCH cleaner, simpler, more readable and maintainable.
Any thoughts/feelings on whether it's okay to make it recursive?

--
Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
Reply | Threaded
Open this post in threaded view
|

Re: [Bug dynamic-link/15310] _dl_sort_fini is O(n^3) causing slow exit when many dsos

Ondřej Bílka
On Thu, Mar 28, 2013 at 12:31:40AM +0000, dhatch at ilm dot com wrote:
 
 

> I hadn't thought about creating the graph structure explicitly as you
> suggest...
> arguably that would result in cleaner topsort code, but
> I was just going to work directly with the data structure that's
> already there instead.
> Note that an explicit auxiliary graph representation would have size O(N+E)
> (number of nodes plus number of edges)
> whereas the size of needed auxiliary structures is currently only O(N)
> (making it feasible to allocate them as local variables on the runtime stack)
> and I wasn't planning on changing that.
>
I mainly wanted do it to separate alg from domain specific bits.

Next one where I do not know if its necesarry or just optimization is

  /* We can skip looking for the binary itself which is at the front
     of the search list for the main namespace.  */

 

Reply | Threaded
Open this post in threaded view
|

[Bug dynamic-link/15310] _dl_sort_fini is O(n^3) causing slow exit when many dsos

giuliomoro at yahoo dot it
In reply to this post by giuliomoro at yahoo dot it
http://sourceware.org/bugzilla/show_bug.cgi?id=15310

--- Comment #10 from Ondrej Bilka <neleai at seznam dot cz> 2013-03-28 07:42:31 UTC ---
On Thu, Mar 28, 2013 at 12:31:40AM +0000, dhatch at ilm dot com wrote:


> I hadn't thought about creating the graph structure explicitly as you
> suggest...
> arguably that would result in cleaner topsort code, but
> I was just going to work directly with the data structure that's
> already there instead.
> Note that an explicit auxiliary graph representation would have size O(N+E)
> (number of nodes plus number of edges)
> whereas the size of needed auxiliary structures is currently only O(N)
> (making it feasible to allocate them as local variables on the runtime stack)
> and I wasn't planning on changing that.
>
I mainly wanted do it to separate alg from domain specific bits.

Next one where I do not know if its necesarry or just optimization is

  /* We can skip looking for the binary itself which is at the front
     of the search list for the main namespace.  */

--
Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
Reply | Threaded
Open this post in threaded view
|

[Bug dynamic-link/15310] _dl_sort_fini is O(n^3) causing slow exit when many dsos

giuliomoro at yahoo dot it
In reply to this post by giuliomoro at yahoo dot it
http://sourceware.org/bugzilla/show_bug.cgi?id=15310

--- Comment #11 from Don Hatch <dhatch at ilm dot com> 2013-03-28 10:00:12 UTC ---
Created attachment 6955
  --> http://sourceware.org/bugzilla/attachment.cgi?id=6955
Proposed initial patch, rearranging init/fini sorting in prep for overhaul.
diffs from master at 7a86be6

Here's the proposed initial patch.
Commit message would be something like:
==========================================================================
Refactor ld.so init/fini sorting code in preparation for overhaul.

- Init sorting code was in two places: dl-open.c and dl-deps.c;
  it was identical in both places except for the starting index
  which was i0=0 in dl-open.c and i0=1 in dl-deps.c.
  Moved this common code out into a new function _dl_sort_init
  that takes i0 as a parameter.

- Moved _dl_sort_init and _dl_sort_fini out into two new files
  dl-sort-init.c and dl-sort-fini.c.
  This allows easy diffing between the two (which can be interesting)
  and may also help facilitate future efforts to unit test
  these two functions.

No functional change here.
==========================================================================

I don't have the copyright agreement in place yet, but it seems to me
that I'm not introducing any copyrightable material in this initial patch--
I'm just moving existing code around.
(okay I did add a comment at the top of each function).

Let me know if I'm doing this right...

--
Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
Reply | Threaded
Open this post in threaded view
|

[Bug dynamic-link/15310] _dl_sort_fini is O(n^3) causing slow exit when many dsos

giuliomoro at yahoo dot it
In reply to this post by giuliomoro at yahoo dot it
http://sourceware.org/bugzilla/show_bug.cgi?id=15310

Don Hatch <dhatch at ilm dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Depends on|                            |15309

--- Comment #12 from Don Hatch <dhatch at ilm dot com> 2013-03-28 10:19:01 UTC ---
Actually I think the fix for bug 15309 (dl_open_worker doesn't fully initialize
seen array during init sort) needs to go in first...
it's a 1-liner and is easily patchable to various downstream code bases,
but that won't be true if it ends up on top of my code rearrangement.

Also I can't truthfully say "No functional change here"
unless the fix for 15309 goes in first, since I'm combining code
from two functions which aren't exactly the same yet...
they will be the same, after the fix for 15309 is in place.

--
Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
Reply | Threaded
Open this post in threaded view
|

[Bug dynamic-link/15310] _dl_sort_fini is O(n^3) causing slow exit when many dsos

giuliomoro at yahoo dot it
In reply to this post by giuliomoro at yahoo dot it
http://sourceware.org/bugzilla/show_bug.cgi?id=15310

--- Comment #13 from law at redhat dot com 2013-03-28 17:09:25 UTC ---
Actually, I'm pretty sure the problematical sort shows up in 3 places:

dl_map_object_deps
dl_sort_fini
dl_open_worker

--
Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
Reply | Threaded
Open this post in threaded view
|

[Bug dynamic-link/15310] _dl_sort_fini is O(n^3) causing slow exit when many dsos

giuliomoro at yahoo dot it
In reply to this post by giuliomoro at yahoo dot it
http://sourceware.org/bugzilla/show_bug.cgi?id=15310

--- Comment #14 from Don Hatch <dhatch at ilm dot com> 2013-03-28 17:31:26 UTC ---
(In reply to comment #13)
> Actually, I'm pretty sure the problematical sort shows up in 3 places:
>
> dl_map_object_deps
> dl_sort_fini
> dl_open_worker

that's correct. my initial proposed patch combines two of them
into a single function _dl_sort_init.

--
Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
Reply | Threaded
Open this post in threaded view
|

[Bug dynamic-link/15310] _dl_sort_fini is O(n^3) causing slow exit when many dsos

giuliomoro at yahoo dot it
In reply to this post by giuliomoro at yahoo dot it
http://sourceware.org/bugzilla/show_bug.cgi?id=15310

--- Comment #15 from Don Hatch <dhatch at ilm dot com> 2013-04-02 09:54:17 UTC ---
Progress on unit/stress test...
I got something working, in such a way that it can go cleanly into the build
(assuming the init sort has been separated out into a function,
as in my "Proposed initial patch" attachment).
I'll submit the test as soon as I get it polished (and I get legally cleared).

It tests _dl_sort_init on all 17854749 graphs of up to 4 nodes in 1min8sec,
and _dl_sort_fini on all 16777846 pairs of graphs
(static dep graph and dynamic dep graph) of up to 3 nodes in 48sec
(on an Intel Xeon L5630 @ 2.13GHz which is a pretty fast machine I guess).
That's probably a bit long for a confidence test run during "make check",
so for that, I'd probably do something less,
augmented by some randomized testing (with deterministic seed of course).

As Carlos pointed out in the referenced email thread,
this test would have prevented the last several generations
of bugs in these sorting routines, and it should help ensure that
the impending overhaul properly fixes the bugs it intends to fix, without
introducing new ones, and that no similar bugs appear in the future.

In addition to lots of expected failures of _dl_sort_fini due to bug 15311,
the testing of _dl_sort_init on graphs up to 4 nodes
revealed that the existing algorithm doesn't keep SCCs contiguous,
which was a surprise to me.  I just opened bug 15329 on that.
(Not sure whether that's technically a bug... but it's certainly
not a nice behavior, and we can do better.)

Is it okay to add a test that currently fails
due to bugs that we intend to fix very soon?

--
Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
12