tcbhead_t gdb access for nonthreaded, gdb for longjmp()


Jan Kratochvil-2
Hi,

crossposted to gdb+glibc as the patches closely correlate:

Currently gdb/glibc cannot access the TLS of a debugged process that was built
without libpthread (nonthreaded).  The command `print errno' produces one of:
        Cannot find thread-local variables on this target
        Cannot access memory at address 0x8
        [ gdb-20060908-tls-0.patch: gdb.threads/tls-print.exp ]

It also affects stepping over longjmp(): gdb has to calculate the target
address, which is PTR_MANGLE()d using the TLS-based `pointer_guard' value, so
`next' currently fails with:
        Cannot insert breakpoint -12.
        Error accessing memory address 0xb9227d3b: Input/output error.
        [ gdb-20060908-tls-0.patch: gdb.threads/tls-longjmp.exp ]
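
To illustrate why the saved PC in a jmp_buf is useless to a debugger without
the guard, here is a minimal sketch of pointer mangling.  The XOR with a
per-process guard follows the `pointer_guard' idea described above; the extra
rotate step and its count (ROT_BITS) are assumptions for illustration, since
the real PTR_MANGLE differs between architectures and glibc versions.  The
names toy_mangle and toy_demangle are hypothetical, not glibc API.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed rotation count; illustrative only, not taken from the patch.  */
#define ROT_BITS 17

static uint64_t rotl64(uint64_t v, unsigned n)
{
    return (v << n) | (v >> (64 - n));
}

static uint64_t rotr64(uint64_t v, unsigned n)
{
    return (v >> n) | (v << (64 - n));
}

/* Hypothetical stand-in for PTR_MANGLE: XOR with the guard, then rotate.  */
uint64_t toy_mangle(uint64_t ptr, uint64_t guard)
{
    return rotl64(ptr ^ guard, ROT_BITS);
}

/* Hypothetical stand-in for PTR_DEMANGLE: undo the steps in reverse order.  */
uint64_t toy_demangle(uint64_t mangled, uint64_t guard)
{
    return rotr64(mangled, ROT_BITS) ^ guard;
}

/* Self-check: the right guard recovers the pointer, a wrong guard does not.  */
void toy_mangle_selfcheck(void)
{
    uint64_t pc = UINT64_C(0x0000000000401136);
    uint64_t guard = UINT64_C(0x5bd1e995c3a5c85c);

    assert(toy_demangle(toy_mangle(pc, guard), guard) == pc);
    assert(toy_demangle(toy_mangle(pc, guard), 0) != pc);
}
```

A debugger that reads the mangled slot but uses the wrong guard ends up with a
meaningless address, which is consistent with the failed breakpoint insertion
shown above.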

I would like to get approval of this design before I finish the details.

glibc part:

 * Provide some access to the `tcbhead_t.pointer_guard' field for gdb.
   Currently implemented by `td_thr_getxregs' providing only `pointer_guard'.
   New non-Solaris `td_thr_*' function could be provided instead.

 * All the `libthread_db' functions accessing the inferior's `_thread_db*'
   symbols of `libpthread' fall back to the new `_local_db*' symbols in
   `libthread_db' itself.  `libthread_db'<=>`libpthread' versions must match
   anyway.  I admit I do not know how the `libthread_db' and `libpthread'
   versions may mismatch, given that `td_ta_new' already requires them to
   match.  Anyway, it should be enough for 99% of cases as the fallback option.

gdb part:

 * `longjmp' decoder attempts to use `td_thr_getxregs'; otherwise it falls back
   to getting the TLS base by `ps_get_thread_area' and the offset value
   `offsetof (tcbhead_t, pointer_guard)' from debuginfo; otherwise it falls
   back to an internal constant offset.

 * `SEC_THREAD_LOCAL' symbols are processed as a new expression data type.

 * TLS variables access uses legacy `thread_db_get_thread_local_address'
   as it depends on the `_local_db*' fallback implementation for nonthreaded
   processes missing `libthread_db' with the legacy `_thread_db*' symbols.
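
The fallback order in the first gdb item can be sketched as below.  This is a
toy model, not code from the patch: struct toy_inferior, toy_find_guard, the
enum values, and FALLBACK_GUARD_OFFSET are all hypothetical, and the inferior's
memory is represented by a plain byte buffer read as little-endian.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Which source finally supplied the guard; names are hypothetical.  */
enum guard_source {
    FROM_XREGS,
    FROM_DEBUGINFO_OFFSET,
    FROM_CONSTANT_OFFSET,
    GUARD_UNAVAILABLE
};

/* Toy view of what the debugger can learn about the inferior.  */
struct toy_inferior {
    int has_xregs;                 /* libthread_db reports the guard directly */
    uint64_t xregs_guard;
    int has_tls_base;              /* stands in for a working ps_get_thread_area */
    const unsigned char *tls_base; /* copy of the memory at the TLS base */
    int has_debuginfo;             /* offsetof (tcbhead_t, pointer_guard) known */
    size_t debuginfo_offset;
};

/* Assumed built-in offset used as the last resort; illustrative value only.  */
#define FALLBACK_GUARD_OFFSET 0x18

/* Read 8 bytes as a little-endian integer.  */
static uint64_t read_u64(const unsigned char *p)
{
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--)
        v = (v << 8) | p[i];
    return v;
}

/* Try each source in the order described above.  */
enum guard_source toy_find_guard(const struct toy_inferior *inf, uint64_t *guard)
{
    assert(inf != NULL && guard != NULL);

    if (inf->has_xregs) {
        *guard = inf->xregs_guard;
        return FROM_XREGS;
    }
    if (!inf->has_tls_base)
        return GUARD_UNAVAILABLE;
    if (inf->has_debuginfo) {
        *guard = read_u64(inf->tls_base + inf->debuginfo_offset);
        return FROM_DEBUGINFO_OFFSET;
    }
    *guard = read_u64(inf->tls_base + FALLBACK_GUARD_OFFSET);
    return FROM_CONSTANT_OFFSET;
}
```

Returning the source along with the value lets a caller warn when it had to
rely on the hard-coded offset, which is the least trustworthy of the three.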


The `longjmp' decoder can cope without glibc support by using debuginfo instead.
I would rather drop this workaround and rely on the glibc support.

TLS access needs the attached glibc patch for the nonthreaded processes.
I could provide gdb decoding without glibc support but I do not like it.

It currently works only on i386; x86_64 is still to be debugged if the design
is accepted this way.
Patches still contain several FIXMEs; their fixes should not change the design.
The glibc part should be arch-dependent; as it stands it will fail to compile
on arches without an existing `pointer_guard'.

Behavior changes depending on:
 * Application linked with libpthread or without libpthread.
 * glibc original/patched by this TLS extension.
 * -ggdb3 (overriding the TLS `errno' resolving just by the macro text).
 * Debuginfo availability for libpthread (for `longjmp').
Unfortunately the system changes cannot be tested just by the gdb testsuite.


Thanks,
Jan

glibc-20060828T1903-26-local_db+xregs-0.patch (23K)
gdb-20060908-tls-0.patch (35K)

Re: tcbhead_t gdb access for nonthreaded, gdb for longjmp()

Daniel Jacobowitz-2
Hi Jan, thanks for working on this.

On Fri, Sep 08, 2006 at 12:22:35PM +0200, Jan Kratochvil wrote:

> glibc part:
>
>  * Provide some access to the `tcbhead_t.pointer_guard' field for gdb.
>    Currently implemented by `td_thr_getxregs' providing only `pointer_guard'.
>    New non-Solaris `td_thr_*' function could be provided instead.
>
>  * All the `libthread_db' functions accessing inferior's `_thread_db*' symbols
>    of `libpthread' fallback to the new `_local_db*' symbols in `libthread_db'
>    itself. `libthread_db'<=>`libpthread' versions must match anyway.
>    I admit I do not know how may `libthread_db' and `libpthread' as there is
>    already required in `td_ta_new' their versions match.  Anyway it should be
>    enough for 99% of cases - as the fallback option.

Your new libthread_db will accept any version of glibc, even one which
does not match - that seems like a good way to get in a lot of trouble.

I wonder if we really need to use libthread_db here anyway.  The
original goal of libthread_db, as I understand it, was to abstract away
the internals of the threading library from its higher level concepts;
for instance, you weren't supposed to have to know how to map threads
to LWPs, or how to find the locks owned by a thread (Solaris's supports
that; glibc's doesn't).  This is a C library internal, with
not much to do with threads except that most platforms happen to use a
TLS address.

We're interested in "is there a pointer guard" and "is it used for
the PC value in setjmp/longjmp".  We need both pieces of information,
because some targets do and some don't; ia64 only encrypts rp, for
instance, not sp or pc.  We could provide those two bits of constant
information in libc somewhere.

Alternatively, since I don't see anything else in glibc that mangles
pointers which GDB would need to know about (except maybe atexit
functions, which might be nice to display someday?) we could provide
a function in glibc which we could call that would return the target
of a jmp_buf.  Then GDB wouldn't have to know how PTR_MANGLE worked.

Glibc maintainers, does that last idea sound practical?  It's much
simpler.  It'll take up a dozen or so bytes at runtime, hopefully not
paged in depending where they're linked.


--
Daniel Jacobowitz
CodeSourcery

Re: tcbhead_t gdb access for nonthreaded, gdb for longjmp()

Jan Kratochvil-2
Hi Daniel,

On Sun, 10 Sep 2006 16:27:24 +0200, Daniel Jacobowitz wrote:

> On Fri, Sep 08, 2006 at 12:22:35PM +0200, Jan Kratochvil wrote:
> > glibc part:
> >
> >  * Provide some access to the `tcbhead_t.pointer_guard' field for gdb.
> >    Currently implemented by `td_thr_getxregs' providing only `pointer_guard'.
> >    New non-Solaris `td_thr_*' function could be provided instead.
> >
> >  * All the `libthread_db' functions accessing inferior's `_thread_db*' symbols
> >    of `libpthread' fallback to the new `_local_db*' symbols in `libthread_db'
> >    itself. `libthread_db'<=>`libpthread' versions must match anyway.
> >    I admit I do not know how may `libthread_db' and `libpthread' as there is
> >    already required in `td_ta_new' their versions match.  Anyway it should be
> >    enough for 99% of cases - as the fallback option.
>
> Your new libthread_db will accept any version of glibc, even one which
> does not match - that seems like a good way to get in a lot of trouble.

Is it fine to make `__libc_version' public?
`libthread_db' would check for it if running in `TD_MINIMALONLY' mode.


> I wonder if we really need to use libthread_db here anyway.

Besides longjmp() target PTR_DEMANGLE()ing there is also a need to access the
glibc TLS symbol `errno' but, in fact, AFAIK no other TLS symbol.
One may say there is no TLS `errno' for nonthreaded programs as they use only
        (*__errno_location ())

But from the user's perspective I consider it unacceptable to say something like
        No symbol "errno" in current context.

Also one cannot expect that everyone will compile everything with the bloated
-ggdb3 so that `errno' is automatically expanded to `(*__errno_location ())'.
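
For readers unfamiliar with the macro, the following small check shows the
expansion at work.  It assumes a C library whose <errno.h> declares
__errno_location() and defines errno in terms of it, as glibc does; the
function name check_errno_expansion is made up for this example.

```c
#include <assert.h>
#include <errno.h>

/* Verify that `errno' and `*__errno_location ()' name the same object.
   Assumes <errno.h> declares __errno_location(), as glibc's does.  */
void check_errno_expansion(void)
{
    errno = ERANGE;

    /* Same address: errno is just a macro around the function call.  */
    assert(&errno == __errno_location());

    /* Same value when read through the function.  */
    assert(*__errno_location() == ERANGE);

    errno = 0;
}
```

Without macro information from -ggdb3 the debugger sees only the symbol name
`errno', not this expansion, which is the gap discussed above.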

I see multiple solutions (maybe the first one is enough?).

 * Hardcoding `#define errno (*__errno_location ())' into gdb.

 * Providing full custom TLS resolving for gdb - no glibc change needed.

 * Extending current glibc libthread_db for non-threaded inferiors.
   (the patch sent before <[hidden email]>)

 * Merging the basic libpthread part for TLS resolving into glibc core
   as nonthreaded glibc is using the TLS support for threading anyway.

 * Merging the whole libpthread to glibc, making libpthread just a stub,
   forgetting there ever existed nonthreaded programs before,
   the same way UP (vs. SMP) was forgotten.

...
> a function in glibc which we could call that would return the target
> of a jmp_buf.  Then GDB wouldn't have to know how PTR_MANGLE worked.

Nice idea.


Regards,
Jan

Re: tcbhead_t gdb access for nonthreaded, gdb for longjmp()

Daniel Jacobowitz-2
On Wed, Sep 13, 2006 at 03:05:32PM +0200, Jan Kratochvil wrote:
> > I wonder if we really need to use libthread_db here anyway.
>
> Besides longjmp() target PTR_DEMANGLE()ing there is also need to access glibc
> TLS symbol `errno' but - in fact - AFAIK no other TLS symbol.

You're combining two different problems here, and they're very
different.  The pointer_guard lives in the TCB, which is special
(not normal TLS data).  Normal TLS mechanisms won't work to find
it.

But the errno value lives in a standard TLS block.  All the application
needs to access it is the module number for libc.so.6 and the symbol
value.  GDB shouldn't access TLS the same way the application does
(by calling __tls_get_addr, which might e.g. cause allocation of a new
TLS block).  But it could find the DTV directly and perform its own
lookup, based on knowledge of the platform ABI.  (You mentioned earlier
not knowing what the DTV was; if that's still the case, please read
Ulrich's TLS paper, which explains it very well).

The symbol value's easily available in the symbol table.  The module
number is harder.  It's in the result from dl_iterate_phdr, which is
workable but very awkward for GDB to use.  And it's in the link_map,
but not at a public offset, so we can't find it there.
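
A minimal sketch of the direct lookup described above, under stated
assumptions: the DTV is modelled as an array already copied out of the
inferior, slot 0 holds the generation counter, slot N holds the address of
module N's TLS block, and an all-ones value marks a block that has not been
allocated.  The names toy_dtv, toy_tls_address, and TOY_DTV_UNALLOCATED are
hypothetical; real layouts are defined by each platform's TLS ABI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed sentinel for a TLS block that has not been allocated yet.  */
#define TOY_DTV_UNALLOCATED UINTPTR_MAX

/* Toy DTV as read from the inferior: slot 0 is the generation counter,
   slot N holds the address of module N's TLS block for this thread.  */
struct toy_dtv {
    size_t nslots;
    const uintptr_t *slots;
};

/* Compute the address of a TLS variable from its module ID and its symbol
   value (offset within the module's TLS block).  Returns 0 on success, or
   -1 if the address cannot be found without modifying the inferior.  */
int toy_tls_address(const struct toy_dtv *dtv, size_t modid,
                    uintptr_t sym_offset, uintptr_t *addr)
{
    assert(dtv != NULL && addr != NULL);

    if (modid == 0 || modid >= dtv->nslots)
        return -1;                      /* no such module in this DTV */
    if (dtv->slots[modid] == TOY_DTV_UNALLOCATED)
        return -1;                      /* report it; never allocate */

    *addr = dtv->slots[modid] + sym_offset;
    return 0;
}
```

Unlike a call to __tls_get_addr in the inferior, this lookup only reads, so an
unallocated block is reported as an error instead of being created.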

Options I see:
  - Make GDB call dl_iterate_phdr to get the module numbers.
  - Provide them in the public portion of the link map.
  - Provide a function in ld.so to translate a link map into its TLS
    module ID, for gdb use.

I'd be interested to hear from the glibc maintainers if they thought
any of those were workable.  The first is the ugliest, but most
immediately usable, since dl_iterate_phdr is already available today.
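
For reference, this is roughly what the first option exposes.  The sketch runs
dl_iterate_phdr in its own process and counts the objects that report a TLS
module ID; GDB would instead have to arrange for the inferior to make the call.
It assumes a C library that provides `dlpi_tls_modid' in `struct dl_phdr_info'
(the callback checks the reported structure size before reading it).  The names
modid_stats, collect_modid, and count_tls_modules are made up for this example.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for dl_iterate_phdr in <link.h> */
#endif
#include <assert.h>
#include <link.h>
#include <stddef.h>

/* Totals gathered while walking the loaded objects.  */
struct modid_stats {
    size_t objects;             /* objects visited */
    size_t with_tls;            /* objects with a nonzero TLS module ID */
};

/* Callback for dl_iterate_phdr; returning 0 keeps the iteration going.  */
static int collect_modid(struct dl_phdr_info *info, size_t size, void *data)
{
    struct modid_stats *st = data;

    assert(st != NULL);
    st->objects++;

    /* Only read dlpi_tls_modid if the runtime's structure is large enough
       to contain it.  */
    if (size >= offsetof(struct dl_phdr_info, dlpi_tls_modid)
                + sizeof info->dlpi_tls_modid
        && info->dlpi_tls_modid != 0)
        st->with_tls++;

    return 0;
}

/* Walk all loaded objects of the current process.  */
struct modid_stats count_tls_modules(void)
{
    struct modid_stats st = { 0, 0 };

    dl_iterate_phdr(collect_modid, &st);
    return st;
}
```

A real consumer would record the (object name, module ID) pairs instead of
counting them; the counting form just keeps the example short.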

> I see multiple solutions (maybe first one enough?).
>
>  * Hardcoding `#define errno (*__errno_location ())' into gdb.

I'd rather not be specific to errno.

>  * Providing full custom TLS resolving for gdb - no glibc change needed.

I think this is what I described above.

>  * Extending current glibc libthread_db for non-threaded inferiors.
>    (the patch sent before <[hidden email]>)

Reasonable, except that it seems like a huge patch for limited gain.

>  * Merging the basic libpthread part for TLS resolving into glibc core
>    as nonthreaded glibc is using the TLS support for threading anyway.

This could probably be made to work...

>  * Merging the whole libpthread to glibc, making libpthread just a stub,
>    forgetting there ever existed nonthreaded programs before,
>    the same way UP (vs. SMP) was forgotten.

This would work :-)

--
Daniel Jacobowitz
CodeSourcery

Re: tcbhead_t gdb access for nonthreaded, gdb for longjmp()

Jan Kratochvil-2
Hi,

Also, regarding making `__libc_version' public: it would even be useful for
checking that the libc and libpthread versions match, as mixing different
versions currently has unpleasant results.


On Wed, 13 Sep 2006 15:19:48 +0200, Daniel Jacobowitz wrote:
...
> GDB shouldn't access TLS the same way the application does (by calling
> __tls_get_addr, which might e.g. cause allocation of a new TLS block).

glibc nptl_db already accesses the inferior's TLS in what is IMO a safe,
non-modifying way: everything goes through td_thr_tls_get_addr().
Without libthread_db support the TLS base can be queried by
ps_get_thread_area(), using read_register() etc., as done in the patch for the
TCB `pointer_guard'.  Accessing DTV fields is just some indirection; the
problem is mapping the module address to the module id, as you describe below.


> But it could find the DTV directly and perform its own
> lookup, based on knowledge of the platform ABI.

(Probably described above...)


> The symbol value's easily available in the symbol table.  The module
> number is harder.  It's in the result from dl_iterate_phdr, which is
> workable but very awkward for GDB to use.  And it's in the link_map,
> but not at a public offset, so we can't find it there.
>
> Options I see:
>   - Make GDB call dl_iterate_phdr to get the module numbers.

It would need to be called remotely in the inferior process, wouldn't it?

>   - Provide them in the public portion of the link map.

`libpthread.so' already contains the public `_thread_db_link_map_l_tls_modid';
wouldn't it just mean moving (or appropriately providing) this public symbol
from `libpthread.so' to `libc.so'?

>   - Provide a function in ld.so to translate a link map into its TLS
>     module ID, for gdb use.

I hope you mean a function callable from the gdb process; I hope it is clear
that calling the inferior's function (using a dummy frame?) is not suitable.

...
> >  * Providing full custom TLS resolving for gdb - no glibc change needed.
>
> I think this is what I described above.

Partially; in this case I would choose "link map" access, providing a
target-dependent, gdb-side link map offset (ugly, I know).


On Wed, 13 Sep 2006 15:19:48 +0200, Daniel Jacobowitz wrote:
> On Wed, Sep 13, 2006 at 03:05:32PM +0200, Jan Kratochvil wrote:
...
> > Besides longjmp() target PTR_DEMANGLE()ing there is also need to access glibc
> > TLS symbol `errno' but - in fact - AFAIK no other TLS symbol.
>
> You're combining two different problems here,

[
Sincere thanks for pointing out my possible mistake.
Just in general - I know the differences, my patches would not work otherwise;
I know the DTV, I just did not remember that this acronym names that structure.
]



Regards,
Jan

Re: tcbhead_t gdb access for nonthreaded, gdb for longjmp()

Daniel Jacobowitz-2
On Wed, Sep 13, 2006 at 08:37:05PM +0200, Jan Kratochvil wrote:

> On Wed, 13 Sep 2006 15:19:48 +0200, Daniel Jacobowitz wrote:
> ...
> > GDB shouldn't access TLS the same way the application does (by calling
> > __tls_get_addr, which might e.g. cause allocation of a new TLS block).
>
> glibc nptl_db already accesses inferior TLS IMO in a safe unmodifying way.
> Everything through td_thr_tls_get_addr() ...
> Without libthread_db support the TLS base can be queried by
> ps_get_thread_area(), using read_register() etc., done in the patch for TCB
> `pointer_guard'. Accessing DTV fields is just some indirection, the problem
> is mapping the module address to module id as you describe below.

I think you've missed my point here: we can get at errno in a single
threaded program by bypassing libthread_db.  The DTV is not an
indirection; td_thr_tls_get_addr is an abstraction on top of direct DTV
access.

> > The symbol value's easily available in the symbol table.  The module
> > number is harder.  It's in the result from dl_iterate_phdr, which is
> > workable but very awkward for GDB to use.  And it's in the link_map,
> > but not at a public offset, so we can't find it there.
> >
> > Options I see:
> >   - Make GDB call dl_iterate_phdr to get the module numbers.
>
> It would be needed to be called remotely in the inferior process, wouldn't be?

Yes.  This isn't hard.

> >   - Provide them in the public portion of the link map.
>
> `libpthread.so' already contains public `_thread_db_link_map_l_tls_modid',
> doesn't it just mean moving (or appropriately providing) this public symbol
> from `libpthread.so' to `libc.so'?

Well, right now that's only for thread_db usage.  I hadn't thought
about it, but we could use it directly from GDB; I guess that would
work fine.  We could move those thread-db related symbols which
provide information about libc (not libpthread) data structures to
libc.so and look them up directly.
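
As a rough illustration of using such a symbol directly, the sketch below
assumes the descriptor is an array of three 32-bit values giving the field's
size in bits, its element count, and its byte offset within the structure.
That layout is an assumption here, and struct toy_link_map is an invented
structure, not glibc's private link map.  The reader also assumes the debugger
and the inferior share a little-endian byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy stand-in for the private part of a link map; layout is invented.  */
struct toy_link_map {
    uintptr_t l_addr;
    char pad[40];
    size_t l_tls_modid;
};

/* Field descriptor in the assumed { size in bits, element count, byte
   offset } form.  */
const uint32_t toy_desc_link_map_l_tls_modid[3] = {
    8 * sizeof(size_t),
    1,
    offsetof(struct toy_link_map, l_tls_modid)
};

/* Read the described field out of a raw copy of inferior memory.
   Returns 0 on success, -1 if the descriptor does not fit the buffer.  */
int toy_read_field(const unsigned char *mem, size_t mem_len,
                   const uint32_t desc[3], uint64_t *out)
{
    size_t nbytes = desc[0] / 8;
    uint64_t v = 0;

    assert(mem != NULL && out != NULL);
    if (nbytes == 0 || nbytes > sizeof v || desc[2] + nbytes > mem_len)
        return -1;

    /* Assumes debugger and inferior are both little-endian.  */
    memcpy(&v, mem + desc[2], nbytes);
    *out = v;
    return 0;
}
```

With a descriptor like this the debugger needs no compiled-in knowledge of the
structure layout, only the descriptor symbol and the link map's address.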

I'm not sure how this would work with static linking, however.  Being
able to do for static apps whatever we do for dynamic ones would also
be desirable.

Also, distributions generally strip symbols from libc.so; we'll have to
let them know to give it the same special treatment they already give
libpthread.so.

--
Daniel Jacobowitz
CodeSourcery