Function address incoherence

Andrea Cardaci
Hi,

I could use some clarification about the following behavior, which
initially came up in an issue [1] opened against the gdb-dashboard project.

Basically, in some cases there appears to be some kind of symbol
conflict (this is a bad name, but I don't know what else to call it).

I can only reproduce it on a 32-bit Linux machine with the following:

    vagrant@ubuntu-xenial:~$ gcc -x c -o test - <<<'int main(){}'
    vagrant@ubuntu-xenial:~$ gdb -q --batch test -ex 'b *0' -ex r -ex disassemble -ex 'p &_start'
    Breakpoint 1 at 0x0
    Warning:
    Cannot insert breakpoint 1.
    Cannot access memory at address 0x0

    Dump of assembler code for function _start:
    => 0xb7fdba20 <+0>:     mov    %esp,%eax
       0xb7fdba22 <+2>:     call   0xb7fdc5c0 <_dl_start>
    End of assembler dump.
    $1 = (<text variable, no debug info> *) 0x80482e0 <_start>

As you can see, $pc points to 0xb7fdba20 and the disassemble command
reports that it is part of _start, yet if I print the address of
_start I obtain a different value (0x80482e0). (Using the disassemble
command is just an example; simply printing $pc shows 0xb7fdba20
<_start>.)

Now the latter (0x80482e0) is what appears to be the *real* _start:

    vagrant@ubuntu-xenial:~$ readelf -s test | grep ' _start'
       60: 080482e0     0 FUNC    GLOBAL DEFAULT   14 _start

While the other address (0xb7fdba20) appears to be contained in
/lib/ld-linux.so.2:

    0xb7fdb860 - 0xb7ff48fd is .text in /lib/ld-linux.so.2

According to objdump -d /lib/ld-linux.so.2, $pc points at the
following instructions in the _dl_rtld_di_serinfo function:

    00000860 <_dl_rtld_di_serinfo@@GLIBC_PRIVATE-0x83c0>:
        [...]
        a20:       89 e0                   mov    %esp,%eax
        a22:       e8 99 0b 00 00          call   15c0 <realloc@plt+0xd80>

So where does the _start reported by the disassemble command come
from? Is this a GDB bug, or am I missing something here?

Back to the gdb-dashboard issue: I fetch the function address with
gdb.parse_and_eval(frame.name()).address, and since frame.name() is
_start, the address that I obtain is 0x80482e0 instead of 0xb7fdba20,
so I end up displaying wrong offsets.

Best,
Andrea

[1]: https://github.com/cyrus-and/gdb-dashboard/issues/70

Re: Function address incoherence

Florian Weimer-5
* Andrea Cardaci:

>     00000860 <_dl_rtld_di_serinfo@@GLIBC_PRIVATE-0x83c0>:

Please note the -0x83c0 offset.  The symbol information is not really
helpful, probably due to missing debugging information.

> So where does the _start reported by the disassemble command come
> from? Is this a GDB bug, or am I missing something here?

Both the main program and the dynamic loader have a _start symbol.  The
GDB behavior you observed is typical for such symbol conflicts.

Thanks,
Florian

Re: Function address incoherence

Andreas Schwab
In reply to this post by Andrea Cardaci
On Aug 24 2019, Andrea Cardaci <[hidden email]> wrote:

> As you can see, $pc points to 0xb7fdba20 and the disassemble command
> reports that it is part of _start, yet if I print the address of
> _start I obtain a different value (0x80482e0). (Using the disassemble
> command is just an example, simply printing $pc shows 0xb7fdba20
> <_start>.)

The disassembler can choose the nearest symbol, but the expression
evaluator needs to resolve the name to a single address, so it can only
choose one of them.
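
The asymmetry can be illustrated with a small stand-alone sketch. It is
only a toy model, not GDB's actual lookup code: the SYMBOLS table, the
object-file labels, and both helper functions are made up for
illustration, using the addresses from the session quoted earlier in the
thread and assuming the loader really does carry its own _start.

```python
import bisect

# Addresses taken from the session in this thread. The table layout and
# the object-file labels are illustrative assumptions, not GDB internals.
SYMBOLS = [
    (0x080482e0, "_start", "test"),
    (0xb7fdba20, "_start", "/lib/ld-linux.so.2"),
    (0xb7fdc5c0, "_dl_start", "/lib/ld-linux.so.2"),
]


def symbol_for_address(pc):
    """Address -> symbol: pick the nearest symbol at or below pc."""
    table = sorted(SYMBOLS)
    starts = [addr for addr, _, _ in table]
    index = bisect.bisect_right(starts, pc) - 1
    if index < 0:
        return None
    addr, name, objfile = table[index]
    return name, objfile, pc - addr


def address_for_name(name):
    """Name -> address: only one value can be returned, so the first match wins."""
    for addr, sym_name, _ in SYMBOLS:
        if sym_name == name:
            return addr
    return None


pc = 0xb7fdba22
name, objfile, offset = symbol_for_address(pc)
print(name, objfile, offset)        # the loader's _start, offset 2
print(hex(address_for_name(name)))  # the main program's _start instead
```

Going from the address to a name and back through the name lands on a
different _start, which is how an offset computed as
pc - address_for_name(name) ends up wrong.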

Andreas.

--
Andreas Schwab, SUSE Labs, [hidden email]
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."

Re: Function address incoherence

Andrea Cardaci
In reply to this post by Florian Weimer-5
On Mon, 26 Aug 2019 at 08:54, Florian Weimer <[hidden email]> wrote:

> Please note the -0x83c0 offset.  The symbol information is not really helpful, probably due to missing debugging information.

How can GDB come up with _start, then?

> Both the main program and the dynamic loader have a _start symbol.

How can I figure out that the dynamic loader has a _start symbol?
AFAIK there is none; I've checked with objdump, readelf, radare, etc.

Re: Function address incoherence

Andrea Cardaci
In reply to this post by Andreas Schwab
On Mon, 26 Aug 2019 at 09:55, Andreas Schwab <[hidden email]> wrote:

> The disassembler can choose the nearest symbol, but the expression
> evaluator needs to resolve the name to a single address, so it can only
> choose one of them.

Would it then be possible to obtain such a list using a GDB command or
the Python API?

Re: Function address incoherence

Christian Biesinger
In reply to this post by Andrea Cardaci
On Sat, Aug 24, 2019 at 5:27 AM Andrea Cardaci <[hidden email]> wrote:
> Back to the gdb-dashboard issue, I fetch the function address with
> gdb.parse_and_eval(frame.name()).address and since frame.name() is
> _start, the address that I obtain is 0x80482e0 instead of 0xb7fdba20
> thus I end up displaying wrong offsets.

Why don't you use frame.function() and get the address from there?

(and why parse_and_eval instead of lookup_symbol?)

On GDB trunk, you can look up symbols per-objfile, which can help if
multiple files have the same symbol.
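
A minimal sketch of the per-objfile idea, assuming a GDB build whose
Python API provides Objfile.lookup_global_symbol (which release first
ships it is not stated here). The helper name lookup_in_each_objfile is
hypothetical. Such a lookup works on symbols that come with debug
information, so for a _start that GDB reports as "no debug info" it may
well find nothing.

```python
def lookup_in_each_objfile(name, objfiles):
    """Map objfile filename -> symbol for every objfile that defines name.

    `objfiles` is any iterable of objects offering `.filename` and
    `.lookup_global_symbol(name)`; inside GDB, pass gdb.objfiles().
    """
    found = {}
    for objfile in objfiles:
        symbol = objfile.lookup_global_symbol(name)
        if symbol is not None:
            found[objfile.filename] = symbol
    return found
```

Inside a GDB session this would be called as
lookup_in_each_objfile('_start', gdb.objfiles()); taking the objfile list
as a parameter keeps the helper importable outside GDB.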

Christian

Re: Function address incoherence

Andrea Cardaci
On Mon, 26 Aug 2019 at 18:34, Christian Biesinger <[hidden email]> wrote:

> Why don't you use frame.function() and get the address from there?

This actually seems a good idea and I can't remember why I didn't end
up using that, let me try...

> (and why parse_and_eval instead of lookup_symbol?)

Because:

    >>> gdb.lookup_symbol('_start')
    (None, False)
    >>> gdb.parse_and_eval('_start')
    <gdb.Value object at 0x7f340fe3dcb0>

Maybe I need to specify a block other than the current one.

Re: Function address incoherence

Andrea Cardaci
On Mon, 26 Aug 2019 at 18:50, Andrea Cardaci <[hidden email]> wrote:

> This actually seems a good idea and I can't remember why I didn't end
> up using that, let me try...

Right, GDB does not always return symbol information. For example, in
the original use case frame.function() is None, yet GDB knows the
current function and its boundaries (according to disassemble and
print $pc).
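
One possible workaround, sketched under assumptions rather than taken
from gdb-dashboard: ask GDB's "info symbol" command about $pc and
subtract the offset it reports. The helper names parse_info_symbol and
function_start are hypothetical, and the output shapes listed in the
comments are assumed; they may differ across GDB versions or with
overlays.

```python
import re

# Assumed output shapes of `info symbol ADDR`:
#   "_start in section .text of /lib/ld-linux.so.2"
#   "_start + 2 in section .text of /lib/ld-linux.so.2"
#   "No symbol matches 0x0."
INFO_SYMBOL_RE = re.compile(r"^(?P<name>.+?)(?: \+ (?P<offset>\d+))? in section ")


def parse_info_symbol(text):
    """Return (name, offset) parsed from `info symbol` output, or None."""
    match = INFO_SYMBOL_RE.match(text.strip())
    if match is None:
        return None
    return match.group("name"), int(match.group("offset") or 0)


def function_start(pc):
    """Return (name, start_address) for the symbol covering pc, or None."""
    import gdb  # only importable inside a GDB session

    text = gdb.execute("info symbol {:#x}".format(pc), to_string=True)
    parsed = parse_info_symbol(text)
    if parsed is None:
        return None
    name, offset = parsed
    return name, pc - offset
```

With the session from the first message, function_start(frame.pc()) would
be expected to give the 0xb7fdba20 start rather than 0x80482e0, because
the lookup begins from the address instead of from the name.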