gdbarch_init, ABI, and registers

Tim Newsome
Still working on RISC-V support…

I’ve taught OpenOCD to provide a target description, so now when connecting
to a target that doesn’t have an FPU, the FPU registers don’t show up.
Obviously I want this reflected in gdb, and I’ve managed to make that work
in some cases. However, it doesn’t always work. Sometimes my target appears
to revert to the case where it just uses the built-in list of registers,
despite being connected to an OpenOCD that I know provides the target
description.

My confusion comes from when riscv_gdbarch_init() is called, and how. With
a bit of instrumenting, I get this output:

+ /opt/riscv/bin/riscv64-unknown-elf-gdb
>>> riscv_gdbarch_init()
>>>     bits_per_word=64
>>>     tdesc_has_register() -> 0
>>>     info.abfd=(nil)
>>>     abi=1073741824
GNU gdb (GDB) 8.0.50.20170724-git
...
(gdb) target extended-remote localhost:38295

target extended-remote localhost:38295
Remote debugging using localhost:38295
>>> riscv_gdbarch_init()
>>>     bits_per_word=64
>>>     tdesc_has_register() -> 1
>>>     info.abfd=(nil)
>>>     abi=1073741824
warning: No executable has been specified and target does not support
determining executable automatically.  Try using the "file" command.
0x00000000800009f0 in ?? ()
(gdb) file HiFive1_debug-32
>>> riscv_gdbarch_init()

>>>     bits_per_word=32
>>>     tdesc_has_register() -> 1
>>>     info.abfd=0x55a8a1f4e7f0
>>>     flavour=5
>>>     abi=0
>>> riscv_gdbarch_init()
>>>     bits_per_word=32
>>>     tdesc_has_register() -> 0
>>>     info.abfd=0x55a8a1f4e7f0
>>>     flavour=5
>>>     abi=0
Reading symbols from HiFive1_debug-32...done.
(gdb) set arch riscv:rv32
>>> riscv_gdbarch_init()
>>>     bits_per_word=32
>>>     tdesc_has_register() -> 1
>>>     info.abfd=0x55a8a1f4e7f0
>>>     flavour=5
>>>     abi=0
The target architecture is assumed to be riscv:rv32

It almost feels like I’m on completely the wrong track by initializing
registers in riscv_gdbarch_init(), but that’s what the ARM and MIPS targets
do. A lot of code in their gdbarch_init() is dedicated to figuring out the
ABI in use. How are ABIs and registers related? In my world view the
hardware has certain registers, and how the software chooses to use those
registers is the ABI. But changing the ABI doesn’t change what registers
exist.

Are there callbacks I can use that are called when we connect to/disconnect
from a remote server like OpenOCD?

Thank you,
Tim


Handling language trampoline

Dmitry Antipov
When debugging a program which is definitely in C++:

(gdb) info source
[...skipped...]
Contains 66 lines.
Source language is c++.
Producer is clang version 6.0.0 (trunk 319884).
Compiled with DWARF 2 debugging format.
Does not include preprocessor macro info.

I've noticed that the 'step' command causes GDB to perform some ObjC-specific work:

#0  lookup_minimal_symbol (name=0x8459a1 "_objc_msgSend", sfile=sfile@entry=0x0, objf=objf@entry=0x0) at ../../gdb/minsyms.c:313
#1  0x00000000005bd0f9 in lookup_bound_minimal_symbol (name=<optimized out>) at ../../gdb/minsyms.c:432
#2  0x00000000005c1666 in find_objc_msgsend () at ../../gdb/objc-lang.c:1282
#3  find_objc_msgcall (pc=pc@entry=139646390853344, new_pc=0x7ffe10329598) at ../../gdb/objc-lang.c:1340
#4  0x00000000005c1820 in objc_skip_trampoline (frame=0x16a0eb0, stop_pc=139646390853344) at ../../gdb/objc-lang.c:313
#5  0x000000000059f1dc in skip_language_trampoline (frame=frame@entry=0x16a0eb0, pc=139646390853344) at ../../gdb/language.c:605
#6  0x0000000000597129 in process_event_stop_test (ecs=ecs@entry=0x7ffe10329d10) at ../../gdb/infrun.c:6706
#7  0x0000000000598a10 in handle_signal_stop (ecs=ecs@entry=0x7ffe10329d10) at ../../gdb/infrun.c:6163
#8  0x000000000059a178 in handle_inferior_event_1 (ecs=0x7ffe10329d10) at ../../gdb/infrun.c:5352
#9  handle_inferior_event (ecs=ecs@entry=0x7ffe10329d10) at ../../gdb/infrun.c:5387
#10 0x000000000059af78 in fetch_inferior_event (client_data=<optimized out>) at ../../gdb/infrun.c:3903
#11 0x000000000055c89d in gdb_wait_for_event (block=block@entry=0) at ../../gdb/event-loop.c:859
#12 0x000000000055ca6f in gdb_do_one_event () at ../../gdb/event-loop.c:322
#13 0x000000000055cb5e in gdb_do_one_event () at ../../gdb/event-loop.c:304
#14 start_event_loop () at ../../gdb/event-loop.c:371
#15 0x00000000005af348 in captured_command_loop () at ../../gdb/main.c:329
#16 0x00000000005b019d in captured_main (data=0x7ffe10329e30) at ../../gdb/main.c:1155
#17 gdb_main (args=args@entry=0x7ffe10329f60) at ../../gdb/main.c:1171
#18 0x0000000000408e15 in main (argc=<optimized out>, argv=<optimized out>) at ../../gdb/gdb.c:32

Why is that so, if the source language was recognized as C++?

Dmitry


Re: Handling language trampoline

Pedro Alves-7
On 12/07/2017 02:21 PM, Dmitry Antipov wrote:

> When debugging a program which is definitely in C++:
>
> (gdb) info source
> [...skipped...]
> Contains 66 lines.
> Source language is c++.
> Producer is clang version 6.0.0 (trunk 319884).
> Compiled with DWARF 2 debugging format.
> Does not include preprocessor macro info.
>
> I've noticed that 'step' command causes GDB to perform some
> ObjC-specific work:


...

> Why it is so if source language was recognized as C++?

A program is often composed of sources written in different
languages (C++, C, Asm, etc.).  Plus, a trampoline itself has no
symbol or language associated with it.

> #0  lookup_minimal_symbol (name=0x8459a1 "_objc_msgSend", sfile=sfile@entry=0x0, objf=objf@entry=0x0) at ../../gdb/minsyms.c:313
> #1  0x00000000005bd0f9 in lookup_bound_minimal_symbol (name=<optimized out>) at ../../gdb/minsyms.c:432
> #2  0x00000000005c1666 in find_objc_msgsend () at ../../gdb/objc-lang.c:1282
> #3  find_objc_msgcall (pc=pc@entry=139646390853344, new_pc=0x7ffe10329598) at ../../gdb/objc-lang.c:1340
> #4  0x00000000005c1820 in objc_skip_trampoline (frame=0x16a0eb0, stop_pc=139646390853344) at ../../gdb/objc-lang.c:313
> #5  0x000000000059f1dc in skip_language_trampoline (frame=frame@entry=0x16a0eb0, pc=139646390853344) at ../../gdb/language.c:605
> #6  0x0000000000597129 in process_event_stop_test (ecs=ecs@entry=0x7ffe10329d10) at ../../gdb/infrun.c:6706
> #7  0x0000000000598a10 in handle_signal_stop (ecs=ecs@entry=0x7ffe10329d10) at ../../gdb/infrun.c:6163

Above, frame #5:

/* Iterate through all registered languages looking for and calling
   any non-NULL struct language_defn.skip_trampoline() functions.
   Return the result from the first that returns non-zero, or 0 if all
   `fail'.  */
CORE_ADDR
skip_language_trampoline (struct frame_info *frame, CORE_ADDR pc)
{
  for (const auto &lang : languages)
    {
      if (lang->skip_trampoline != NULL)
        {
          CORE_ADDR real_pc = lang->skip_trampoline (frame, pc);

          if (real_pc)
            return real_pc;
        }
    }

  return 0;
}

Offhand, I don't see how GDB can know which is the right
language for the PC the program just stopped at, or whether
the program stopped inside a trampoline.  That's part of
each language's skip_trampoline job, so it seems reasonable
that GDB has to try them all.

I'm guessing those minsym lookups showed up high in a profile?
I guess that could be solved with some per-objfile
"minsym-of-_objc_msgSend" caching.  Something like
breakpoint.c:breakpoint_objfile_data.
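
To make the caching idea concrete, here is a minimal, self-contained sketch.
It is not GDB code: Objfile, lookup_minsym, MsgSendCacheEntry and
MsgSendLookup are hypothetical stand-ins invented for illustration, and the
real breakpoint_objfile_data mechanism is only the inspiration. The point is
that the by-name search runs at most once per objfile, and a miss is
remembered as well as a hit.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a GDB objfile: only a minimal-symbol table.
struct Objfile
{
  std::map<std::string, std::uint64_t> minsyms;
  int slow_lookups = 0;  // instrumentation for this sketch only
};

// The by-name search that we do not want to repeat on every step.
bool
lookup_minsym (Objfile &objf, const std::string &name, std::uint64_t *addr)
{
  ++objf.slow_lookups;
  auto it = objf.minsyms.find (name);
  if (it == objf.minsyms.end ())
    return false;
  *addr = it->second;
  return true;
}

// One cache slot per objfile; "searched" lets us remember misses too.
struct MsgSendCacheEntry
{
  bool searched = false;
  bool found = false;
  std::uint64_t address = 0;
};

// Caches the "_objc_msgSend" lookup result for each objfile.
class MsgSendLookup
{
public:
  bool find (Objfile &objf, std::uint64_t *addr)
  {
    assert (addr != nullptr);
    MsgSendCacheEntry &entry = cache_[&objf];
    if (!entry.searched)
      {
        entry.found = lookup_minsym (objf, "_objc_msgSend", &entry.address);
        entry.searched = true;
      }
    if (entry.found)
      *addr = entry.address;
    return entry.found;
  }

  // Call when an objfile goes away or its symbols are reloaded.
  void forget (const Objfile &objf) { cache_.erase (&objf); }

private:
  std::unordered_map<const Objfile *, MsgSendCacheEntry> cache_;
};
```

With this shape, a pure C++ program pays for one failed search per objfile
instead of one per step. The forget() hook matters: without it, a stale
entry would survive an objfile being unloaded or re-read.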

Thanks,
Pedro Alves


Re: gdbarch_init, ABI, and registers

Tim Newsome
In reply to this post by Tim Newsome
I’ve made some progress here.

gdb does keep track of the description for the current target, and it can
be retrieved by calling target_current_description(). My problems stemmed
from the fact that sometimes riscv_gdbarch_init() was called with an info
structure that did not have the current target description filled out. I
tracked this down to gdbarch_from_bfd(), which doesn’t set
info.target_desc. set_gdbarch_from_file() does do so (since 2008), and the
following patch makes everything work for me:

diff --git a/gdb/arch-utils.c b/gdb/arch-utils.c
index 2ae3413087..6c84f100af 100644
--- a/gdb/arch-utils.c
+++ b/gdb/arch-utils.c
@@ -600,6 +600,7 @@ gdbarch_from_bfd (bfd *abfd)
   gdbarch_info_init (&info);

   info.abfd = abfd;
+  info.target_desc = target_current_description ();
   return gdbarch_find_by_info (info);
 }

Does this seem like the right solution? A better one might be to put this
assignment in gdbarch_info_init(). Or I could just call
target_current_description() in riscv_gdbarch_init() when no target
description is passed in. The latter goes against the comment accompanying
target_current_description(), but it would only be a target-dependent
change.

Tim


On Wed, Dec 6, 2017 at 12:20 PM, Tim Newsome <[hidden email]> wrote:

> Still working on RISC-V support…
>
> I’ve taught OpenOCD to provide a target description, so now when
> connecting to a target that doesn’t have an FPU, the FPU registers don’t
> show up. Obviously I want this reflected in gdb, and I’ve managed to make
> that work in some cases. However, it doesn’t always work. Sometimes my
> target appears to revert to the case where it just uses the built-in list
> of registers, despite being connected to an OpenOCD that I know provides
> the target description.
>
> My confusion comes from when riscv_gdbarch_init() is called, and how. With
> a bit of instrumenting, I get this output:
>
> + /opt/riscv/bin/riscv64-unknown-elf-gdb
> >>> riscv_gdbarch_init()
> >>>     bits_per_word=64
> >>>     tdesc_has_register() -> 0
> >>>     info.abfd=(nil)
> >>>     abi=1073741824
> GNU gdb (GDB) 8.0.50.20170724-git
> ...
> (gdb) target extended-remote localhost:38295
>
> target extended-remote localhost:38295
> Remote debugging using localhost:38295
> >>> riscv_gdbarch_init()
> >>>     bits_per_word=64
> >>>     tdesc_has_register() -> 1
> >>>     info.abfd=(nil)
> >>>     abi=1073741824
> warning: No executable has been specified and target does not support
> determining executable automatically.  Try using the "file" command.
> 0x00000000800009f0 in ?? ()
> (gdb) file HiFive1_debug-32
> >>> riscv_gdbarch_init()
>
> >>>     bits_per_word=32
> >>>     tdesc_has_register() -> 1
> >>>     info.abfd=0x55a8a1f4e7f0
> >>>     flavour=5
> >>>     abi=0
> >>> riscv_gdbarch_init()
> >>>     bits_per_word=32
> >>>     tdesc_has_register() -> 0
> >>>     info.abfd=0x55a8a1f4e7f0
> >>>     flavour=5
> >>>     abi=0
> Reading symbols from HiFive1_debug-32...done.
> (gdb) set arch riscv:rv32
> >>> riscv_gdbarch_init()
> >>>     bits_per_word=32
> >>>     tdesc_has_register() -> 1
> >>>     info.abfd=0x55a8a1f4e7f0
> >>>     flavour=5
> >>>     abi=0
> The target architecture is assumed to be riscv:rv32
>
> It almost feels like I’m on completely the wrong track by initializing
> registers in riscv_gdbarch_init(), but that’s what the ARM and MIPS targets
> do. A lot of code in their gdbarch_init() is dedicated to figuring out the
> ABI in use. How are ABIs and registers related? In my world view the
> hardware has certain registers, and how the software chooses to use those
> registers is the ABI. But changing the ABI doesn’t change what registers
> exist.
>
> Are there callbacks I can use that are called when we connect
> to/disconnect from a remote server like OpenOCD?
>
> Thank you,
> Tim

Re: Handling language trampoline

Dmitry Antipov
In reply to this post by Pedro Alves-7
On 12/07/2017 05:58 PM, Pedro Alves wrote:

> I don't offhand see how can GDB know which is the right
> language for the current PC the program just stopped at, and
> if the program stopped inside a trampoline.  That's part of
> each language's skip_trampoline's job, so seems reasonable
> that GDB has to try them all.

I'm not an expert in this area either, but, in theory, what's the
problem if we have (presumably valid) DWARF info? Looking through
DWARF4 specs, each CU should have DW_AT_low_pc and DW_AT_high_pc;
so, if CU->DW_AT_low_pc <= current PC <= CU->DW_AT_high_pc, then
CU->DW_AT_language is the language in question, isn't it?
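
A rough sketch of that lookup, under stated assumptions: CompUnit,
find_cu_for_pc and language_for_pc are hypothetical names, not GDB or
libdwarf APIs, and each CU is assumed to cover one contiguous range. Note
that in DWARF 4 DW_AT_high_pc marks the first address past the end of the
range, so the upper comparison is strict, and a CU may use DW_AT_ranges
instead when its code is not contiguous.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified view of a compilation unit's DWARF attributes.
struct CompUnit
{
  std::uint64_t low_pc;   // DW_AT_low_pc: first address in the CU
  std::uint64_t high_pc;  // DW_AT_high_pc: first address past the CU
  std::string language;   // DW_AT_language, as a printable name
};

// Return the CU whose half-open range [low_pc, high_pc) contains PC,
// or nullptr when no CU covers it.
const CompUnit *
find_cu_for_pc (const std::vector<CompUnit> &cus, std::uint64_t pc)
{
  for (const CompUnit &cu : cus)
    {
      assert (cu.low_pc <= cu.high_pc);
      if (cu.low_pc <= pc && pc < cu.high_pc)
        return &cu;
    }
  return nullptr;
}

// Map a PC to a language name; "unknown" means no debug info covers it.
std::string
language_for_pc (const std::vector<CompUnit> &cus, std::uint64_t pc)
{
  const CompUnit *cu = find_cu_for_pc (cus, pc);
  return cu != nullptr ? cu->language : "unknown";
}
```

The "unknown" result is the case to watch: a PC that no CU covers gives
this approach nothing to go on.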

Dmitry

Re: Handling language trampoline

Pedro Alves-7
On 12/08/2017 06:39 AM, Dmitry Antipov wrote:

> On 12/07/2017 05:58 PM, Pedro Alves wrote:
>
>> I don't offhand see how can GDB know which is the right
>> language for the current PC the program just stopped at, and
>> if the program stopped inside a trampoline.  That's part of
>> each language's skip_trampoline's job, so seems reasonable
>> that GDB has to try them all.
>
> I'm not an expert in this area too, but, in theory, what's the
> problem if we have (presumably valid) DWARF info? Looking through
> DWARF4 specs, each CU should have DW_AT_low_pc and DW_AT_high_pc;
> so, if CU->DW_AT_low_pc <= current PC <= CU->DW_AT_high_pc, then
> CU->DW_AT_language is the language in question, isn't it?

Some trampolines are generated by the compiler/linker, and I
assume there's no debug info for them?  E.g., for C++, see
gnuv3_skip_trampoline, and virtual thunks.

I don't know whether that's the case for objc, but from:

 /* Determine if we are currently in the Objective-C dispatch function.
    If so, get the address of the method function that the dispatcher
    would call and use that as the function to step into instead.  Also
    skip over the trampoline for the function (if any).  This is better
    for the user since they are only interested in stepping into the
    method function anyway.  */
 static CORE_ADDR
 objc_skip_trampoline (struct frame_info *frame, CORE_ADDR stop_pc)
 {

I'd assume that the "dispatch function" is a part of the objc runtime and
that this is meant to work if there's no debug info for the objc runtime
installed.

Thanks,
Pedro Alves

Re: gdbarch_init, ABI, and registers

Ulrich Weigand
In reply to this post by Tim Newsome
Tim Newsome wrote:

> I've made some progress here.
>
> gdb does keep track of the description for the current target, and it can
> be retrieved by calling target_current_description(). My problems stemmed
> from the fact that sometimes riscv_gdbarch_init() was called with an info
> structure that did not have the current target description filled out. I
> tracked this down to gdbarch_from_bfd(), which doesn't set
> info.target_desc. set_gdbarch_from_file() does do so (since 2008), and the
> following patch makes everything work for me:
>
> diff --git a/gdb/arch-utils.c b/gdb/arch-utils.c
> index 2ae3413087..6c84f100af 100644
> --- a/gdb/arch-utils.c
> +++ b/gdb/arch-utils.c
> @@ -600,6 +600,7 @@ gdbarch_from_bfd (bfd *abfd)
>    gdbarch_info_init (&info);
>
>    info.abfd = abfd;
> +  info.target_desc = target_current_description ();
>    return gdbarch_find_by_info (info);
>  }
>
> Does this seem like the right solution? A better one might be to put this
> assignment in gdbarch_info_init(). Or I could just call
> target_current_description() in riscv_gdbarch_init() when no target
> description is passed in. The latter goes against the comment accompanying
> target_current_description(), but it would only be a target-dependent
> change.

No, this doesn't look correct to me.  Note that it is normal during
GDB operation that several different gdbarch objects are in use,
which have somewhat different contents and are used for different
purposes.

In particular, there are gdbarch objects that describe a *target*
(i.e. an inferior GDB is operating on) -- those use target descriptions,
and care about registers etc.

But then there are also gdbarch objects that are used solely when
looking at an *object file*, without involving any connection to a
target -- those don't have target descriptions, and don't care about
registers.

The second type is generated by gdbarch_from_bfd, so those *should*
not get any target description.  This function can be called even
when there is no target currently active at all.

[ As an aside, it would probably be cleaner to actually use two
different data structures for those different purposes in the first
place.  This hasn't been done yet, mostly because it would be a lot
of effort to update all places where platform-specific gdbarch
information is created ... ]


I guess I still haven't quite understood why exactly any of this
is causing a problem for you.  Yes, gdbarch objects returned from
gdbarch_from_bfd will not have correct register info.  But those
objects also should never be used in any context where registers
matter.  Can you be more specific what the actual problem you're
seeing is?

Bye,
Ulrich

--
  Dr. Ulrich Weigand
  GNU/Linux compilers and toolchain
  [hidden email]


Re: gdbarch_init, ABI, and registers

Tim Newsome
On Fri, Dec 8, 2017 at 4:20 AM, Ulrich Weigand <[hidden email]> wrote:

> No, this doesn't look correct to me.  Note that it is normal during
> GDB operation that several different gdbarch objects are in use,
> which have somewhat different contents and are used for different
> purposes.
>

Ah! This is super helpful, and what I was missing. I'm adding it as a
comment to `struct gdbarch`.

Am I right that riscv_gdbarch_init() can differentiate these two cases
based on whether a target description is passed in? E.g., if there is
a target description, register structures need to be set up, and if there
isn't, that's not necessary?

> I guess I still haven't quite understood why exactly any of this
> is causing a problem for you.  Yes, gdbarch objects returned from
> gdbarch_from_bfd will not have correct register info.  But those
> objects also should never be used in any context where registers
> matter.  Can you be more specific what the actual problem you're
> seeing is?
>

The problem I was seeing is that registers were showing up that shouldn't
have. The reason (as I understand it now) is that I was depending on global
variables in riscv-tdep.c instead of putting them in gdbarch.data. I'll
make that change, and hopefully then everything will be better.

Thank you,
Tim

Re: gdbarch_init, ABI, and registers

Ulrich Weigand
Tim Newsome wrote:

> Am I right in that riscv_gdbarch_init() can differentiate these two cases
> based on whether a target description is passed in or not? Eg. if there is
> a target description, register structures need to be set up, and if there
> isn't then that's not necessary?

Yes, exactly.
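
As a rough illustration of that distinction (a toy model, not the real
gdbarch API: TargetDesc, ArchInfo, Arch and arch_init are all hypothetical
names), an init routine can branch on whether the incoming info carries a
target description:

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a target description: just register names.
struct TargetDesc
{
  std::vector<std::string> registers;
};

// Hypothetical stand-in for the info structure handed to an init routine.
struct ArchInfo
{
  int bits_per_word;
  const TargetDesc *target_desc;  // nullptr when built from a file only
};

// Hypothetical stand-in for the resulting architecture object.
struct Arch
{
  int bits_per_word = 0;
  bool describes_target = false;
  std::vector<std::string> registers;
};

// Set up register data only when a target description is supplied.
std::unique_ptr<Arch>
arch_init (const ArchInfo &info)
{
  assert (info.bits_per_word == 32 || info.bits_per_word == 64);
  std::unique_ptr<Arch> arch (new Arch ());
  arch->bits_per_word = info.bits_per_word;
  arch->describes_target = info.target_desc != nullptr;
  if (info.target_desc != nullptr)
    arch->registers = info.target_desc->registers;
  return arch;
}
```

An object built for a file alone ends up with an empty register list here,
which is harmless as long as it is never used where registers matter.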

> The problem I was seeing is that registers were showing up which shouldn't.
> The reason (as I understand it now) is that I was depending on global
> variables in riscv-tdep.c instead of putting them in gdbarch.data. I'll
> make that change, and hopefully then everything will be better.

I see.  Global variables would indeed explain the problem ...
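
One plausible way this goes wrong, sketched with hypothetical names
(g_registers, ArchData and make_arch are illustrative only, not taken from
riscv-tdep.c): every init call overwrites the same file-scope variable, so
whichever architecture object was created last decides what all of them
appear to contain. Data stored in the object itself does not have that
problem.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Problematic pattern: one file-scope list shared by every arch object.
static std::vector<std::string> g_registers;

// Preferred pattern: each architecture object carries its own list.
struct ArchData
{
  std::vector<std::string> registers;
};

// Models an init routine that (for comparison) records the register list
// both ways: in the shared global and in the per-object data.
ArchData
make_arch (const std::vector<std::string> &regs)
{
  assert (!regs.empty ());
  g_registers = regs;     // clobbered by every later call
  ArchData arch;
  arch.registers = regs;  // stays with this object
  return arch;
}
```

If the first call describes a target without an FPU and a later call
falls back to a built-in list that includes FPU registers, the global then
reports the FPU registers even for the first object, while its per-object
list is still correct.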


Bye,
Ulrich

--
  Dr. Ulrich Weigand
  GNU/Linux compilers and toolchain
  [hidden email]