[PATCH] Fix stepping bug associated with non-contiguous blocks

classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|

[PATCH] Fix stepping bug associated with non-contiguous blocks

Kevin Buettner
I recently noticed the following behavior while debugging
dw2-ranges-func-low-cold.  This is one of the test programs associated
with the test gdb.dwarf2/dw2-ranges-func.exp.

(gdb) b 70
Breakpoint 1 at 0x401129: file dw2-ranges-func-lo-cold.c, line 70.
(gdb) run
Starting program: dw2-ranges-func-lo-cold

Breakpoint 1, foo ()
    at dw2-ranges-func-lo-cold.c:70
70  if (e) foo_cold (); /* foo foo_cold call */
(gdb) set var e=1
(gdb) step
[Inferior 1 (process 12545) exited normally]

This is incorrect.  When stepping, we expect a step to occur.  We do not
expect the program to exit.  Instead, we should see the following behavior:

...
(gdb) set var e=1
(gdb) step
foo ()
    at dw2-ranges-func-lo-cold.c:54
54  baz (); /* foo_cold baz call */

(Note that I've shortened the paths in the above sessions to improve
readability.)

The bug is in fill_in_stop_func() in infrun.c.  While working on
non-contiguous address range improvements in 2018, I replaced the
call to find_pc_partial_function() with a call to
find_function_entry_range_from_pc().  Although this seemed like the
right thing to do at the time, I now think that calling
find_pc_partial_function (along with some other tweaks) is the right
thing to do.

For blocks with a single contiguous range, these functions do pretty
much the same thing: when the function succeeds, the function name,
start address, and end address are all filled in.  Additionally,
find_pc_partial_function contains an additional output parameter
which is set to the block containing that PC.

For blocks with non-contiguous ranges, find_pc_partial_function
sets the start and end addresses to the start and end addresses
of the range containing the pc.  find_function_entry_range_from_pc
does what it says; it sets the start and end addresses to those
of the range containing the entry pc.

The reason that I had thought that using the entry pc range was
correct is due to the fact that fill_in_stop_func() contains some
code for advancing past the function start and entry point.  To do
this, we'd need the range that contains the entry pc.

However, when stepping, we actually want the range that contains the
stop pc.  If that range also contains the entry pc, we should then
attempt to advance stop_func_start past the start offset and entry
point.  (I haven't thought very hard about the reason for advancing
the stop_func_start in this manner.  Since it's been there for quite
a while, I'm assuming that it's still a good idea.)

Back when I wrote the test case, I had included a test for doing the
step shown in the example above.  I had problems with it, however.  At
the time, I thought it was due to differing compiler versions, so I
disabled that portion of the test.  I have now reenabled those tests,
but have left in place the logic which may be used to disable it.

The changes to dw2-ranges-func.exp depend on my other recent changes
to the file which have not been pushed yet.

Finally, I'll note that the only caller of
find_function_entry_range_from_pc() is/was fill_in_stop_func().  Once
this commit goes in, it'll be dead code.  I considered removing it,
but I think that it ought to be used (instead of
find_pc_partial_function) for determining the correct range to scan
for prologue analysis, so I'm going to leave it in place for now.

gdb/ChangeLog:

        * infrun.c (fill_in_stop_func): Use find_pc_partial_function
        instead of find_function_entry_range_from_pc.

testsuite/ChangeLog:

        * gdb.dwarf2/dw2-ranges-func.exp (enable_foo_cold_stepping):
        Enable tests associated with this flag.  Adjust regex
        referencing "foo_low" to now refer to "foo_cold" instead.
---
 gdb/infrun.c                                 | 37 ++++++++++++++------
 gdb/testsuite/gdb.dwarf2/dw2-ranges-func.exp | 37 +++++++++++++-------
 2 files changed, 51 insertions(+), 23 deletions(-)

diff --git a/gdb/infrun.c b/gdb/infrun.c
index 4fd92f1bac..3a2fe8f5a0 100644
--- a/gdb/infrun.c
+++ b/gdb/infrun.c
@@ -4115,18 +4115,35 @@ fill_in_stop_func (struct gdbarch *gdbarch,
 {
   if (!ecs->stop_func_filled_in)
     {
+      const block *block;
+
       /* Don't care about return value; stop_func_start and stop_func_name
  will both be 0 if it doesn't work.  */
-      find_function_entry_range_from_pc (ecs->event_thread->suspend.stop_pc,
-         &ecs->stop_func_name,
-         &ecs->stop_func_start,
- &ecs->stop_func_end);
-      ecs->stop_func_start
- += gdbarch_deprecated_function_start_offset (gdbarch);
-
-      if (gdbarch_skip_entrypoint_p (gdbarch))
- ecs->stop_func_start = gdbarch_skip_entrypoint (gdbarch,
- ecs->stop_func_start);
+      find_pc_partial_function (ecs->event_thread->suspend.stop_pc,
+ &ecs->stop_func_name,
+ &ecs->stop_func_start,
+ &ecs->stop_func_end,
+ &block);
+
+      /* The call to find_pc_partial_function, above, will set
+ stop_func_start and stop_func_end to the start and end
+ of the range containing the stop pc.  If this range
+ contains the entry pc for the block (which is always the
+ case for contiguous blocks), advance stop_func_start past
+ the function's start offset and entrypoint.  Note that
+ stop_func_start is NOT advanced when in a range of a
+ non-contiguous block that does not contain the entry pc.  */
+      if (block != nullptr
+  && ecs->stop_func_start <= BLOCK_ENTRY_PC (block)
+  && BLOCK_ENTRY_PC (block) < ecs->stop_func_end)
+ {
+  ecs->stop_func_start
+    += gdbarch_deprecated_function_start_offset (gdbarch);
+
+  if (gdbarch_skip_entrypoint_p (gdbarch))
+    ecs->stop_func_start
+      = gdbarch_skip_entrypoint (gdbarch, ecs->stop_func_start);
+ }
 
       ecs->stop_func_filled_in = 1;
     }
diff --git a/gdb/testsuite/gdb.dwarf2/dw2-ranges-func.exp b/gdb/testsuite/gdb.dwarf2/dw2-ranges-func.exp
index fdc488ae92..bd0564f188 100644
--- a/gdb/testsuite/gdb.dwarf2/dw2-ranges-func.exp
+++ b/gdb/testsuite/gdb.dwarf2/dw2-ranges-func.exp
@@ -395,20 +395,31 @@ proc do_test {suffix} {
  "foo(_label2)? \\(\\).*foo_cold \\(\\);.*foo foo_cold call.*" \
  "step out of bar to foo"
 
- # The tests in the "enable_foo_cold_stepping" section, below, work
- # with some versions of gcc, though it's not clear that they
- # should.  This test case causes foo_cold, originally a separate
- # function invoked via a subroutine call, to be considered as part
- # of foo via use of DW_AT_ranges.  Real code that I've looked at
- # uses a branch instruction to cause code in the "cold" range to
- # be executed.
+ # Tests in the "enable_foo_cold_stepping" section, below, did
+ # not work prior to July, 2019.  They had been disabled via
+ # use of the "enable_foo_cold_stepping" flag.
+ #
+ # As noted elsewhere, this test case causes foo_cold,
+ # originally a separate function invoked via a subroutine
+ # call, to be considered as part of foo via use of
+ # DW_AT_ranges.  Real code that I've looked at uses a branch
+ # instruction to cause code in the "cold" range to be
+ # executed.  These tests used to fail which is why they were
+ # disabled.
  #
- # For the moment though, these tests have been left in place, but
- # disabled, in case we decide that making such a subroutine call
- # is a reasonable thing to do that should also be supported by
- # GDB.
+ # After adding a "hi" cold test, I found that we were able to
+ # step into foo_cold from foo for the "hi" version, but for
+ # the "lo" version, GDB would run to either the next
+ # breakpoint or until the inferior exited when there were no
+ # breakpoints.  Not being able to step is definitely a bug
+ # even if it's unlikely that this problem would ever be hit in
+ # a real program.  Therefore, the bug was fixed in GDB and
+ # these tests are now enabled.
+ #
+ # I've left in place the flag (and test) which may be used to
+ # disable these tests.
 
- set enable_foo_cold_stepping false
+ set enable_foo_cold_stepping true
 
  if { $enable_foo_cold_stepping } {
     gdb_test_no_output "set variable e=1"
@@ -422,7 +433,7 @@ proc do_test {suffix} {
      "step to baz call in foo_cold"
 
  }
- -re "foo(_low)? \\(\\).*baz \\(\\);.*foo_cold baz call.*${gdb_prompt}" {
+ -re "foo(_cold)? \\(\\).*baz \\(\\);.*foo_cold baz call.*${gdb_prompt}" {
     pass $test
  }
     }
--
2.21.0

Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] Fix stepping bug associated with non-contiguous blocks

Tom Tromey-2
>>>>> "Kevin" == Kevin Buettner <[hidden email]> writes:

Kevin> gdb/ChangeLog:

Kevin> * infrun.c (fill_in_stop_func): Use find_pc_partial_function
Kevin> instead of find_function_entry_range_from_pc.

Thanks for the explanation.  I think this makes sense.

Kevin> +  ecs->stop_func_start
Kevin> +    += gdbarch_deprecated_function_start_offset (gdbarch);

This hook is only used by VAX.  I wonder if we could delete it somehow.

Tom
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] Fix stepping bug associated with non-contiguous blocks

Kevin Buettner
On Mon, 15 Jul 2019 09:21:26 -0600
Tom Tromey <[hidden email]> wrote:

> >>>>> "Kevin" == Kevin Buettner <[hidden email]> writes:  
>
> Kevin> gdb/ChangeLog:  
>
> Kevin> * infrun.c (fill_in_stop_func): Use find_pc_partial_function
> Kevin> instead of find_function_entry_range_from_pc.  
>
> Thanks for the explanation.  I think this makes sense.

Thanks for the review.

> Kevin> +  ecs->stop_func_start
> Kevin> +    += gdbarch_deprecated_function_start_offset (gdbarch);  
>
> This hook is only used by VAX.  I wonder if we could delete it somehow.

I had to refresh my memory on what this was about.  On VAX, the CALL
instruction's target consists of a 16-bit entry mask followed by the
instructions that we'd normally expect to see.  The entry mask specifies
the registers which should be pushed (plus some other stuff too, I think.)

Do we still care about the VAX target?

If not, once the VAX target is deleted, that hook could be deleted too.

Kevin
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] Fix stepping bug associated with non-contiguous blocks

Kevin Buettner
On Mon, 15 Jul 2019 14:11:16 -0700
Kevin Buettner <[hidden email]> wrote:

> Do we still care about the VAX target?

After spending a few minutes looking at things, I now have an
opinion...

VAX is, historically, a very important target.  For a long time (and
perhaps even now), it was considered to be the canonical example of a
CISC architecture.

Support still exists in GCC, though it's not clear to me if it
actually works.  I saw recent patches (from April, 2019) from someone
who is trying to get VAX support in GCC to work again.

Given that GCC still has code which provides VAX architecture
support, I think that GDB should do likewise.  I think that we
should attempt to not break it (anymore than it's already broken?),
but I also don't think we should attempt to test it beyond making
sure that it still builds.

Kevin
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] Fix stepping bug associated with non-contiguous blocks

Tom Tromey-2
>>>>> "Kevin" == Kevin Buettner <[hidden email]> writes:

Kevin> After spending a few minutes looking at things, I now have an
Kevin> opinion...

Kevin> VAX is, historically, a very important target.  For a long time (and
Kevin> perhaps even now), it was considered to be the canonical example of a
Kevin> CISC architecture.

Kevin> Support still exists in GCC, though it's not clear to me if it
Kevin> actually works.  I saw recent patches (from April, 2019) from someone
Kevin> who is trying to get VAX support in GCC to work again.

Kevin> Given that GCC still has code which provides VAX architecture
Kevin> support, I think that GDB should do likewise.  I think that we
Kevin> should attempt to not break it (anymore than it's already broken?),
Kevin> but I also don't think we should attempt to test it beyond making
Kevin> sure that it still builds.

This all sounds reasonable to me.  Having a hook like this stay around
isn't so very expensive.  Thanks for looking into this.

Tom
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] Fix stepping bug associated with non-contiguous blocks

Kevin Buettner
In reply to this post by Kevin Buettner
On Sat, 13 Jul 2019 17:01:08 -0700
Kevin Buettner <[hidden email]> wrote:

> gdb/ChangeLog:
>
> * infrun.c (fill_in_stop_func): Use find_pc_partial_function
> instead of find_function_entry_range_from_pc.
>
> testsuite/ChangeLog:
>
> * gdb.dwarf2/dw2-ranges-func.exp (enable_foo_cold_stepping):
> Enable tests associated with this flag.  Adjust regex
> referencing "foo_low" to now refer to "foo_cold" instead.

Pushed.

Kevin