[PATCH 0/3] Fix crash if connection drops in scoped_restore_current_thread's ctor

classic Classic list List threaded Threaded
23 messages Options
12
Reply | Threaded
Open this post in threaded view
|

[PATCH 0/3] Fix crash if connection drops in scoped_restore_current_thread's ctor

Pedro Alves-2
(I have internet again: found a sim card of a different operator that
works.  This will do until the communications tower near me is
repaired and get I fiber back...)

This series fixes the crashes exposed by the
gdb.multi/multi-target.exp testcase when run against an Asan-enabled
GDB build, initially reported by Simon here:

  https://sourceware.org/pipermail/gdb-patches/2020-July/170222.html

The first two patches fix the crashes, and we should probably put them
in GDB 10.

The last patch is a follow up that avoids swallowing exceptions in
scoped_restore_current_thread's dtor that I'm thinking would be a bit
too invasive to put in GDB 10, I think it could do with a longer
baking period in master.

Pedro Alves (3):
  Fix crash if connection drops in scoped_restore_current_thread's ctor,
    part 1
  Fix crash if connection drops in scoped_restore_current_thread's ctor,
    part 2
  Make scoped_restore_current_thread's cdtors exception free (RFC)

 gdb/blockframe.c            |  6 +---
 gdb/dwarf2/frame-tailcall.c | 18 +++++++++--
 gdb/frame.c                 | 73 ++++++++++++++++++++++++++++++-------------
 gdb/frame.h                 | 22 ++++++++++---
 gdb/gdbthread.h             |  4 +++
 gdb/stack.c                 |  9 +++---
 gdb/thread.c                | 76 ++++++++++++++++-----------------------------
 gdb/value.c                 | 13 +++++++-
 8 files changed, 132 insertions(+), 89 deletions(-)


base-commit: ad8464f799a4c96c7ab8bdfec3f95846cf54f9b0
prerequisite-patch-id: 32ffdda7d7d774bc4df88bf848bcb796559b53ce
prerequisite-patch-id: 02021b74355b70debd344a6e445285c67dfef7d6
prerequisite-patch-id: c87fcf5a54f6805967cbf8ab107606c57d9ecf52
prerequisite-patch-id: ac7dee583d0ffa519c9d1cd89d27664bca68d8c1
prerequisite-patch-id: eac59ae2ea85d2d51e5be1b03e88a5641cc12c22
prerequisite-patch-id: 13da42ad04dc8e2e3bd6a556a0be0e17cf23669b
prerequisite-patch-id: fd3f09fdb58ddc1c595ea014716851f4c8fca48c
--
2.14.5

Reply | Threaded
Open this post in threaded view
|

[PATCH 1/3] Fix crash if connection drops in scoped_restore_current_thread's ctor, part 1

Pedro Alves-2
Running the testsuite against an Asan-enabled build of GDB makes
gdb.base/multi-target.exp expose this bug.

scoped_restore_current_thread's ctor calls get_frame_id to record the
selected frame's ID to restore later.  If the frame ID hasn't been
computed yet, it will be computed on the spot, and that will usually
require accessing the target's memory and registers, which requires
remote accesses.  If the remote connection closes while we're
computing the frame ID, the remote target exits its inferiors,
unpushes itself, and throws a TARGET_CLOSE_ERROR error.

If that happens, GDB can currently crash, here:

> ==18555==ERROR: AddressSanitizer: heap-use-after-free on address 0x621004670aa8 at pc 0x0000007ab125 bp 0x7ffdecaecd20 sp 0x7ffdecaecd10
> READ of size 4 at 0x621004670aa8 thread T0
>     #0 0x7ab124 in dwarf2_frame_this_id src/binutils-gdb/gdb/dwarf2/frame.c:1228
>     #1 0x983ec5 in compute_frame_id src/binutils-gdb/gdb/frame.c:550
>     #2 0x9841ee in get_frame_id(frame_info*) src/binutils-gdb/gdb/frame.c:582
>     #3 0x1093faa in scoped_restore_current_thread::scoped_restore_current_thread() src/binutils-gdb/gdb/thread.c:1462
>     #4 0xaee5ba in fetch_inferior_event(void*) src/binutils-gdb/gdb/infrun.c:3968
>     #5 0xaa990b in inferior_event_handler(inferior_event_type, void*) src/binutils-gdb/gdb/inf-loop.c:43
>     #6 0xea61b6 in remote_async_serial_handler src/binutils-gdb/gdb/remote.c:14161
>     #7 0xefca8a in run_async_handler_and_reschedule src/binutils-gdb/gdb/ser-base.c:137
>     #8 0xefcd23 in fd_event src/binutils-gdb/gdb/ser-base.c:188
>     #9 0x15a7416 in handle_file_event src/binutils-gdb/gdbsupport/event-loop.cc:548
>     #10 0x15a7c36 in gdb_wait_for_event src/binutils-gdb/gdbsupport/event-loop.cc:673
>     #11 0x15a5dbb in gdb_do_one_event() src/binutils-gdb/gdbsupport/event-loop.cc:215
>     #12 0xbfe62d in start_event_loop src/binutils-gdb/gdb/main.c:356
>     #13 0xbfe935 in captured_command_loop src/binutils-gdb/gdb/main.c:416
>     #14 0xc01d39 in captured_main src/binutils-gdb/gdb/main.c:1253
>     #15 0xc01dc9 in gdb_main(captured_main_args*) src/binutils-gdb/gdb/main.c:1268
>     #16 0x414ddd in main src/binutils-gdb/gdb/gdb.c:32
>     #17 0x7f590110b82f in __libc_start_main ../csu/libc-start.c:291
>     #18 0x414bd8 in _start (build/binutils-gdb/gdb/gdb+0x414bd8)

What happens is that above, we're in dwarf2_frame_this_id, just after
the dwarf2_frame_cache call.  The "cache" variable that the
dwarf2_frame_cache function returned is already stale.  It's been
released here, from within the dwarf2_frame_cache:

(top-gdb) bt
#0  reinit_frame_cache () at src/gdb/frame.c:1855
#1  0x00000000014ff7b0 in switch_to_no_thread () at src/gdb/thread.c:1301
#2  0x0000000000f66d3e in switch_to_inferior_no_thread (inf=0x615000338180) at src/gdb/inferior.c:626
#3  0x00000000012f3826 in remote_unpush_target (target=0x6170000c5900) at src/gdb/remote.c:5521
#4  0x00000000013097e0 in remote_target::readchar (this=0x6170000c5900, timeout=2) at src/gdb/remote.c:9137
#5  0x000000000130be4d in remote_target::getpkt_or_notif_sane_1 (this=0x6170000c5900, buf=0x6170000c5918, forever=0, expecting_notif=0, is_notif=0x0) at src/gdb/remote.c:9683
#6  0x000000000130c8ab in remote_target::getpkt_sane (this=0x6170000c5900, buf=0x6170000c5918, forever=0) at src/gdb/remote.c:9790
#7  0x000000000130bc0d in remote_target::getpkt (this=0x6170000c5900, buf=0x6170000c5918, forever=0) at src/gdb/remote.c:9623
#8  0x000000000130838e in remote_target::remote_read_bytes_1 (this=0x6170000c5900, memaddr=0x7fffffffcdc0, myaddr=0x6080000ad3bc "", len_units=64, unit_size=1, xfered_len_units=0x7fff6a29b9a0) at src/gdb/remote.c:8860
#9  0x0000000001308bd2 in remote_target::remote_read_bytes (this=0x6170000c5900, memaddr=0x7fffffffcdc0, myaddr=0x6080000ad3bc "", len=64, unit_size=1, xfered_len=0x7fff6a29b9a0) at src/gdb/remote.c:8987
#10 0x0000000001311ed1 in remote_target::xfer_partial (this=0x6170000c5900, object=TARGET_OBJECT_MEMORY, annex=0x0, readbuf=0x6080000ad3bc "", writebuf=0x0, offset=140737488342464, len=64, xfered_len=0x7fff6a29b9a0) at src/gdb/remote.c:10988
#11 0x00000000014ba969 in raw_memory_xfer_partial (ops=0x6170000c5900, readbuf=0x6080000ad3bc "", writebuf=0x0, memaddr=140737488342464, len=64, xfered_len=0x7fff6a29b9a0) at src/gdb/target.c:918
#12 0x00000000014bb720 in target_xfer_partial (ops=0x6170000c5900, object=TARGET_OBJECT_RAW_MEMORY, annex=0x0, readbuf=0x6080000ad3bc "", writebuf=0x0, offset=140737488342464, len=64, xfered_len=0x7fff6a29b9a0) at src/gdb/target.c:1148
#13 0x00000000014bc3b5 in target_read_partial (ops=0x6170000c5900, object=TARGET_OBJECT_RAW_MEMORY, annex=0x0, buf=0x6080000ad3bc "", offset=140737488342464, len=64, xfered_len=0x7fff6a29b9a0) at src/gdb/target.c:1380
#14 0x00000000014bc593 in target_read (ops=0x6170000c5900, object=TARGET_OBJECT_RAW_MEMORY, annex=0x0, buf=0x6080000ad3bc "", offset=140737488342464, len=64) at src/gdb/target.c:1419
#15 0x00000000014bbd4d in target_read_raw_memory (memaddr=0x7fffffffcdc0, myaddr=0x6080000ad3bc "", len=64) at src/gdb/target.c:1252
#16 0x0000000000bf27df in dcache_read_line (dcache=0x6060001eddc0, db=0x6080000ad3a0) at src/gdb/dcache.c:336
#17 0x0000000000bf2b72 in dcache_peek_byte (dcache=0x6060001eddc0, addr=0x7fffffffcdd8, ptr=0x6020001231b0 "") at src/gdb/dcache.c:403
#18 0x0000000000bf3103 in dcache_read_memory_partial (ops=0x6170000c5900, dcache=0x6060001eddc0, memaddr=0x7fffffffcdd8, myaddr=0x6020001231b0 "", len=8, xfered_len=0x7fff6a29bf20) at src/gdb/dcache.c:484
#19 0x00000000014bafe9 in memory_xfer_partial_1 (ops=0x6170000c5900, object=TARGET_OBJECT_STACK_MEMORY, readbuf=0x6020001231b0 "", writebuf=0x0, memaddr=140737488342488, len=8, xfered_len=0x7fff6a29bf20) at src/gdb/target.c:1034
#20 0x00000000014bb212 in memory_xfer_partial (ops=0x6170000c5900, object=TARGET_OBJECT_STACK_MEMORY, readbuf=0x6020001231b0 "", writebuf=0x0, memaddr=140737488342488, len=8, xfered_len=0x7fff6a29bf20) at src/gdb/target.c:1076
#21 0x00000000014bb6b3 in target_xfer_partial (ops=0x6170000c5900, object=TARGET_OBJECT_STACK_MEMORY, annex=0x0, readbuf=0x6020001231b0 "", writebuf=0x0, offset=140737488342488, len=8, xfered_len=0x7fff6a29bf20) at src/gdb/target.c:1133
#22 0x000000000164564d in read_value_memory (val=0x60f000029440, bit_offset=0, stack=1, memaddr=0x7fffffffcdd8, buffer=0x6020001231b0 "", length=8) at src/gdb/valops.c:956
#23 0x0000000001680fff in value_fetch_lazy_memory (val=0x60f000029440) at src/gdb/value.c:3764
#24 0x0000000001681efd in value_fetch_lazy (val=0x60f000029440) at src/gdb/value.c:3910
#25 0x0000000001676143 in value_optimized_out (value=0x60f000029440) at src/gdb/value.c:1411
#26 0x0000000000e0fcb8 in frame_register_unwind (next_frame=0x6210066bfde0, regnum=16, optimizedp=0x7fff6a29c200, unavailablep=0x7fff6a29c240, lvalp=0x7fff6a29c2c0, addrp=0x7fff6a29c300, realnump=0x7fff6a29c280, bufferp=0x7fff6a29c3a0 "@\304)j\377\177") at src/gdb/frame.c:1144
#27 0x0000000000e10418 in frame_unwind_register (next_frame=0x6210066bfde0, regnum=16, buf=0x7fff6a29c3a0 "@\304)j\377\177") at src/gdb/frame.c:1196
#28 0x0000000000f00431 in i386_unwind_pc (gdbarch=0x6210043d0110, next_frame=0x6210066bfde0) at src/gdb/i386-tdep.c:1969
#29 0x0000000000e39724 in gdbarch_unwind_pc (gdbarch=0x6210043d0110, next_frame=0x6210066bfde0) at src/gdb/gdbarch.c:3056
#30 0x0000000000c2ea90 in dwarf2_tailcall_sniffer_first (this_frame=0x6210066bfde0, tailcall_cachep=0x6210066bfee0, entry_cfa_sp_offsetp=0x0) at src/gdb/dwarf2/frame-tailcall.c:423
#31 0x0000000000c36bdb in dwarf2_frame_cache (this_frame=0x6210066bfde0, this_cache=0x6210066bfdf8) at src/gdb/dwarf2/frame.c:1198
#32 0x0000000000c36eb3 in dwarf2_frame_this_id (this_frame=0x6210066bfde0, this_cache=0x6210066bfdf8, this_id=0x6210066bfe40) at src/gdb/dwarf2/frame.c:1226

Note that remote_target::readchar in frame #3 throws
TARGET_CLOSE_ERROR after that remote_unpush_target in frame #3
returns.

The problem is that that TARGET_CLOSE_ERROR is swallowed by
value_optimized_out in frame #25.

If we fix that one, then we run into dwarf2_tailcall_sniffer_first
swallowing the exception in frame #30 too.

The attached patch fixes it by making those spots swallow fewer kinds
of errors.

gdb/ChangeLog:

        * frame-tailcall.c (dwarf2_tailcall_sniffer_first): Only swallow
        NO_ENTRY_VALUE_ERROR.
        * value.c (value_optimized_out): Only swallow OPTIMIZED_OUT_ERROR.
---
 gdb/dwarf2/frame-tailcall.c | 18 ++++++++++++++++--
 gdb/value.c                 | 13 ++++++++++++-
 2 files changed, 28 insertions(+), 3 deletions(-)

diff --git a/gdb/dwarf2/frame-tailcall.c b/gdb/dwarf2/frame-tailcall.c
index 16dba2b201..ec6ed6bb00 100644
--- a/gdb/dwarf2/frame-tailcall.c
+++ b/gdb/dwarf2/frame-tailcall.c
@@ -377,7 +377,6 @@ dwarf2_tailcall_sniffer_first (struct frame_info *this_frame,
      get_frame_address_in_block will decrease it by 1 in such case.  */
   this_pc = get_frame_address_in_block (this_frame);
 
-  /* Catch any unwinding errors.  */
   try
     {
       int sp_regnum;
@@ -439,7 +438,22 @@ dwarf2_tailcall_sniffer_first (struct frame_info *this_frame,
     {
       if (entry_values_debug)
  exception_print (gdb_stdout, except);
-      return;
+
+      switch (except.error)
+ {
+ case NO_ENTRY_VALUE_ERROR:
+  /* Thrown by call_site_find_chain.  */
+ case MEMORY_ERROR:
+ case OPTIMIZED_OUT_ERROR:
+ case NOT_AVAILABLE_ERROR:
+  /* These can normally happen when we try to access an
+     optimized out or unavailable register, either in a
+     physical register or spilled to memory.  */
+  return;
+ }
+
+      /* Let unexpected errors propagate.  */
+      throw;
     }
 
   /* Ambiguous unwind or unambiguous unwind verified as matching.  */
diff --git a/gdb/value.c b/gdb/value.c
index 97a099ddbd..00d8ded2ae 100644
--- a/gdb/value.c
+++ b/gdb/value.c
@@ -1412,7 +1412,18 @@ value_optimized_out (struct value *value)
  }
       catch (const gdb_exception_error &ex)
  {
-  /* Fall back to checking value->optimized_out.  */
+  switch (ex.error)
+    {
+    case MEMORY_ERROR:
+    case OPTIMIZED_OUT_ERROR:
+    case NOT_AVAILABLE_ERROR:
+      /* These can normally happen when we try to access an
+ optimized out or unavailable register, either in a
+ physical register or spilled to memory.  */
+      break;
+    default:
+      throw;
+    }
  }
     }
 
--
2.14.5

Reply | Threaded
Open this post in threaded view
|

[PATCH 2/3] Fix crash if connection drops in scoped_restore_current_thread's ctor, part 2

Pedro Alves-2
In reply to this post by Pedro Alves-2
Running the testsuite against an Asan-enabled build of GDB makes
gdb.base/multi-target.exp expose this bug.

scoped_restore_current_thread's ctor calls get_frame_id to record the
selected frame's ID to restore later.  If the frame ID hasn't been
computed yet, it will be computed on the spot, and that will usually
require accessing the target's memory and registers.  If the remote
connection closes, while we're computing the frame ID, the remote
target exits its inferiors, unpushes itself, and throws a
TARGET_CLOSE_ERROR error.  Exiting the inferiors deletes the
inferior's threads.

scoped_restore_current_thread increments the current thread's refcount
to prevent the thread from being deleted from under its feet.
However, the code that does that isn't considering the case of the
thread being deleted from within get_frame_id.  It only increments the
refcount _after_ get_frame_id returns.  So if the current thread is
indeed deleted, the

     tp->incref ();

statement references a stale TP pointer.

Incrementing the refcounts earlier fixes it.

We should probably also let the TARGET_CLOSE_ERROR error propagate in
this case.  That alone would fix it, though it seems better to tweak
the refcount handling too.

gdb/ChangeLog:

        * thread.c
        (scoped_restore_current_thread::scoped_restore_current_thread):
        Incref the thread before calling get_frame_id instead of after.
        Let TARGET_CLOSE_ERROR propagate.
---
 gdb/thread.c | 19 +++++++++++--------
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/gdb/thread.c b/gdb/thread.c
index f0722d3588..1ec047e35b 100644
--- a/gdb/thread.c
+++ b/gdb/thread.c
@@ -1433,15 +1433,17 @@ scoped_restore_current_thread::~scoped_restore_current_thread ()
 
 scoped_restore_current_thread::scoped_restore_current_thread ()
 {
-  m_thread = NULL;
   m_inf = current_inferior ();
+  m_inf->incref ();
 
   if (inferior_ptid != null_ptid)
     {
-      thread_info *tp = inferior_thread ();
+      m_thread = inferior_thread ();
+      m_thread->incref ();
+
       struct frame_info *frame;
 
-      m_was_stopped = tp->state == THREAD_STOPPED;
+      m_was_stopped = m_thread->state == THREAD_STOPPED;
       if (m_was_stopped
   && target_has_registers
   && target_has_stack
@@ -1466,13 +1468,14 @@ scoped_restore_current_thread::scoped_restore_current_thread ()
  {
   m_selected_frame_id = null_frame_id;
   m_selected_frame_level = -1;
- }
 
-      tp->incref ();
-      m_thread = tp;
+  /* Better let this propagate.  */
+  if (ex.error == TARGET_CLOSE_ERROR)
+    throw;
+ }
     }
-
-  m_inf->incref ();
+  else
+    m_thread = NULL;
 }
 
 /* See gdbthread.h.  */
--
2.14.5

Reply | Threaded
Open this post in threaded view
|

[PATCH 3/3] Make scoped_restore_current_thread's cdtors exception free (RFC)

Pedro Alves-2
In reply to this post by Pedro Alves-2
If the remote target closes while we're reading registers/memory for
restoring the selected frame in scoped_restore_current_thread's dtor,
the corresponding TARGET_CLOSE_ERROR error is swallowed by the
scoped_restore_current_thread's dtor, because letting exceptions
escape from a dtor is bad.  It isn't great to lose that errors like
that, though.  I've been thinking about how to avoid it, and I came up
with this patch.

The idea here is to make scoped_restore_current_thread's dtor do as
little as possible, to avoid any work that might throw in the first
place.  And to do that, instead of having the dtor call
restore_selected_frame, which re-finds the previously selected frame,
just record the frame_id/level of the desired selected frame, and have
get_selected_frame find the frame the next time it is called.  In
effect, this implements most of Cagney's suggestion, here:

  /* On demand, create the selected frame and then return it.  If the
     selected frame can not be created, this function prints then throws
     an error.  When MESSAGE is non-NULL, use it for the error message,
     otherwize use a generic error message.  */
  /* FIXME: cagney/2002-11-28: At present, when there is no selected
     frame, this function always returns the current (inner most) frame.
     It should instead, when a thread has previously had its frame
     selected (but not resumed) and the frame cache invalidated, find
     and then return that thread's previously selected frame.  */
  extern struct frame_info *get_selected_frame (const char *message);

The only thing missing to fully implement that would be to make
reinit_frame_cache just clear selected_frame instead of calling
select_frame(NULL), and the call select_frame(NULL) explicitly in the
places where we really wanted reinit_frame_cache to go back to the
current frame too.  That can done separately, though, I'm not
proposing to do that in this patch.

restore_selected_frame should really move from thread.c to frame.c,
but I didn't do that here, just to avoid churn in the patch while it
collects comments.  I will do that as a preparatory patch if people
agree with this approach.

Incidentally, this patch alone would fix the crashes fixed by the
previous patches in the series, because with this,
scoped_restore_current_thread's constructor doesn't throw either.
---
 gdb/blockframe.c |  6 +----
 gdb/frame.c      | 73 ++++++++++++++++++++++++++++++++++++++++----------------
 gdb/frame.h      | 22 +++++++++++++----
 gdb/gdbthread.h  |  4 ++++
 gdb/stack.c      |  9 ++++---
 gdb/thread.c     | 67 ++++++++++++++++-----------------------------------
 6 files changed, 98 insertions(+), 83 deletions(-)

diff --git a/gdb/blockframe.c b/gdb/blockframe.c
index 05c26bc2c2..706b6db92c 100644
--- a/gdb/blockframe.c
+++ b/gdb/blockframe.c
@@ -448,14 +448,10 @@ find_gnu_ifunc_target_type (CORE_ADDR resolver_funaddr)
 struct frame_info *
 block_innermost_frame (const struct block *block)
 {
-  struct frame_info *frame;
-
   if (block == NULL)
     return NULL;
 
-  frame = get_selected_frame_if_set ();
-  if (frame == NULL)
-    frame = get_current_frame ();
+  frame_info *frame = get_selected_frame (NULL);
   while (frame != NULL)
     {
       const struct block *frame_block = get_frame_block (frame, NULL);
diff --git a/gdb/frame.c b/gdb/frame.c
index ff27b9f00e..e7dffd204b 100644
--- a/gdb/frame.c
+++ b/gdb/frame.c
@@ -297,17 +297,15 @@ frame_stash_invalidate (void)
 /* See frame.h  */
 scoped_restore_selected_frame::scoped_restore_selected_frame ()
 {
-  m_fid = get_frame_id (get_selected_frame (NULL));
+  get_selected_frame_info (&m_fid, &m_level);
 }
 
 /* See frame.h  */
 scoped_restore_selected_frame::~scoped_restore_selected_frame ()
 {
-  frame_info *frame = frame_find_by_id (m_fid);
-  if (frame == NULL)
-    warning (_("Unable to restore previously selected frame."));
-  else
-    select_frame (frame);
+  /* Use the lazy variant because we don't want to do any work here
+     that might throw, since we're in a dtor.  */
+  select_frame_lazy (m_fid, m_level);
 }
 
 /* Flag to control debugging.  */
@@ -1641,9 +1639,24 @@ get_current_frame (void)
 }
 
 /* The "selected" stack frame is used by default for local and arg
-   access.  May be zero, for no selected frame.  */
-
+   access.  */
+
+/* If SELECTED_FRAME is NULL, then the selected frame is found in
+   SELECTED_FRAME_ID / SELECTED_FRAME_LEVEL.  Those are used by
+   get_selected_frame to re-find the selected frame.  If
+   SELECTED_FRAME_ID / SELECTED_FRAME_LEVEL are null_ptid / -1, there
+   is no selected frame, in which case the next time
+   get_selected_frame is called, it selects the current frame.  */
 static struct frame_info *selected_frame;
+static frame_id selected_frame_id = null_frame_id;
+static int selected_frame_level = -1;
+
+void
+get_selected_frame_info (frame_id *frame_id, int *level)
+{
+  *frame_id = selected_frame_id;
+  *level = selected_frame_level;
+}
 
 int
 has_stack_frames (void)
@@ -1671,6 +1684,8 @@ has_stack_frames (void)
   return 1;
 }
 
+void restore_selected_frame (struct frame_id a_frame_id, int frame_level);
+
 /* Return the selected frame.  Always non-NULL (unless there isn't an
    inferior sufficient for creating a frame) in which case an error is
    thrown.  */
@@ -1682,24 +1697,14 @@ get_selected_frame (const char *message)
     {
       if (message != NULL && !has_stack_frames ())
  error (("%s"), message);
-      /* Hey!  Don't trust this.  It should really be re-finding the
- last selected frame of the currently selected thread.  This,
- though, is better than nothing.  */
-      select_frame (get_current_frame ());
+
+      restore_selected_frame (selected_frame_id, selected_frame_level);
     }
   /* There is always a frame.  */
   gdb_assert (selected_frame != NULL);
   return selected_frame;
 }
 
-/* If there is a selected frame, return it.  Otherwise, return NULL.  */
-
-struct frame_info *
-get_selected_frame_if_set (void)
-{
-  return selected_frame;
-}
-
 /* This is a variant of get_selected_frame() which can be called when
    the inferior does not have a frame; in that case it will return
    NULL instead of calling error().  */
@@ -1712,12 +1717,26 @@ deprecated_safe_get_selected_frame (void)
   return get_selected_frame (NULL);
 }
 
-/* Select frame FI (or NULL - to invalidate the current frame).  */
+/* Select frame FI (or NULL - to invalidate the selected frame).  */
 
 void
 select_frame (struct frame_info *fi)
 {
   selected_frame = fi;
+  selected_frame_level = frame_relative_level (fi);
+  if (selected_frame_level == 0)
+    {
+      /* Treat the current frame especially -- we want to always
+ save/restore it without warning, even if the frame ID changes
+ (see restore_selected_frame).  Also get_frame_id may access
+ the target's registers/memory, and thus skipping get_frame_id
+ optimizes the common case.  */
+      selected_frame_level = -1;
+      selected_frame_id = null_frame_id;
+    }
+  else
+    selected_frame_id = get_frame_id (fi);
+
   /* NOTE: cagney/2002-05-04: FI can be NULL.  This occurs when the
      frame is being invalidated.  */
 
@@ -1756,6 +1775,18 @@ select_frame (struct frame_info *fi)
     }
 }
 
+void
+select_frame_lazy (frame_id a_frame_id, int frame_level)
+{
+  /* get_selected_frame_info never returns level == 0, so we shouldn't
+     see it here either.  */
+  gdb_assert (frame_level != 0);
+
+  selected_frame = nullptr;
+  selected_frame_id = a_frame_id;
+  selected_frame_level = frame_level;
+}
+
 /* Create an arbitrary (i.e. address specified by user) or innermost frame.
    Always returns a non-NULL value.  */
 
diff --git a/gdb/frame.h b/gdb/frame.h
index e835d49f9c..48222c69d3 100644
--- a/gdb/frame.h
+++ b/gdb/frame.h
@@ -181,8 +181,9 @@ class scoped_restore_selected_frame
 
 private:
 
-  /* The ID of the previously selected frame.  */
+  /* The ID and level of the previously selected frame.  */
   struct frame_id m_fid;
+  int m_level;
 };
 
 /* Methods for constructing and comparing Frame IDs.  */
@@ -329,13 +330,24 @@ extern void reinit_frame_cache (void);
    and then return that thread's previously selected frame.  */
 extern struct frame_info *get_selected_frame (const char *message);
 
-/* If there is a selected frame, return it.  Otherwise, return NULL.  */
-extern struct frame_info *get_selected_frame_if_set (void);
+/* Return frame ID and frame level of the selected frame.  If there's
+   no selected frame or the selected frame is the current frame,
+   return null_frame_id/-1.  This is preferred over getting the same
+   info out of get_selected_frame directly because this function does
+   not create the selected-frame's frame_info object if it hasn't been
+   created yet.  */
+extern void get_selected_frame_info (frame_id *frame_id, int *level);
 
-/* Select a specific frame.  NULL, apparently implies re-select the
-   inner most frame.  */
+/* Select a specific frame.  NULL implies re-select the inner most
+   frame.  */
 extern void select_frame (struct frame_info *);
 
+/* Setup frame A_FRAME_ID, with level FRAME_LEVEL as the selected
+   frame, but don't actually try to find it right now.  The next call
+   to get_selected_frame will look it up (and cache the result).
+   null_frame_id/-1 implies re-select the inner most frame.  */
+extern void select_frame_lazy (frame_id a_frame_id, int frame_level);
+
 /* Given a FRAME, return the next (more inner, younger) or previous
    (more outer, older) frame.  */
 extern struct frame_info *get_prev_frame (struct frame_info *);
diff --git a/gdb/gdbthread.h b/gdb/gdbthread.h
index 0166b2000f..edfdf98b3d 100644
--- a/gdb/gdbthread.h
+++ b/gdb/gdbthread.h
@@ -667,6 +667,10 @@ class scoped_restore_current_thread
   frame_id m_selected_frame_id;
   int m_selected_frame_level;
   bool m_was_stopped;
+  /* Save/restore the language as well, because selecting a frame
+     changes the current language to the frame's language if "set
+     language auto".  */
+  enum language m_lang;
 };
 
 /* Returns a pointer into the thread_info corresponding to
diff --git a/gdb/stack.c b/gdb/stack.c
index 265e764dc2..93de451a12 100644
--- a/gdb/stack.c
+++ b/gdb/stack.c
@@ -1836,9 +1836,9 @@ trailing_outermost_frame (int count)
 static void
 select_frame_command_core (struct frame_info *fi, bool ignored)
 {
-  struct frame_info *prev_frame = get_selected_frame_if_set ();
+  frame_info *prev_frame = get_selected_frame (NULL);
   select_frame (fi);
-  if (get_selected_frame_if_set () != prev_frame)
+  if (get_selected_frame (NULL) != prev_frame)
     gdb::observers::user_selected_context_changed.notify (USER_SELECTED_FRAME);
 }
 
@@ -1857,10 +1857,9 @@ select_frame_for_mi (struct frame_info *fi)
 static void
 frame_command_core (struct frame_info *fi, bool ignored)
 {
-  struct frame_info *prev_frame = get_selected_frame_if_set ();
-
+  frame_info *prev_frame = get_selected_frame (nullptr);
   select_frame (fi);
-  if (get_selected_frame_if_set () != prev_frame)
+  if (get_selected_frame (nullptr) != prev_frame)
     gdb::observers::user_selected_context_changed.notify (USER_SELECTED_FRAME);
   else
     print_selected_thread_frame (current_uiout, USER_SELECTED_FRAME);
diff --git a/gdb/thread.c b/gdb/thread.c
index 1ec047e35b..8a5634eae3 100644
--- a/gdb/thread.c
+++ b/gdb/thread.c
@@ -1325,20 +1325,25 @@ switch_to_thread (process_stratum_target *proc_target, ptid_t ptid)
   switch_to_thread (thr);
 }
 
-static void
+void restore_selected_frame (struct frame_id a_frame_id, int frame_level);
+
+void
 restore_selected_frame (struct frame_id a_frame_id, int frame_level)
 {
   struct frame_info *frame = NULL;
   int count;
 
-  /* This means there was no selected frame.  */
+  /* This either means there was no selected frame, or the selected
+     frame was the inner most (the current frame).  */
   if (frame_level == -1)
     {
-      select_frame (NULL);
+      select_frame (get_current_frame ());
       return;
     }
 
-  gdb_assert (frame_level >= 0);
+  /* select_frame never saves 0 in selected_frame_level, so we
+     shouldn't see it here.  */
+  gdb_assert (frame_level > 0);
 
   /* Restore by level first, check if the frame id is the same as
      expected.  If that fails, try restoring by frame id.  If that
@@ -1408,23 +1413,19 @@ scoped_restore_current_thread::restore ()
       && target_has_registers
       && target_has_stack
       && target_has_memory)
-    restore_selected_frame (m_selected_frame_id, m_selected_frame_level);
+    {
+      /* Use the lazy variant because we don't want to do any work
+ here that might throw, since we're in a dtor.  */
+      select_frame_lazy (m_selected_frame_id, m_selected_frame_level);
+    }
+
+  set_language (m_lang);
 }
 
 scoped_restore_current_thread::~scoped_restore_current_thread ()
 {
   if (!m_dont_restore)
-    {
-      try
- {
-  restore ();
- }
-      catch (const gdb_exception &ex)
- {
-  /* We're in a dtor, there's really nothing else we can do
-     but swallow the exception.  */
- }
-    }
+    restore ();
 
   if (m_thread != NULL)
     m_thread->decref ();
@@ -1436,43 +1437,15 @@ scoped_restore_current_thread::scoped_restore_current_thread ()
   m_inf = current_inferior ();
   m_inf->incref ();
 
+  m_lang = current_language->la_language;
+
   if (inferior_ptid != null_ptid)
     {
       m_thread = inferior_thread ();
       m_thread->incref ();
 
-      struct frame_info *frame;
-
       m_was_stopped = m_thread->state == THREAD_STOPPED;
-      if (m_was_stopped
-  && target_has_registers
-  && target_has_stack
-  && target_has_memory)
- {
-  /* When processing internal events, there might not be a
-     selected frame.  If we naively call get_selected_frame
-     here, then we can end up reading debuginfo for the
-     current frame, but we don't generally need the debuginfo
-     at this point.  */
-  frame = get_selected_frame_if_set ();
- }
-      else
- frame = NULL;
-
-      try
- {
-  m_selected_frame_id = get_frame_id (frame);
-  m_selected_frame_level = frame_relative_level (frame);
- }
-      catch (const gdb_exception_error &ex)
- {
-  m_selected_frame_id = null_frame_id;
-  m_selected_frame_level = -1;
-
-  /* Better let this propagate.  */
-  if (ex.error == TARGET_CLOSE_ERROR)
-    throw;
- }
+      get_selected_frame_info (&m_selected_frame_id, &m_selected_frame_level);
     }
   else
     m_thread = NULL;
--
2.14.5

Reply | Threaded
Open this post in threaded view
|

Re: [PATCH 1/3] Fix crash if connection drops in scoped_restore_current_thread's ctor, part 1

Simon Marchi-4
In reply to this post by Pedro Alves-2
On 2020-07-08 7:31 p.m., Pedro Alves wrote:

> Running the testsuite against an Asan-enabled build of GDB makes
> gdb.base/multi-target.exp expose this bug.
>
> scoped_restore_current_thread's ctor calls get_frame_id to record the
> selected frame's ID to restore later.  If the frame ID hasn't been
> computed yet, it will be computed on the spot, and that will usually
> require accessing the target's memory and registers, which requires
> remote accesses.  If the remote connection closes while we're
> computing the frame ID, the remote target exits its inferiors,
> unpushes itself, and throws a TARGET_CLOSE_ERROR error.
>
> If that happens, GDB can currently crash, here:
>
>> ==18555==ERROR: AddressSanitizer: heap-use-after-free on address 0x621004670aa8 at pc 0x0000007ab125 bp 0x7ffdecaecd20 sp 0x7ffdecaecd10
>> READ of size 4 at 0x621004670aa8 thread T0
>>     #0 0x7ab124 in dwarf2_frame_this_id src/binutils-gdb/gdb/dwarf2/frame.c:1228
>>     #1 0x983ec5 in compute_frame_id src/binutils-gdb/gdb/frame.c:550
>>     #2 0x9841ee in get_frame_id(frame_info*) src/binutils-gdb/gdb/frame.c:582
>>     #3 0x1093faa in scoped_restore_current_thread::scoped_restore_current_thread() src/binutils-gdb/gdb/thread.c:1462
>>     #4 0xaee5ba in fetch_inferior_event(void*) src/binutils-gdb/gdb/infrun.c:3968
>>     #5 0xaa990b in inferior_event_handler(inferior_event_type, void*) src/binutils-gdb/gdb/inf-loop.c:43
>>     #6 0xea61b6 in remote_async_serial_handler src/binutils-gdb/gdb/remote.c:14161
>>     #7 0xefca8a in run_async_handler_and_reschedule src/binutils-gdb/gdb/ser-base.c:137
>>     #8 0xefcd23 in fd_event src/binutils-gdb/gdb/ser-base.c:188
>>     #9 0x15a7416 in handle_file_event src/binutils-gdb/gdbsupport/event-loop.cc:548
>>     #10 0x15a7c36 in gdb_wait_for_event src/binutils-gdb/gdbsupport/event-loop.cc:673
>>     #11 0x15a5dbb in gdb_do_one_event() src/binutils-gdb/gdbsupport/event-loop.cc:215
>>     #12 0xbfe62d in start_event_loop src/binutils-gdb/gdb/main.c:356
>>     #13 0xbfe935 in captured_command_loop src/binutils-gdb/gdb/main.c:416
>>     #14 0xc01d39 in captured_main src/binutils-gdb/gdb/main.c:1253
>>     #15 0xc01dc9 in gdb_main(captured_main_args*) src/binutils-gdb/gdb/main.c:1268
>>     #16 0x414ddd in main src/binutils-gdb/gdb/gdb.c:32
>>     #17 0x7f590110b82f in __libc_start_main ../csu/libc-start.c:291
>>     #18 0x414bd8 in _start (build/binutils-gdb/gdb/gdb+0x414bd8)
>
> What happens is that above, we're in dwarf2_frame_this_id, just after
> the dwarf2_frame_cache call.  The "cache" variable that the
> dwarf2_frame_cache function returned is already stale.  It's been
> released here, from within the dwarf2_frame_cache:
>
> (top-gdb) bt
> #0  reinit_frame_cache () at src/gdb/frame.c:1855
> #1  0x00000000014ff7b0 in switch_to_no_thread () at src/gdb/thread.c:1301
> #2  0x0000000000f66d3e in switch_to_inferior_no_thread (inf=0x615000338180) at src/gdb/inferior.c:626
> #3  0x00000000012f3826 in remote_unpush_target (target=0x6170000c5900) at src/gdb/remote.c:5521
> #4  0x00000000013097e0 in remote_target::readchar (this=0x6170000c5900, timeout=2) at src/gdb/remote.c:9137
> #5  0x000000000130be4d in remote_target::getpkt_or_notif_sane_1 (this=0x6170000c5900, buf=0x6170000c5918, forever=0, expecting_notif=0, is_notif=0x0) at src/gdb/remote.c:9683
> #6  0x000000000130c8ab in remote_target::getpkt_sane (this=0x6170000c5900, buf=0x6170000c5918, forever=0) at src/gdb/remote.c:9790
> #7  0x000000000130bc0d in remote_target::getpkt (this=0x6170000c5900, buf=0x6170000c5918, forever=0) at src/gdb/remote.c:9623
> #8  0x000000000130838e in remote_target::remote_read_bytes_1 (this=0x6170000c5900, memaddr=0x7fffffffcdc0, myaddr=0x6080000ad3bc "", len_units=64, unit_size=1, xfered_len_units=0x7fff6a29b9a0) at src/gdb/remote.c:8860
> #9  0x0000000001308bd2 in remote_target::remote_read_bytes (this=0x6170000c5900, memaddr=0x7fffffffcdc0, myaddr=0x6080000ad3bc "", len=64, unit_size=1, xfered_len=0x7fff6a29b9a0) at src/gdb/remote.c:8987
> #10 0x0000000001311ed1 in remote_target::xfer_partial (this=0x6170000c5900, object=TARGET_OBJECT_MEMORY, annex=0x0, readbuf=0x6080000ad3bc "", writebuf=0x0, offset=140737488342464, len=64, xfered_len=0x7fff6a29b9a0) at src/gdb/remote.c:10988
> #11 0x00000000014ba969 in raw_memory_xfer_partial (ops=0x6170000c5900, readbuf=0x6080000ad3bc "", writebuf=0x0, memaddr=140737488342464, len=64, xfered_len=0x7fff6a29b9a0) at src/gdb/target.c:918
> #12 0x00000000014bb720 in target_xfer_partial (ops=0x6170000c5900, object=TARGET_OBJECT_RAW_MEMORY, annex=0x0, readbuf=0x6080000ad3bc "", writebuf=0x0, offset=140737488342464, len=64, xfered_len=0x7fff6a29b9a0) at src/gdb/target.c:1148
> #13 0x00000000014bc3b5 in target_read_partial (ops=0x6170000c5900, object=TARGET_OBJECT_RAW_MEMORY, annex=0x0, buf=0x6080000ad3bc "", offset=140737488342464, len=64, xfered_len=0x7fff6a29b9a0) at src/gdb/target.c:1380
> #14 0x00000000014bc593 in target_read (ops=0x6170000c5900, object=TARGET_OBJECT_RAW_MEMORY, annex=0x0, buf=0x6080000ad3bc "", offset=140737488342464, len=64) at src/gdb/target.c:1419
> #15 0x00000000014bbd4d in target_read_raw_memory (memaddr=0x7fffffffcdc0, myaddr=0x6080000ad3bc "", len=64) at src/gdb/target.c:1252
> #16 0x0000000000bf27df in dcache_read_line (dcache=0x6060001eddc0, db=0x6080000ad3a0) at src/gdb/dcache.c:336
> #17 0x0000000000bf2b72 in dcache_peek_byte (dcache=0x6060001eddc0, addr=0x7fffffffcdd8, ptr=0x6020001231b0 "") at src/gdb/dcache.c:403
> #18 0x0000000000bf3103 in dcache_read_memory_partial (ops=0x6170000c5900, dcache=0x6060001eddc0, memaddr=0x7fffffffcdd8, myaddr=0x6020001231b0 "", len=8, xfered_len=0x7fff6a29bf20) at src/gdb/dcache.c:484
> #19 0x00000000014bafe9 in memory_xfer_partial_1 (ops=0x6170000c5900, object=TARGET_OBJECT_STACK_MEMORY, readbuf=0x6020001231b0 "", writebuf=0x0, memaddr=140737488342488, len=8, xfered_len=0x7fff6a29bf20) at src/gdb/target.c:1034
> #20 0x00000000014bb212 in memory_xfer_partial (ops=0x6170000c5900, object=TARGET_OBJECT_STACK_MEMORY, readbuf=0x6020001231b0 "", writebuf=0x0, memaddr=140737488342488, len=8, xfered_len=0x7fff6a29bf20) at src/gdb/target.c:1076
> #21 0x00000000014bb6b3 in target_xfer_partial (ops=0x6170000c5900, object=TARGET_OBJECT_STACK_MEMORY, annex=0x0, readbuf=0x6020001231b0 "", writebuf=0x0, offset=140737488342488, len=8, xfered_len=0x7fff6a29bf20) at src/gdb/target.c:1133
> #22 0x000000000164564d in read_value_memory (val=0x60f000029440, bit_offset=0, stack=1, memaddr=0x7fffffffcdd8, buffer=0x6020001231b0 "", length=8) at src/gdb/valops.c:956
> #23 0x0000000001680fff in value_fetch_lazy_memory (val=0x60f000029440) at src/gdb/value.c:3764
> #24 0x0000000001681efd in value_fetch_lazy (val=0x60f000029440) at src/gdb/value.c:3910
> #25 0x0000000001676143 in value_optimized_out (value=0x60f000029440) at src/gdb/value.c:1411
> #26 0x0000000000e0fcb8 in frame_register_unwind (next_frame=0x6210066bfde0, regnum=16, optimizedp=0x7fff6a29c200, unavailablep=0x7fff6a29c240, lvalp=0x7fff6a29c2c0, addrp=0x7fff6a29c300, realnump=0x7fff6a29c280, bufferp=0x7fff6a29c3a0 "@\304)j\377\177") at src/gdb/frame.c:1144
> #27 0x0000000000e10418 in frame_unwind_register (next_frame=0x6210066bfde0, regnum=16, buf=0x7fff6a29c3a0 "@\304)j\377\177") at src/gdb/frame.c:1196
> #28 0x0000000000f00431 in i386_unwind_pc (gdbarch=0x6210043d0110, next_frame=0x6210066bfde0) at src/gdb/i386-tdep.c:1969
> #29 0x0000000000e39724 in gdbarch_unwind_pc (gdbarch=0x6210043d0110, next_frame=0x6210066bfde0) at src/gdb/gdbarch.c:3056
> #30 0x0000000000c2ea90 in dwarf2_tailcall_sniffer_first (this_frame=0x6210066bfde0, tailcall_cachep=0x6210066bfee0, entry_cfa_sp_offsetp=0x0) at src/gdb/dwarf2/frame-tailcall.c:423
> #31 0x0000000000c36bdb in dwarf2_frame_cache (this_frame=0x6210066bfde0, this_cache=0x6210066bfdf8) at src/gdb/dwarf2/frame.c:1198
> #32 0x0000000000c36eb3 in dwarf2_frame_this_id (this_frame=0x6210066bfde0, this_cache=0x6210066bfdf8, this_id=0x6210066bfe40) at src/gdb/dwarf2/frame.c:1226
>
> Note that remote_target::readchar in frame #3 throws
> TARGET_CLOSE_ERROR after that remote_unpush_target in frame #3
> returns.
>
> The problem is that that TARGET_CLOSE_ERROR is swallowed by

`that that`

Otherwise, LGTM.

Simon
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH 2/3] Fix crash if connection drops in scoped_restore_current_thread's ctor, part 2

Simon Marchi-4
In reply to this post by Pedro Alves-2
On 2020-07-08 7:31 p.m., Pedro Alves wrote:

> Running the testsuite against an Asan-enabled build of GDB makes
> gdb.base/multi-target.exp expose this bug.
>
> scoped_restore_current_thread's ctor calls get_frame_id to record the
> selected frame's ID to restore later.  If the frame ID hasn't been
> computed yet, it will be computed on the spot, and that will usually
> require accessing the target's memory and registers.  If the remote
> connection closes, while we're computing the frame ID, the remote
> target exits its inferiors, unpushes itself, and throws a
> TARGET_CLOSE_ERROR error.  Exiting the inferiors deletes the
> inferior's threads.
>
> scoped_restore_current_thread increments the current thread's refcount
> to prevent the thread from being deleted from under its feet.
> However, the code that does that isn't considering the case of the
> thread being deleted from within get_frame_id.  It only increments the
> refcount _after_ get_frame_id returns.  So if the current thread is
> indeed deleted, the
>
>      tp->incref ();
>
> statement references a stale TP pointer.
>
> Incrementing the refcounts earlier fixes it.
>
> We should probably also let the TARGET_CLOSE_ERROR error propagate in
> this case.  That alone would fix it, though it seems better to tweak
> the refcount handling too.

So, when the target closes while we (scoped_restore_current_thread) own
a reference on the inferior and thread, the inferior and thread are still
destroyed, and so we shouldn't decref them?

Simon
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH 3/3] Make scoped_restore_current_thread's cdtors exception free (RFC)

Simon Marchi-4
In reply to this post by Pedro Alves-2
On 2020-07-08 7:31 p.m., Pedro Alves wrote:

> If the remote target closes while we're reading registers/memory for
> restoring the selected frame in scoped_restore_current_thread's dtor,
> the corresponding TARGET_CLOSE_ERROR error is swallowed by the
> scoped_restore_current_thread's dtor, because letting exceptions
> escape from a dtor is bad.  It isn't great to lose that errors like
> that, though.  I've been thinking about how to avoid it, and I came up
> with this patch.
>
> The idea here is to make scoped_restore_current_thread's dtor do as
> little as possible, to avoid any work that might throw in the first
> place.  And to do that, instead of having the dtor call
> restore_selected_frame, which re-finds the previously selected frame,
> just record the frame_id/level of the desired selected frame, and have
> get_selected_frame find the frame the next time it is called.  In
> effect, this implements most of Cagney's suggestion, here:
>
>   /* On demand, create the selected frame and then return it.  If the
>      selected frame can not be created, this function prints then throws
>      an error.  When MESSAGE is non-NULL, use it for the error message,
>      otherwize use a generic error message.  */
>   /* FIXME: cagney/2002-11-28: At present, when there is no selected
>      frame, this function always returns the current (inner most) frame.
>      It should instead, when a thread has previously had its frame
>      selected (but not resumed) and the frame cache invalidated, find
>      and then return that thread's previously selected frame.  */
>   extern struct frame_info *get_selected_frame (const char *message);
>
> The only thing missing to fully implement that would be to make
> reinit_frame_cache just clear selected_frame instead of calling
> select_frame(NULL), and the call select_frame(NULL) explicitly in the
> places where we really wanted reinit_frame_cache to go back to the
> current frame too.  That can done separately, though, I'm not
> proposing to do that in this patch.
>
> restore_selected_frame should really move from thread.c to frame.c,
> but I didn't do that here, just to avoid churn in the patch while it
> collects comments.  I will do that as a preparatory patch if people
> agree with this approach.
>
> Incidentally, this patch alone would fix the crashes fixed by the
> previous patches in the series, because with this,
> scoped_restore_current_thread's constructor doesn't throw either.

I don't understand all the code changes, the but the idea sounds fine
to me.

> -/* Select frame FI (or NULL - to invalidate the current frame).  */
> +/* Select frame FI (or NULL - to invalidate the selected frame).  */
>  
>  void
>  select_frame (struct frame_info *fi)
>  {
>    selected_frame = fi;
> +  selected_frame_level = frame_relative_level (fi);
> +  if (selected_frame_level == 0)
> +    {
> +      /* Treat the current frame especially -- we want to always
> + save/restore it without warning, even if the frame ID changes
> + (see restore_selected_frame).  Also get_frame_id may access
> + the target's registers/memory, and thus skipping get_frame_id
> + optimizes the common case.  */
> +      selected_frame_level = -1;
> +      selected_frame_id = null_frame_id;
> +    }
> +  else
> +    selected_frame_id = get_frame_id (fi);
> +

I don't really understand this part, why don't we want to set selected_frame_level
and selected_frame_id when the level is 0.  I'm more interested by why it wouldn't
be correct or how it would break things, rather than the optimization aspect.

Simon
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH 1/3] Fix crash if connection drops in scoped_restore_current_thread's ctor, part 1

Pedro Alves-2
In reply to this post by Simon Marchi-4
On 7/9/20 4:17 AM, Simon Marchi wrote:

> On 2020-07-08 7:31 p.m., Pedro Alves wrote:
>> Running the testsuite against an Asan-enabled build of GDB makes
>> gdb.base/multi-target.exp expose this bug.
>>
>> scoped_restore_current_thread's ctor calls get_frame_id to record the
>> selected frame's ID to restore later.  If the frame ID hasn't been
>> computed yet, it will be computed on the spot, and that will usually
>> require accessing the target's memory and registers, which requires
>> remote accesses.  If the remote connection closes while we're
>> computing the frame ID, the remote target exits its inferiors,
>> unpushes itself, and throws a TARGET_CLOSE_ERROR error.
>>
>> If that happens, GDB can currently crash, here:
>>
>>> ==18555==ERROR: AddressSanitizer: heap-use-after-free on address 0x621004670aa8 at pc 0x0000007ab125 bp 0x7ffdecaecd20 sp 0x7ffdecaecd10
>>> READ of size 4 at 0x621004670aa8 thread T0
>>>     #0 0x7ab124 in dwarf2_frame_this_id src/binutils-gdb/gdb/dwarf2/frame.c:1228
>>>     #1 0x983ec5 in compute_frame_id src/binutils-gdb/gdb/frame.c:550
>>>     #2 0x9841ee in get_frame_id(frame_info*) src/binutils-gdb/gdb/frame.c:582
>>>     #3 0x1093faa in scoped_restore_current_thread::scoped_restore_current_thread() src/binutils-gdb/gdb/thread.c:1462
>>>     #4 0xaee5ba in fetch_inferior_event(void*) src/binutils-gdb/gdb/infrun.c:3968
>>>     #5 0xaa990b in inferior_event_handler(inferior_event_type, void*) src/binutils-gdb/gdb/inf-loop.c:43
>>>     #6 0xea61b6 in remote_async_serial_handler src/binutils-gdb/gdb/remote.c:14161
>>>     #7 0xefca8a in run_async_handler_and_reschedule src/binutils-gdb/gdb/ser-base.c:137
>>>     #8 0xefcd23 in fd_event src/binutils-gdb/gdb/ser-base.c:188
>>>     #9 0x15a7416 in handle_file_event src/binutils-gdb/gdbsupport/event-loop.cc:548
>>>     #10 0x15a7c36 in gdb_wait_for_event src/binutils-gdb/gdbsupport/event-loop.cc:673
>>>     #11 0x15a5dbb in gdb_do_one_event() src/binutils-gdb/gdbsupport/event-loop.cc:215
>>>     #12 0xbfe62d in start_event_loop src/binutils-gdb/gdb/main.c:356
>>>     #13 0xbfe935 in captured_command_loop src/binutils-gdb/gdb/main.c:416
>>>     #14 0xc01d39 in captured_main src/binutils-gdb/gdb/main.c:1253
>>>     #15 0xc01dc9 in gdb_main(captured_main_args*) src/binutils-gdb/gdb/main.c:1268
>>>     #16 0x414ddd in main src/binutils-gdb/gdb/gdb.c:32
>>>     #17 0x7f590110b82f in __libc_start_main ../csu/libc-start.c:291
>>>     #18 0x414bd8 in _start (build/binutils-gdb/gdb/gdb+0x414bd8)
>>
>> What happens is that above, we're in dwarf2_frame_this_id, just after
>> the dwarf2_frame_cache call.  The "cache" variable that the
>> dwarf2_frame_cache function returned is already stale.  It's been
>> released here, from within the dwarf2_frame_cache:
>>
>> (top-gdb) bt
>> #0  reinit_frame_cache () at src/gdb/frame.c:1855
>> #1  0x00000000014ff7b0 in switch_to_no_thread () at src/gdb/thread.c:1301
>> #2  0x0000000000f66d3e in switch_to_inferior_no_thread (inf=0x615000338180) at src/gdb/inferior.c:626
>> #3  0x00000000012f3826 in remote_unpush_target (target=0x6170000c5900) at src/gdb/remote.c:5521
>> #4  0x00000000013097e0 in remote_target::readchar (this=0x6170000c5900, timeout=2) at src/gdb/remote.c:9137
>> #5  0x000000000130be4d in remote_target::getpkt_or_notif_sane_1 (this=0x6170000c5900, buf=0x6170000c5918, forever=0, expecting_notif=0, is_notif=0x0) at src/gdb/remote.c:9683
>> #6  0x000000000130c8ab in remote_target::getpkt_sane (this=0x6170000c5900, buf=0x6170000c5918, forever=0) at src/gdb/remote.c:9790
>> #7  0x000000000130bc0d in remote_target::getpkt (this=0x6170000c5900, buf=0x6170000c5918, forever=0) at src/gdb/remote.c:9623
>> #8  0x000000000130838e in remote_target::remote_read_bytes_1 (this=0x6170000c5900, memaddr=0x7fffffffcdc0, myaddr=0x6080000ad3bc "", len_units=64, unit_size=1, xfered_len_units=0x7fff6a29b9a0) at src/gdb/remote.c:8860
>> #9  0x0000000001308bd2 in remote_target::remote_read_bytes (this=0x6170000c5900, memaddr=0x7fffffffcdc0, myaddr=0x6080000ad3bc "", len=64, unit_size=1, xfered_len=0x7fff6a29b9a0) at src/gdb/remote.c:8987
>> #10 0x0000000001311ed1 in remote_target::xfer_partial (this=0x6170000c5900, object=TARGET_OBJECT_MEMORY, annex=0x0, readbuf=0x6080000ad3bc "", writebuf=0x0, offset=140737488342464, len=64, xfered_len=0x7fff6a29b9a0) at src/gdb/remote.c:10988
>> #11 0x00000000014ba969 in raw_memory_xfer_partial (ops=0x6170000c5900, readbuf=0x6080000ad3bc "", writebuf=0x0, memaddr=140737488342464, len=64, xfered_len=0x7fff6a29b9a0) at src/gdb/target.c:918
>> #12 0x00000000014bb720 in target_xfer_partial (ops=0x6170000c5900, object=TARGET_OBJECT_RAW_MEMORY, annex=0x0, readbuf=0x6080000ad3bc "", writebuf=0x0, offset=140737488342464, len=64, xfered_len=0x7fff6a29b9a0) at src/gdb/target.c:1148
>> #13 0x00000000014bc3b5 in target_read_partial (ops=0x6170000c5900, object=TARGET_OBJECT_RAW_MEMORY, annex=0x0, buf=0x6080000ad3bc "", offset=140737488342464, len=64, xfered_len=0x7fff6a29b9a0) at src/gdb/target.c:1380
>> #14 0x00000000014bc593 in target_read (ops=0x6170000c5900, object=TARGET_OBJECT_RAW_MEMORY, annex=0x0, buf=0x6080000ad3bc "", offset=140737488342464, len=64) at src/gdb/target.c:1419
>> #15 0x00000000014bbd4d in target_read_raw_memory (memaddr=0x7fffffffcdc0, myaddr=0x6080000ad3bc "", len=64) at src/gdb/target.c:1252
>> #16 0x0000000000bf27df in dcache_read_line (dcache=0x6060001eddc0, db=0x6080000ad3a0) at src/gdb/dcache.c:336
>> #17 0x0000000000bf2b72 in dcache_peek_byte (dcache=0x6060001eddc0, addr=0x7fffffffcdd8, ptr=0x6020001231b0 "") at src/gdb/dcache.c:403
>> #18 0x0000000000bf3103 in dcache_read_memory_partial (ops=0x6170000c5900, dcache=0x6060001eddc0, memaddr=0x7fffffffcdd8, myaddr=0x6020001231b0 "", len=8, xfered_len=0x7fff6a29bf20) at src/gdb/dcache.c:484
>> #19 0x00000000014bafe9 in memory_xfer_partial_1 (ops=0x6170000c5900, object=TARGET_OBJECT_STACK_MEMORY, readbuf=0x6020001231b0 "", writebuf=0x0, memaddr=140737488342488, len=8, xfered_len=0x7fff6a29bf20) at src/gdb/target.c:1034
>> #20 0x00000000014bb212 in memory_xfer_partial (ops=0x6170000c5900, object=TARGET_OBJECT_STACK_MEMORY, readbuf=0x6020001231b0 "", writebuf=0x0, memaddr=140737488342488, len=8, xfered_len=0x7fff6a29bf20) at src/gdb/target.c:1076
>> #21 0x00000000014bb6b3 in target_xfer_partial (ops=0x6170000c5900, object=TARGET_OBJECT_STACK_MEMORY, annex=0x0, readbuf=0x6020001231b0 "", writebuf=0x0, offset=140737488342488, len=8, xfered_len=0x7fff6a29bf20) at src/gdb/target.c:1133
>> #22 0x000000000164564d in read_value_memory (val=0x60f000029440, bit_offset=0, stack=1, memaddr=0x7fffffffcdd8, buffer=0x6020001231b0 "", length=8) at src/gdb/valops.c:956
>> #23 0x0000000001680fff in value_fetch_lazy_memory (val=0x60f000029440) at src/gdb/value.c:3764
>> #24 0x0000000001681efd in value_fetch_lazy (val=0x60f000029440) at src/gdb/value.c:3910
>> #25 0x0000000001676143 in value_optimized_out (value=0x60f000029440) at src/gdb/value.c:1411
>> #26 0x0000000000e0fcb8 in frame_register_unwind (next_frame=0x6210066bfde0, regnum=16, optimizedp=0x7fff6a29c200, unavailablep=0x7fff6a29c240, lvalp=0x7fff6a29c2c0, addrp=0x7fff6a29c300, realnump=0x7fff6a29c280, bufferp=0x7fff6a29c3a0 "@\304)j\377\177") at src/gdb/frame.c:1144
>> #27 0x0000000000e10418 in frame_unwind_register (next_frame=0x6210066bfde0, regnum=16, buf=0x7fff6a29c3a0 "@\304)j\377\177") at src/gdb/frame.c:1196
>> #28 0x0000000000f00431 in i386_unwind_pc (gdbarch=0x6210043d0110, next_frame=0x6210066bfde0) at src/gdb/i386-tdep.c:1969
>> #29 0x0000000000e39724 in gdbarch_unwind_pc (gdbarch=0x6210043d0110, next_frame=0x6210066bfde0) at src/gdb/gdbarch.c:3056
>> #30 0x0000000000c2ea90 in dwarf2_tailcall_sniffer_first (this_frame=0x6210066bfde0, tailcall_cachep=0x6210066bfee0, entry_cfa_sp_offsetp=0x0) at src/gdb/dwarf2/frame-tailcall.c:423
>> #31 0x0000000000c36bdb in dwarf2_frame_cache (this_frame=0x6210066bfde0, this_cache=0x6210066bfdf8) at src/gdb/dwarf2/frame.c:1198
>> #32 0x0000000000c36eb3 in dwarf2_frame_this_id (this_frame=0x6210066bfde0, this_cache=0x6210066bfdf8, this_id=0x6210066bfe40) at src/gdb/dwarf2/frame.c:1226
>>
>> Note that remote_target::readchar in frame #3 throws
>> TARGET_CLOSE_ERROR after that remote_unpush_target in frame #3
>> returns.
>>
>> The problem is that that TARGET_CLOSE_ERROR is swallowed by
>
> `that that`
>

That wasn't a typo, compare with:

 The problem is that this TARGET_CLOSE_ERROR
 The problem is that those TARGET_CLOSE_ERRORs
 The problem is that these TARGET_CLOSE_ERRORs

https://english.stackexchange.com/questions/3418/how-do-you-handle-that-that-the-double-that-problem

I'll say "that the" instead to avoid confusion.

> Otherwise, LGTM.
>
> Simon
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH 2/3] Fix crash if connection drops in scoped_restore_current_thread's ctor, part 2

Pedro Alves-2
In reply to this post by Simon Marchi-4
On 7/9/20 4:31 AM, Simon Marchi wrote:

> On 2020-07-08 7:31 p.m., Pedro Alves wrote:
>> Running the testsuite against an Asan-enabled build of GDB makes
>> gdb.base/multi-target.exp expose this bug.
>>
>> scoped_restore_current_thread's ctor calls get_frame_id to record the
>> selected frame's ID to restore later.  If the frame ID hasn't been
>> computed yet, it will be computed on the spot, and that will usually
>> require accessing the target's memory and registers.  If the remote
>> connection closes, while we're computing the frame ID, the remote
>> target exits its inferiors, unpushes itself, and throws a
>> TARGET_CLOSE_ERROR error.  Exiting the inferiors deletes the
>> inferior's threads.
>>
>> scoped_restore_current_thread increments the current thread's refcount
>> to prevent the thread from being deleted from under its feet.
>> However, the code that does that isn't considering the case of the
>> thread being deleted from within get_frame_id.  It only increments the
>> refcount _after_ get_frame_id returns.  So if the current thread is
>> indeed deleted, the
>>
>>      tp->incref ();
>>
>> statement references a stale TP pointer.
>>
>> Incrementing the refcounts earlier fixes it.
>>
>> We should probably also let the TARGET_CLOSE_ERROR error propagate in
>> this case.  That alone would fix it, though it seems better to tweak
>> the refcount handling too.
>
> So, when the target closes while we (scoped_restore_current_thread) own
> a reference on the inferior and thread, the inferior and thread are still
> destroyed, and so we shouldn't decref them?

Aw, no, I got confused and misremembered how exceptions in ctors work.
The dtor for scoped_restore_current_thread isn't called, and I assumed
it was called.  We need to decref the inferior and thread before letting
the exception propagate, otherwise we leak them.  I'm testing the updated
version of the patch below, which does that:

          /* Better let this propagate.  */
          if (ex.error == TARGET_CLOSE_ERROR)
            {
              m_thread->decref ();
              m_inf->decref ();
              throw;
            }

It passes multi-target.exp with Asan-enabled GDB.  Running the full
testsuite now.

From 1ad36a4b892fc4425d6f24c298713eeafece7b04 Mon Sep 17 00:00:00 2001
From: Pedro Alves <[hidden email]>
Date: Tue, 7 Jul 2020 01:50:10 +0100
Subject: [PATCH] Fix crash if connection drops in
 scoped_restore_current_thread's ctor, part 2

Running the testsuite against an Asan-enabled build of GDB makes
gdb.base/multi-target.exp expose this bug.

scoped_restore_current_thread's ctor calls get_frame_id to record the
selected frame's ID to restore later.  If the frame ID hasn't been
computed yet, it will be computed on the spot, and that will usually
require accessing the target's memory and registers.  If the remote
connection closes, while we're computing the frame ID, the remote
target exits its inferiors, unpushes itself, and throws a
TARGET_CLOSE_ERROR error.  Exiting the inferiors deletes the
inferior's threads.

scoped_restore_current_thread increments the current thread's refcount
to prevent the thread from being deleted from under its feet.
However, the code that does that isn't considering the case of the
thread being deleted from within get_frame_id.  It only increments the
refcount _after_ get_frame_id returns.  So if the current thread is
indeed deleted, the

     tp->incref ();

statement references a stale TP pointer.

Incrementing the refcounts earlier fixes it.

We should probably also let the TARGET_CLOSE_ERROR error propagate in
this case.  That alone would fix it, though it seems better to tweak
the refcount handling too.

gdb/ChangeLog:

        * thread.c
        (scoped_restore_current_thread::scoped_restore_current_thread):
        Incref the thread before calling get_frame_id instead of after.
        Let TARGET_CLOSE_ERROR propagate.
---
 gdb/thread.c | 23 +++++++++++++++--------
 1 file changed, 15 insertions(+), 8 deletions(-)

diff --git a/gdb/thread.c b/gdb/thread.c
index f0722d3588..a3c2be7dd0 100644
--- a/gdb/thread.c
+++ b/gdb/thread.c
@@ -1433,15 +1433,17 @@ scoped_restore_current_thread::~scoped_restore_current_thread ()
 
 scoped_restore_current_thread::scoped_restore_current_thread ()
 {
-  m_thread = NULL;
   m_inf = current_inferior ();
+  m_inf->incref ();
 
   if (inferior_ptid != null_ptid)
     {
-      thread_info *tp = inferior_thread ();
+      m_thread = inferior_thread ();
+      m_thread->incref ();
+
       struct frame_info *frame;
 
-      m_was_stopped = tp->state == THREAD_STOPPED;
+      m_was_stopped = m_thread->state == THREAD_STOPPED;
       if (m_was_stopped
   && target_has_registers
   && target_has_stack
@@ -1466,13 +1468,18 @@ scoped_restore_current_thread::scoped_restore_current_thread ()
  {
   m_selected_frame_id = null_frame_id;
   m_selected_frame_level = -1;
- }
 
-      tp->incref ();
-      m_thread = tp;
+  /* Better let this propagate.  */
+  if (ex.error == TARGET_CLOSE_ERROR)
+    {
+      m_thread->decref ();
+      m_inf->decref ();
+      throw;
+    }
+ }
     }
-
-  m_inf->incref ();
+  else
+    m_thread = NULL;
 }
 
 /* See gdbthread.h.  */

base-commit: ad8464f799a4c96c7ab8bdfec3f95846cf54f9b0
prerequisite-patch-id: 32ffdda7d7d774bc4df88bf848bcb796559b53ce
prerequisite-patch-id: 02021b74355b70debd344a6e445285c67dfef7d6
prerequisite-patch-id: c87fcf5a54f6805967cbf8ab107606c57d9ecf52
prerequisite-patch-id: ac7dee583d0ffa519c9d1cd89d27664bca68d8c1
prerequisite-patch-id: eac59ae2ea85d2d51e5be1b03e88a5641cc12c22
prerequisite-patch-id: 13da42ad04dc8e2e3bd6a556a0be0e17cf23669b
prerequisite-patch-id: fd3f09fdb58ddc1c595ea014716851f4c8fca48c
prerequisite-patch-id: 8230262cb24020a5b37e915843180e940807ba1f
--
2.14.5

Reply | Threaded
Open this post in threaded view
|

Re: [PATCH 3/3] Make scoped_restore_current_thread's cdtors exception free (RFC)

Pedro Alves-2
In reply to this post by Simon Marchi-4
On 7/9/20 4:49 AM, Simon Marchi wrote:

> On 2020-07-08 7:31 p.m., Pedro Alves wrote:
>> If the remote target closes while we're reading registers/memory for
>> restoring the selected frame in scoped_restore_current_thread's dtor,
>> the corresponding TARGET_CLOSE_ERROR error is swallowed by the
>> scoped_restore_current_thread's dtor, because letting exceptions
>> escape from a dtor is bad.  It isn't great to lose that errors like
>> that, though.  I've been thinking about how to avoid it, and I came up
>> with this patch.
>>
>> The idea here is to make scoped_restore_current_thread's dtor do as
>> little as possible, to avoid any work that might throw in the first
>> place.  And to do that, instead of having the dtor call
>> restore_selected_frame, which re-finds the previously selected frame,
>> just record the frame_id/level of the desired selected frame, and have
>> get_selected_frame find the frame the next time it is called.  In
>> effect, this implements most of Cagney's suggestion, here:
>>
>>   /* On demand, create the selected frame and then return it.  If the
>>      selected frame can not be created, this function prints then throws
>>      an error.  When MESSAGE is non-NULL, use it for the error message,
>>      otherwize use a generic error message.  */
>>   /* FIXME: cagney/2002-11-28: At present, when there is no selected
>>      frame, this function always returns the current (inner most) frame.
>>      It should instead, when a thread has previously had its frame
>>      selected (but not resumed) and the frame cache invalidated, find
>>      and then return that thread's previously selected frame.  */
>>   extern struct frame_info *get_selected_frame (const char *message);
>>
>> The only thing missing to fully implement that would be to make
>> reinit_frame_cache just clear selected_frame instead of calling
>> select_frame(NULL), and the call select_frame(NULL) explicitly in the
>> places where we really wanted reinit_frame_cache to go back to the
>> current frame too.  That can done separately, though, I'm not
>> proposing to do that in this patch.
>>
>> restore_selected_frame should really move from thread.c to frame.c,
>> but I didn't do that here, just to avoid churn in the patch while it
>> collects comments.  I will do that as a preparatory patch if people
>> agree with this approach.
>>
>> Incidentally, this patch alone would fix the crashes fixed by the
>> previous patches in the series, because with this,
>> scoped_restore_current_thread's constructor doesn't throw either.
>
> I don't understand all the code changes, the but the idea sounds fine
> to me.
>
>> -/* Select frame FI (or NULL - to invalidate the current frame).  */
>> +/* Select frame FI (or NULL - to invalidate the selected frame).  */
>>  
>>  void
>>  select_frame (struct frame_info *fi)
>>  {
>>    selected_frame = fi;
>> +  selected_frame_level = frame_relative_level (fi);
>> +  if (selected_frame_level == 0)
>> +    {
>> +      /* Treat the current frame especially -- we want to always
>> + save/restore it without warning, even if the frame ID changes
>> + (see restore_selected_frame).  Also get_frame_id may access
>> + the target's registers/memory, and thus skipping get_frame_id
>> + optimizes the common case.  */
>> +      selected_frame_level = -1;
>> +      selected_frame_id = null_frame_id;
>> +    }
>> +  else
>> +    selected_frame_id = get_frame_id (fi);
>> +
>
> I don't really understand this part, why don't we want to set selected_frame_level
> and selected_frame_id when the level is 0.  I'm more interested by why it wouldn't
> be correct or how it would break things, rather than the optimization aspect.
>
> Simon
>

At first, I was recording frame 0 normally, without that special case.
But running the testsuite revealed regressions in a couple testcases:

 gdb.python/py-unwind-maint.exp
 gdb.server/bkpt-other-inferior.exp

Both are related to the get_frame_id call.  Before the patch, get_frame_id
isn't called on the current frame until you try to backtrace from it.
Adding the get_frame_id call makes the gdb.python/py-unwind-maint.exp testcase
print the Python unwinder callbacks in a different order, unexpected
by the testcase.  I didn't look too deeply into this one, but I suspect
it would just be a matter of adjusting the testcase's expectations.

The gdb.server/bkpt-other-inferior.exp one though is what got me
thinking.  The testcase makes sure that setting a breakpoint in a
function that doesn't exist in the remote inferior does not cause
remote protocol traffic.  After the patch, without the special casing,
the testcase would fail because the get_frame_id call, coming from

 check_frame_language_change  # called after every command
  -> get_selected_frame
    -> restore_selected_frame
      -> select_frame(get_current_frame())
         -> get_frame_id

would cause registers and memory to be read from the remote target (when
restoring the selected frame).  Those accesses aren't wrong, but they
aren't the kind that the bug the testcase is looking for.  Those were
about spurious/incorrect remote protocol accesses when parsing the
function's prologue.

Neither of these cases were strictly incorrect, though they got me
thinking, and I came to the conclusion that warning when we fail to
re-find the current frame is pointless, and that avoids having
unbreak the testcases mentioned, or even redo them differently in
the gdb.server/bkpt-other-inferior.exp case.

I've updated the comment to make it clearer with an example.

I've also polished the patch some more.  I now renamed
the current restore_selected_frame to lookup_selected_frame,
to give space to the new save_selected_frame/restore_selected_frame
pair.  select_frame_lazy is now restore_selected_frame.
save_selected_frame/restore_selected_frame are now noexcept, and
their intro comments explain why.

I declared lookup_selected_frame in frame.h already, thinking that
it's easier if I move lookup_selected_frame from thread.c to frame.c
after this is in, instead of before.

I rewrote most of the comments.  For example, I think the
selected_frame_id/selected_frame_level/selected_frame comments are now
much clearer.

And I made scoped_restore_selected_frame save/restore the language
too.  I was only doing that in scoped_restore_current_thread before.

Let me know what you think of this version.

From 1353dbd062debba4056c2413678d3438f7713ca3 Mon Sep 17 00:00:00 2001
From: Pedro Alves <[hidden email]>
Date: Thu, 9 Jul 2020 11:31:29 +0100
Subject: [PATCH] Make scoped_restore_current_thread's cdtors exception free
 (RFC)

If the remote target closes while we're reading registers/memory for
restoring the selected frame in scoped_restore_current_thread's dtor,
the corresponding TARGET_CLOSE_ERROR error is swallowed by the
scoped_restore_current_thread's dtor, because letting exceptions
escape from a dtor is bad.  It isn't great to lose that errors like
that, though.  I've been thinking about how to avoid it, and I came up
with this patch.

The idea here is to make scoped_restore_current_thread's dtor do as
little as possible, to avoid any work that might throw in the first
place.  And to do that, instead of having the dtor call
restore_selected_frame, which re-finds the previously selected frame,
just record the frame_id/level of the desired selected frame, and have
get_selected_frame find the frame the next time it is called.  In
effect, this implements most of Cagney's suggestion, here:

  /* On demand, create the selected frame and then return it.  If the
     selected frame can not be created, this function prints then throws
     an error.  When MESSAGE is non-NULL, use it for the error message,
     otherwize use a generic error message.  */
  /* FIXME: cagney/2002-11-28: At present, when there is no selected
     frame, this function always returns the current (inner most) frame.
     It should instead, when a thread has previously had its frame
     selected (but not resumed) and the frame cache invalidated, find
     and then return that thread's previously selected frame.  */
  extern struct frame_info *get_selected_frame (const char *message);

The only thing missing to fully implement that would be to make
reinit_frame_cache just clear selected_frame instead of calling
select_frame(NULL), and the call select_frame(NULL) explicitly in the
places where we really wanted reinit_frame_cache to go back to the
current frame too.  That can done separately, though, I'm not
proposing to do that in this patch.

Note that this patch renames restore_selected_frame to
lookup_selected_frame, and adds a new restore_selected_frame function
that doesn't throw, to be paired with the also-new save_selected_frame
function.

lookup_selected_frame should really move from thread.c to frame.c, but
I didn't do that here, just to avoid churn in the patch while it
collects comments.  I did make it extern and declared it in frame.h
already, preparing for the move.  I will do the move as a follow up
patch if people agree with this approach.

Incidentally, this patch alone would fix the crashes fixed by the
previous patches in the series, because with this,
scoped_restore_current_thread's constructor doesn't throw either.

gdb/ChangeLog:

        * blockframe.c (block_innermost_frame): Use get_selected_frame.
        * frame.c
        (scoped_restore_selected_frame::scoped_restore_selected_frame):
        Use save_selected_frame.  Save language as well.
        (scoped_restore_selected_frame::~scoped_restore_selected_frame):
        Use restore_selected_frame, and restore language as well.
        (selected_frame_id, selected_frame_level): New.
        (selected_frame): Update comments.
        (save_selected_frame, restore_selected_frame): New.
        (get_selected_frame): Use lookup_selected_frame.
        (get_selected_frame_if_set): Delete.
        (select_frame): Record selected_frame_level and selected_frame_id.
        * frame.h (scoped_restore_selected_frame) <m_level, m_lang>: New
        fields.
        (get_selected_frame_if_set): Delete declaration.
        (select_frame): Update comments.
        (save_selected_frame, restore_selected_frame)
        (lookup_selected_frame): Declare.
        * gdbthread.h (scoped_restore_current_thread) <m_lang>: New field.
        * stack.c (select_frame_command_core, frame_command_core): Use
        get_selected_frame.
        * thread.c (restore_selected_frame): Rename to ...
        (lookup_selected_frame): ... this and make extern.  Select the
        current frame if the frame level is -1.
        (scoped_restore_current_thread::restore): Also restore the
        language.
        (scoped_restore_current_thread::~scoped_restore_current_thread):
        Don't try/catch.
        (scoped_restore_current_thread::scoped_restore_current_thread):
        Save the language as well.  Use save_selected_frame.
---
 gdb/blockframe.c |   6 +---
 gdb/frame.c      | 101 +++++++++++++++++++++++++++++++++++++++++++------------
 gdb/frame.h      |  38 +++++++++++++++++----
 gdb/gdbthread.h  |   4 +++
 gdb/stack.c      |   9 +++--
 gdb/thread.c     |  67 +++++++++---------------------------
 6 files changed, 137 insertions(+), 88 deletions(-)

diff --git a/gdb/blockframe.c b/gdb/blockframe.c
index 05c26bc2c2..706b6db92c 100644
--- a/gdb/blockframe.c
+++ b/gdb/blockframe.c
@@ -448,14 +448,10 @@ find_gnu_ifunc_target_type (CORE_ADDR resolver_funaddr)
 struct frame_info *
 block_innermost_frame (const struct block *block)
 {
-  struct frame_info *frame;
-
   if (block == NULL)
     return NULL;
 
-  frame = get_selected_frame_if_set ();
-  if (frame == NULL)
-    frame = get_current_frame ();
+  frame_info *frame = get_selected_frame (NULL);
   while (frame != NULL)
     {
       const struct block *frame_block = get_frame_block (frame, NULL);
diff --git a/gdb/frame.c b/gdb/frame.c
index ff27b9f00e..aab501ad67 100644
--- a/gdb/frame.c
+++ b/gdb/frame.c
@@ -297,17 +297,15 @@ frame_stash_invalidate (void)
 /* See frame.h  */
 scoped_restore_selected_frame::scoped_restore_selected_frame ()
 {
-  m_fid = get_frame_id (get_selected_frame (NULL));
+  m_lang = current_language->la_language;
+  save_selected_frame (&m_fid, &m_level);
 }
 
 /* See frame.h  */
 scoped_restore_selected_frame::~scoped_restore_selected_frame ()
 {
-  frame_info *frame = frame_find_by_id (m_fid);
-  if (frame == NULL)
-    warning (_("Unable to restore previously selected frame."));
-  else
-    select_frame (frame);
+  restore_selected_frame (m_fid, m_level);
+  set_language (m_lang);
 }
 
 /* Flag to control debugging.  */
@@ -1641,10 +1639,51 @@ get_current_frame (void)
 }
 
 /* The "selected" stack frame is used by default for local and arg
-   access.  May be zero, for no selected frame.  */
-
+   access.  */
+
+/* The "single source of truth" for the selected frame is the
+   SELECTED_FRAME_ID / SELECTED_FRAME_LEVEL pair.  Frame IDs can be
+   saved/restored across reinitializing the frame cache, while
+   frame_info pointers can't (frame_info objects are invalidated).  If
+   we know the corresponding frame_info object, it is cached in
+   SELECTED_FRAME.  If SELECTED_FRAME_ID / SELECTED_FRAME_LEVEL are
+   null_ptid / -1, and the target has stack and is stopped, the
+   selected frame is the current (innermost) frame, otherwise there's
+   no selected frame.  */
+static frame_id selected_frame_id = null_frame_id;
+static int selected_frame_level = -1;
+
+/* The cached frame_info object pointing to the selected frame.
+   Looked up on demand by get_selected_frame.  */
 static struct frame_info *selected_frame;
 
+/* See frame.h.  */
+
+void
+save_selected_frame (frame_id *frame_id, int *frame_level)
+  noexcept
+{
+  *frame_id = selected_frame_id;
+  *frame_level = selected_frame_level;
+}
+
+/* See frame.h.  */
+
+void
+restore_selected_frame (frame_id a_frame_id, int frame_level)
+  noexcept
+{
+  /* get_selected_frame_info never returns level == 0, so we shouldn't
+     see it here either.  */
+  gdb_assert (frame_level != 0);
+
+  selected_frame_id = a_frame_id;
+  selected_frame_level = frame_level;
+
+  /* Will be looked up latter by get_seleted_frame.  */
+  selected_frame = nullptr;
+}
+
 int
 has_stack_frames (void)
 {
@@ -1682,24 +1721,14 @@ get_selected_frame (const char *message)
     {
       if (message != NULL && !has_stack_frames ())
  error (("%s"), message);
-      /* Hey!  Don't trust this.  It should really be re-finding the
- last selected frame of the currently selected thread.  This,
- though, is better than nothing.  */
-      select_frame (get_current_frame ());
+
+      lookup_selected_frame (selected_frame_id, selected_frame_level);
     }
   /* There is always a frame.  */
   gdb_assert (selected_frame != NULL);
   return selected_frame;
 }
 
-/* If there is a selected frame, return it.  Otherwise, return NULL.  */
-
-struct frame_info *
-get_selected_frame_if_set (void)
-{
-  return selected_frame;
-}
-
 /* This is a variant of get_selected_frame() which can be called when
    the inferior does not have a frame; in that case it will return
    NULL instead of calling error().  */
@@ -1712,12 +1741,42 @@ deprecated_safe_get_selected_frame (void)
   return get_selected_frame (NULL);
 }
 
-/* Select frame FI (or NULL - to invalidate the current frame).  */
+/* Select frame FI (or NULL - to invalidate the selected frame).  */
 
 void
 select_frame (struct frame_info *fi)
 {
   selected_frame = fi;
+  selected_frame_level = frame_relative_level (fi);
+  if (selected_frame_level == 0)
+    {
+      /* Treat the current frame especially -- we want to always
+ save/restore it without warning, even if the frame ID changes
+ (see lookup_selected_frame).  E.g.:
+
+  // The current frame is selected, the target had just stopped.
+  {
+    scoped_restore_selected_frame restore_frame;
+    some_operation_that_changes_the_stack ();
+  }
+  // scoped_restore_selected_frame's dtor runs, but the
+  // original frame_id can't be found.  No matter whether it
+  // is found or not, we still end up with the now-current
+  // frame selected.  Warning in lookup_selected_frame in this
+  // case seems pointless.
+
+ Also get_frame_id may access the target's registers/memory,
+ and thus skipping get_frame_id optimizes the common case.
+
+ Saving the selected frame this way makes get_selected_frame
+ and restore_current_frame return/re-select whatever frame is
+ the innermost (current) then.  */
+      selected_frame_level = -1;
+      selected_frame_id = null_frame_id;
+    }
+  else
+    selected_frame_id = get_frame_id (fi);
+
   /* NOTE: cagney/2002-05-04: FI can be NULL.  This occurs when the
      frame is being invalidated.  */
 
diff --git a/gdb/frame.h b/gdb/frame.h
index e835d49f9c..97d50b645c 100644
--- a/gdb/frame.h
+++ b/gdb/frame.h
@@ -181,8 +181,14 @@ class scoped_restore_selected_frame
 
 private:
 
-  /* The ID of the previously selected frame.  */
+  /* The ID and level of the previously selected frame.  */
   struct frame_id m_fid;
+  int m_level;
+
+  /* Save/restore the language as well, because selecting a frame
+     changes the current language to the frame's language if "set
+     language auto".  */
+  enum language m_lang;
 };
 
 /* Methods for constructing and comparing Frame IDs.  */
@@ -329,13 +335,33 @@ extern void reinit_frame_cache (void);
    and then return that thread's previously selected frame.  */
 extern struct frame_info *get_selected_frame (const char *message);
 
-/* If there is a selected frame, return it.  Otherwise, return NULL.  */
-extern struct frame_info *get_selected_frame_if_set (void);
-
-/* Select a specific frame.  NULL, apparently implies re-select the
-   inner most frame.  */
+/* Select a specific frame.  NULL implies re-select the inner most
+   frame.  */
 extern void select_frame (struct frame_info *);
 
+/* Save the frame ID and frame level of the selected frame in FRAME_ID
+   and FRAME_LEVEL, to be restored later with restore_selected_frame.
+   This is preferred over getting the same info out of
+   get_selected_frame directly because this function does not create
+   the selected-frame's frame_info object if it hasn't been created
+   yet, and thus doesn't throw.  */
+extern void save_selected_frame (frame_id *frame_id, int *frame_level)
+  noexcept;
+
+/* Restore selected frame as saved with save_selected_frame.  Does not
+   try to find the corresponding frame_info object.  Instead the next
+   call to get_selected_frame will look it up and cache the result.
+   This function does not throw, it is designed to be safe to called
+   from the destructors of RAII types.  */
+extern void restore_selected_frame (frame_id frame_id, int frame_level)
+  noexcept;
+
+/* Lookup the frame_info object for the selected frame FRAME_ID /
+   FRAME_LEVEL and cache the result.  If FRAME_LEVEL > 0 and the
+   originally selected frame isn't found, warn and select the
+   innermost (current) frame.  */
+extern void lookup_selected_frame (frame_id frame_id, int frame_level);
+
 /* Given a FRAME, return the next (more inner, younger) or previous
    (more outer, older) frame.  */
 extern struct frame_info *get_prev_frame (struct frame_info *);
diff --git a/gdb/gdbthread.h b/gdb/gdbthread.h
index 0166b2000f..edfdf98b3d 100644
--- a/gdb/gdbthread.h
+++ b/gdb/gdbthread.h
@@ -667,6 +667,10 @@ class scoped_restore_current_thread
   frame_id m_selected_frame_id;
   int m_selected_frame_level;
   bool m_was_stopped;
+  /* Save/restore the language as well, because selecting a frame
+     changes the current language to the frame's language if "set
+     language auto".  */
+  enum language m_lang;
 };
 
 /* Returns a pointer into the thread_info corresponding to
diff --git a/gdb/stack.c b/gdb/stack.c
index 265e764dc2..93de451a12 100644
--- a/gdb/stack.c
+++ b/gdb/stack.c
@@ -1836,9 +1836,9 @@ trailing_outermost_frame (int count)
 static void
 select_frame_command_core (struct frame_info *fi, bool ignored)
 {
-  struct frame_info *prev_frame = get_selected_frame_if_set ();
+  frame_info *prev_frame = get_selected_frame (NULL);
   select_frame (fi);
-  if (get_selected_frame_if_set () != prev_frame)
+  if (get_selected_frame (NULL) != prev_frame)
     gdb::observers::user_selected_context_changed.notify (USER_SELECTED_FRAME);
 }
 
@@ -1857,10 +1857,9 @@ select_frame_for_mi (struct frame_info *fi)
 static void
 frame_command_core (struct frame_info *fi, bool ignored)
 {
-  struct frame_info *prev_frame = get_selected_frame_if_set ();
-
+  frame_info *prev_frame = get_selected_frame (nullptr);
   select_frame (fi);
-  if (get_selected_frame_if_set () != prev_frame)
+  if (get_selected_frame (nullptr) != prev_frame)
     gdb::observers::user_selected_context_changed.notify (USER_SELECTED_FRAME);
   else
     print_selected_thread_frame (current_uiout, USER_SELECTED_FRAME);
diff --git a/gdb/thread.c b/gdb/thread.c
index a3c2be7dd0..e0b49abf0c 100644
--- a/gdb/thread.c
+++ b/gdb/thread.c
@@ -1325,20 +1325,25 @@ switch_to_thread (process_stratum_target *proc_target, ptid_t ptid)
   switch_to_thread (thr);
 }
 
-static void
-restore_selected_frame (struct frame_id a_frame_id, int frame_level)
+/* See frame.h.  */
+
+void
+lookup_selected_frame (struct frame_id a_frame_id, int frame_level)
 {
   struct frame_info *frame = NULL;
   int count;
 
-  /* This means there was no selected frame.  */
+  /* This either means there was no selected frame, or the selected
+     frame was the innermost (the current frame).  */
   if (frame_level == -1)
     {
-      select_frame (NULL);
+      select_frame (get_current_frame ());
       return;
     }
 
-  gdb_assert (frame_level >= 0);
+  /* select_frame never saves 0 in selected_frame_level, so we
+     shouldn't see it here.  */
+  gdb_assert (frame_level > 0);
 
   /* Restore by level first, check if the frame id is the same as
      expected.  If that fails, try restoring by frame id.  If that
@@ -1409,22 +1414,14 @@ scoped_restore_current_thread::restore ()
       && target_has_stack
       && target_has_memory)
     restore_selected_frame (m_selected_frame_id, m_selected_frame_level);
+
+  set_language (m_lang);
 }
 
 scoped_restore_current_thread::~scoped_restore_current_thread ()
 {
   if (!m_dont_restore)
-    {
-      try
- {
-  restore ();
- }
-      catch (const gdb_exception &ex)
- {
-  /* We're in a dtor, there's really nothing else we can do
-     but swallow the exception.  */
- }
-    }
+    restore ();
 
   if (m_thread != NULL)
     m_thread->decref ();
@@ -1436,47 +1433,15 @@ scoped_restore_current_thread::scoped_restore_current_thread ()
   m_inf = current_inferior ();
   m_inf->incref ();
 
+  m_lang = current_language->la_language;
+
   if (inferior_ptid != null_ptid)
     {
       m_thread = inferior_thread ();
       m_thread->incref ();
 
-      struct frame_info *frame;
-
       m_was_stopped = m_thread->state == THREAD_STOPPED;
-      if (m_was_stopped
-  && target_has_registers
-  && target_has_stack
-  && target_has_memory)
- {
-  /* When processing internal events, there might not be a
-     selected frame.  If we naively call get_selected_frame
-     here, then we can end up reading debuginfo for the
-     current frame, but we don't generally need the debuginfo
-     at this point.  */
-  frame = get_selected_frame_if_set ();
- }
-      else
- frame = NULL;
-
-      try
- {
-  m_selected_frame_id = get_frame_id (frame);
-  m_selected_frame_level = frame_relative_level (frame);
- }
-      catch (const gdb_exception_error &ex)
- {
-  m_selected_frame_id = null_frame_id;
-  m_selected_frame_level = -1;
-
-  /* Better let this propagate.  */
-  if (ex.error == TARGET_CLOSE_ERROR)
-    {
-      m_thread->decref ();
-      m_inf->decref ();
-      throw;
-    }
- }
+      save_selected_frame (&m_selected_frame_id, &m_selected_frame_level);
     }
   else
     m_thread = NULL;

base-commit: ad8464f799a4c96c7ab8bdfec3f95846cf54f9b0
prerequisite-patch-id: 32ffdda7d7d774bc4df88bf848bcb796559b53ce
prerequisite-patch-id: 02021b74355b70debd344a6e445285c67dfef7d6
prerequisite-patch-id: c87fcf5a54f6805967cbf8ab107606c57d9ecf52
prerequisite-patch-id: ac7dee583d0ffa519c9d1cd89d27664bca68d8c1
prerequisite-patch-id: eac59ae2ea85d2d51e5be1b03e88a5641cc12c22
prerequisite-patch-id: 13da42ad04dc8e2e3bd6a556a0be0e17cf23669b
prerequisite-patch-id: fd3f09fdb58ddc1c595ea014716851f4c8fca48c
prerequisite-patch-id: 8230262cb24020a5b37e915843180e940807ba1f
prerequisite-patch-id: 59bcf41af4057da60e15e3f35fae1fd4e0bf36b9
--
2.14.5


Reply | Threaded
Open this post in threaded view
|

Re: [PATCH 3/3] Make scoped_restore_current_thread's cdtors exception free (RFC)

Pedro Alves-2
On 7/9/20 12:56 PM, Pedro Alves wrote:

> On 7/9/20 4:49 AM, Simon Marchi wrote:
>>>  void
>>>  select_frame (struct frame_info *fi)
>>>  {
>>>    selected_frame = fi;
>>> +  selected_frame_level = frame_relative_level (fi);
>>> +  if (selected_frame_level == 0)
>>> +    {
>>> +      /* Treat the current frame especially -- we want to always
>>> + save/restore it without warning, even if the frame ID changes
>>> + (see restore_selected_frame).  Also get_frame_id may access
>>> + the target's registers/memory, and thus skipping get_frame_id
>>> + optimizes the common case.  */
>>> +      selected_frame_level = -1;
>>> +      selected_frame_id = null_frame_id;
>>> +    }
>>> +  else
>>> +    selected_frame_id = get_frame_id (fi);
>>> +
>>
>> I don't really understand this part, why don't we want to set selected_frame_level
>> and selected_frame_id when the level is 0.  I'm more interested by why it wouldn't
>> be correct or how it would break things, rather than the optimization aspect.
>>
>
> At first, I was recording frame 0 normally, without that special case.
> But running the testsuite revealed regressions in a couple testcases:
>
>  gdb.python/py-unwind-maint.exp
>  gdb.server/bkpt-other-inferior.exp
>
> Both are related to the get_frame_id call.  Before the patch, get_frame_id
> isn't called on the current frame until you try to backtrace from it.
> Adding the get_frame_id call makes the gdb.python/py-unwind-maint.exp testcase
> print the Python unwinder callbacks in a different order, unexpected
> by the testcase.  I didn't look too deeply into this one, but I suspect
> it would just be a matter of adjusting the testcase's expectations.
>
> The gdb.server/bkpt-other-inferior.exp one though is what got me
> thinking.  The testcase makes sure that setting a breakpoint in a
> function that doesn't exist in the remote inferior does not cause
> remote protocol traffic.  After the patch, without the special casing,
> the testcase would fail because the get_frame_id call, coming from
>
>  check_frame_language_change  # called after every command
>   -> get_selected_frame
>     -> restore_selected_frame
>       -> select_frame(get_current_frame())
>          -> get_frame_id
>
> would cause registers and memory to be read from the remote target (when
> restoring the selected frame).  Those accesses aren't wrong, but they
> aren't the kind that the bug the testcase is looking for.  Those were
> about spurious/incorrect remote protocol accesses when parsing the
> function's prologue.
>
> Neither of these cases were strictly incorrect, though they got me
> thinking, and I came to the conclusion that warning when we fail to
> re-find the current frame is pointless, and that avoids having
> unbreak the testcases mentioned, or even redo them differently in
> the gdb.server/bkpt-other-inferior.exp case.
>
> I've updated the comment to make it clearer with an example.
>
> I've also polished the patch some more.  I now renamed
> the current restore_selected_frame to lookup_selected_frame,
> to give space to the new save_selected_frame/restore_selected_frame
> pair.  select_frame_lazy is now restore_selected_frame.
> save_selected_frame/restore_selected_frame are now noexcept, and
> their intro comments explain why.
>
> I declared lookup_selected_frame in frame.h already, thinking that
> it's easier if I move lookup_selected_frame from thread.c to frame.c
> after this is in, instead of before.
>
> I rewrote most of the comments.  For example, I think the
> selected_frame_id/selected_frame_level/selected_frame comments are now
> much clearer.
>
> And I made scoped_restore_selected_frame save/restore the language
> too.  I was only doing that in scoped_restore_current_thread before.
>
> Let me know what you think of this version.

I've pushed this, along with all the PR26199 patches to:

 users/palves/pr26199-busy-loop-target-events

(The version pushed has a couple comment typos fixed compared to the
one posted.)
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH 1/3] Fix crash if connection drops in scoped_restore_current_thread's ctor, part 1

Simon Marchi-4
In reply to this post by Pedro Alves-2
On 2020-07-09 6:51 a.m., Pedro Alves wrote:

>> `that that`
>>
>
> That wasn't a typo, compare with:
>
>  The problem is that this TARGET_CLOSE_ERROR
>  The problem is that those TARGET_CLOSE_ERRORs
>  The problem is that these TARGET_CLOSE_ERRORs
>
> https://english.stackexchange.com/questions/3418/how-do-you-handle-that-that-the-double-that-problem
>
> I'll say "that the" instead to avoid confusion.

Haha, I see now.  Yeah I think any alternative to `that that` is clearer.

Simon
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH 2/3] Fix crash if connection drops in scoped_restore_current_thread's ctor, part 2

Simon Marchi-4
In reply to this post by Pedro Alves-2
On 2020-07-09 7:12 a.m., Pedro Alves wrote:

> Aw, no, I got confused and misremembered how exceptions in ctors work.
> The dtor for scoped_restore_current_thread isn't called, and I assumed
> it was called.  We need to decref the inferior and thread before letting
> the exception propagate, otherwise we leak them.  I'm testing the updated
> version of the patch below, which does that:
>
>  /* Better let this propagate.  */
>  if (ex.error == TARGET_CLOSE_ERROR)
>    {
>      m_thread->decref ();
>      m_inf->decref ();
>      throw;
>    }
>
> It passes multi-target.exp with Asan-enabled GDB.  Running the full
> testsuite now.

Ok, I also had the reasoning about exceptions thrown by constructors,
and thought it might be on purpose that it wasn't decref'ed.

If the references were kept by fields of scoped_restore_current_thread,
then it would work automatically, because their destructor would get
called.

Simon
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH 3/3] Make scoped_restore_current_thread's cdtors exception free (RFC)

Simon Marchi-4
In reply to this post by Pedro Alves-2
> I've also polished the patch some more.  I now renamed
> the current restore_selected_frame to lookup_selected_frame,
> to give space to the new save_selected_frame/restore_selected_frame
> pair.  select_frame_lazy is now restore_selected_frame.
> save_selected_frame/restore_selected_frame are now noexcept, and
> their intro comments explain why.

I noticed there is also a `restore_selected_frame` in infrun.c... I don't
know if it's replicating features also implemented in frame.c?

> @@ -1641,10 +1639,51 @@ get_current_frame (void)
>  }
>  
>  /* The "selected" stack frame is used by default for local and arg
> -   access.  May be zero, for no selected frame.  */
> -
> +   access.  */
> +
> +/* The "single source of truth" for the selected frame is the
> +   SELECTED_FRAME_ID / SELECTED_FRAME_LEVEL pair.  Frame IDs can be
> +   saved/restored across reinitializing the frame cache, while
> +   frame_info pointers can't (frame_info objects are invalidated).  If
> +   we know the corresponding frame_info object, it is cached in
> +   SELECTED_FRAME.  If SELECTED_FRAME_ID / SELECTED_FRAME_LEVEL are
> +   null_ptid / -1, and the target has stack and is stopped, the
> +   selected frame is the current (innermost) frame, otherwise there's
> +   no selected frame.  */

Pedantically, the `otherwise` could let the reader think that if
SELECTED_FRAME_ID / SELECTED_FRAME_LEVEL are not null_ptid / -1, then there
is no selected frame.  I'd suggest moving this out to its own sentence to be
clear.

I'd also make it more explicit here that when the innermost frame is selected,
SELECTED_FRAME_ID / SELECTED_FRAME_LEVEL.  Currently, it says that if
SELECTED_FRAME_ID / SELECTED_FRAME_LEVEL are null_ptid / -1, it means the
innermost frame is selected, but it doesn't imply that if the innermost frame
is selected, then SELECTED_FRAME_ID / SELECTED_FRAME_LEVEL are null_ptid / -1
(difference between if and "if and only if").

I think it would help to separate that in paragraphes to ease reading.

Also, I just noticed that you comment uses `null_ptid` instead of `null_frame_id`,
which just shows how much `null_ptid` is engraved in your brain :).  I also
re-worked the comment for 15 minutes before noticing :).

So:

/* The "single source of truth" for the selected frame is the
   SELECTED_FRAME_ID / SELECTED_FRAME_LEVEL pair.

   Frame IDs can be saved/restored across reinitializing the frame cache, while
   frame_info pointers can't (frame_info objects are invalidated).  If we know
   the corresponding frame_info object, it is cached in SELECTED_FRAME.

   If SELECTED_FRAME_ID / SELECTED_FRAME_LEVEL are null_frame_id / -1, and the
   target has stack and is stopped, the selected frame is the current
   (innermost) frame.  This means that SELECTED_FRAME_LEVEL is never 0 and
   SELECTED_FRAME_ID is never the ID of the innermost frame.

   If SELECTED_FRAME_ID / SELECTED_FRAME_LEVEL are null_frame_id / -1, and the
   target has no stack or is executing, the selected, then there's no selected
   frame.  */

> +static frame_id selected_frame_id = null_frame_id;
> +static int selected_frame_level = -1;
> +
> +/* The cached frame_info object pointing to the selected frame.
> +   Looked up on demand by get_selected_frame.  */
>  static struct frame_info *selected_frame;

Ah ok, in my previous reply I almost said: "so this essentially becomes
a cache for selected_frame_id/selected_frame_level", but changed it because
I *thought* I had understood that it wasn't the case when the frame_level
was > 0.  But if we can word it this way, that makes it simpler to understand.

>  
> +/* See frame.h.  */
> +
> +void
> +save_selected_frame (frame_id *frame_id, int *frame_level)
> +  noexcept
> +{
> +  *frame_id = selected_frame_id;
> +  *frame_level = selected_frame_level;
> +}
> +
> +/* See frame.h.  */
> +
> +void
> +restore_selected_frame (frame_id a_frame_id, int frame_level)
> +  noexcept
> +{
> +  /* get_selected_frame_info never returns level == 0, so we shouldn't
> +     see it here either.  */
> +  gdb_assert (frame_level != 0);

We could also assert that a_frame_id can be null_frame_id IFF frame_level is -1.

> +
> +  selected_frame_id = a_frame_id;
> +  selected_frame_level = frame_level;
> +
> +  /* Will be looked up latter by get_seleted_frame.  */
> +  selected_frame = nullptr;
> +}
> +
>  int
>  has_stack_frames (void)
>  {
> @@ -1682,24 +1721,14 @@ get_selected_frame (const char *message)
>      {
>        if (message != NULL && !has_stack_frames ())
>   error (("%s"), message);
> -      /* Hey!  Don't trust this.  It should really be re-finding the
> - last selected frame of the currently selected thread.  This,
> - though, is better than nothing.  */
> -      select_frame (get_current_frame ());
> +
> +      lookup_selected_frame (selected_frame_id, selected_frame_level);

Could you fix (in this patch or another one) the comment of `get_selected_frame`
to be `/* See frame.h.  */` and make sure that the version in frame.h is the
most up to date one?

> @@ -1712,12 +1741,42 @@ deprecated_safe_get_selected_frame (void)
>    return get_selected_frame (NULL);
>  }
>  
> -/* Select frame FI (or NULL - to invalidate the current frame).  */
> +/* Select frame FI (or NULL - to invalidate the selected frame).  */
>  
>  void
>  select_frame (struct frame_info *fi)
>  {
>    selected_frame = fi;
> +  selected_frame_level = frame_relative_level (fi);
> +  if (selected_frame_level == 0)
> +    {
> +      /* Treat the current frame especially -- we want to always
> + save/restore it without warning, even if the frame ID changes
> + (see lookup_selected_frame).  E.g.:
> +
> +  // The current frame is selected, the target had just stopped.
> +  {
> +    scoped_restore_selected_frame restore_frame;
> +    some_operation_that_changes_the_stack ();
> +  }
> +  // scoped_restore_selected_frame's dtor runs, but the
> +  // original frame_id can't be found.  No matter whether it
> +  // is found or not, we still end up with the now-current
> +  // frame selected.  Warning in lookup_selected_frame in this
> +  // case seems pointless.
> +
> + Also get_frame_id may access the target's registers/memory,
> + and thus skipping get_frame_id optimizes the common case.
> +
> + Saving the selected frame this way makes get_selected_frame
> + and restore_current_frame return/re-select whatever frame is

restore_selected_frame

> @@ -329,13 +335,33 @@ extern void reinit_frame_cache (void);
>     and then return that thread's previously selected frame.  */
>  extern struct frame_info *get_selected_frame (const char *message);
>  
> -/* If there is a selected frame, return it.  Otherwise, return NULL.  */
> -extern struct frame_info *get_selected_frame_if_set (void);
> -
> -/* Select a specific frame.  NULL, apparently implies re-select the
> -   inner most frame.  */
> +/* Select a specific frame.  NULL implies re-select the inner most
> +   frame.  */
>  extern void select_frame (struct frame_info *);
>  
> +/* Save the frame ID and frame level of the selected frame in FRAME_ID
> +   and FRAME_LEVEL, to be restored later with restore_selected_frame.
> +   This is preferred over getting the same info out of
> +   get_selected_frame directly because this function does not create
> +   the selected-frame's frame_info object if it hasn't been created
> +   yet, and thus doesn't throw.  */
> +extern void save_selected_frame (frame_id *frame_id, int *frame_level)
> +  noexcept;
> +
> +/* Restore selected frame as saved with save_selected_frame.  Does not
> +   try to find the corresponding frame_info object.  Instead the next
> +   call to get_selected_frame will look it up and cache the result.
> +   This function does not throw, it is designed to be safe to called
> +   from the destructors of RAII types.  */
> +extern void restore_selected_frame (frame_id frame_id, int frame_level)
> +  noexcept;
> +
> +/* Lookup the frame_info object for the selected frame FRAME_ID /
> +   FRAME_LEVEL and cache the result.  If FRAME_LEVEL > 0 and the
> +   originally selected frame isn't found, warn and select the
> +   innermost (current) frame.  */
> +extern void lookup_selected_frame (frame_id frame_id, int frame_level);

As I mentioned above, I think the comments would be easier to read with
a bit more newlines.  In general, I like to have one short sentence by
itself first, that says what the function does at a very high level.  A
bit like in man pages you have a short description next to the name:

  rm - remove files or directories
  sudo, sudoedit — execute a command as another user
  ssh — OpenSSH remote login client

Then, you can go in details about the behavior of the function in following
paragraphs, making sure to split different ideas in different paragraphs.

At first, I thought it would just be nitpicking to ask people to space
out their comments and code a bit more, but now I think it really does
make a difference in readability (at least, it helps me).

> diff --git a/gdb/stack.c b/gdb/stack.c
> index 265e764dc2..93de451a12 100644
> --- a/gdb/stack.c
> +++ b/gdb/stack.c
> @@ -1836,9 +1836,9 @@ trailing_outermost_frame (int count)
>  static void
>  select_frame_command_core (struct frame_info *fi, bool ignored)
>  {
> -  struct frame_info *prev_frame = get_selected_frame_if_set ();
> +  frame_info *prev_frame = get_selected_frame (NULL);
>    select_frame (fi);
> -  if (get_selected_frame_if_set () != prev_frame)
> +  if (get_selected_frame (NULL) != prev_frame)

I'm telling people that we try to use `nullptr` for new code.  I don't
know what's your opinion on this.  I would just like to have a simple
and clear so we don't have to wonder which of `NULL` and `nullptr` to
use.

>      gdb::observers::user_selected_context_changed.notify (USER_SELECTED_FRAME);
>  }
>  
> @@ -1857,10 +1857,9 @@ select_frame_for_mi (struct frame_info *fi)
>  static void
>  frame_command_core (struct frame_info *fi, bool ignored)
>  {
> -  struct frame_info *prev_frame = get_selected_frame_if_set ();
> -
> +  frame_info *prev_frame = get_selected_frame (nullptr);
>    select_frame (fi);
> -  if (get_selected_frame_if_set () != prev_frame)
> +  if (get_selected_frame (nullptr) != prev_frame)

... especially that you've used `nullptr` here :).

Although maybe that in this case, get_selected_frame could have a default
parameter value of `nullptr`.

> diff --git a/gdb/thread.c b/gdb/thread.c
> index a3c2be7dd0..e0b49abf0c 100644
> --- a/gdb/thread.c
> +++ b/gdb/thread.c
> @@ -1325,20 +1325,25 @@ switch_to_thread (process_stratum_target *proc_target, ptid_t ptid)
>    switch_to_thread (thr);
>  }
>  
> -static void
> -restore_selected_frame (struct frame_id a_frame_id, int frame_level)
> +/* See frame.h.  */
> +
> +void
> +lookup_selected_frame (struct frame_id a_frame_id, int frame_level)
>  {
>    struct frame_info *frame = NULL;
>    int count;
>  
> -  /* This means there was no selected frame.  */
> +  /* This either means there was no selected frame, or the selected
> +     frame was the innermost (the current frame).  */

It's a bit heavy to always precise that `innermost` means the `current`
frame.  Let's choose one name and stick to it, it will be clear enough
on its own.  I'd use `current`, since that's how the functions are named
(e.g. get_current_frame).  I understand there might be a bit confusion
between `selected` and `current` for newcomers, but once you know the
distinction, it's pretty clear.

I'd also add to that comment, what is the action we take as a result.

So, maybe:

  /* This either means there was no selected frame, or the selected frame was
     the current one.  In either case, try to select the current frame.  */


> @@ -1409,22 +1414,14 @@ scoped_restore_current_thread::restore ()
>        && target_has_stack
>        && target_has_memory)
>      restore_selected_frame (m_selected_frame_id, m_selected_frame_level);
> +
> +  set_language (m_lang);
>  }
>  
>  scoped_restore_current_thread::~scoped_restore_current_thread ()
>  {
>    if (!m_dont_restore)
> -    {
> -      try
> - {
> -  restore ();
> - }
> -      catch (const gdb_exception &ex)
> - {
> -  /* We're in a dtor, there's really nothing else we can do
> -     but swallow the exception.  */
> - }
> -    }
> +    restore ();

I don't know if it would be worth it, but I'd like if we could assert (abort
GDB) if an exception does try to exit the destructor.  The `restore` method
is non-trivial and calls into other non-trivial functions, so it would be
possible for a change far far away to cause that to happen.

What do you think of keeping the try/catch, but using `gdb_assert_not_reached`
in it?

Simon
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH 2/3] Fix crash if connection drops in scoped_restore_current_thread's ctor, part 2

Pedro Alves-2
In reply to this post by Simon Marchi-4
On 7/9/20 3:16 PM, Simon Marchi wrote:

> On 2020-07-09 7:12 a.m., Pedro Alves wrote:
>> Aw, no, I got confused and misremembered how exceptions in ctors work.
>> The dtor for scoped_restore_current_thread isn't called, and I assumed
>> it was called.  We need to decref the inferior and thread before letting
>> the exception propagate, otherwise we leak them.  I'm testing the updated
>> version of the patch below, which does that:
>>
>>  /* Better let this propagate.  */
>>  if (ex.error == TARGET_CLOSE_ERROR)
>>    {
>>      m_thread->decref ();
>>      m_inf->decref ();
>>      throw;
>>    }
>>
>> It passes multi-target.exp with Asan-enabled GDB.  Running the full
>> testsuite now.
>
> Ok, I also had the reasoning about exceptions thrown by constructors,
> and thought it might be on purpose that it wasn't decref'ed.
>
> If the references were kept by fields of scoped_restore_current_thread,
> then it would work automatically, because their destructor would get
> called.

Yup.  I wasn't proposing that due to a header dependency issue which
I wasn't looking at addressing right now, but here's a version that
works around it.

From 55813e06b771787d3120f4a57f6e4f8ae25e5ebb Mon Sep 17 00:00:00 2001
From: Pedro Alves <[hidden email]>
Date: Thu, 9 Jul 2020 18:14:09 +0100
Subject: [PATCH] Fix crash if connection drops in
 scoped_restore_current_thread's ctor, part 2

Running the testsuite against an Asan-enabled build of GDB makes
gdb.base/multi-target.exp expose this bug.

scoped_restore_current_thread's ctor calls get_frame_id to record the
selected frame's ID to restore later.  If the frame ID hasn't been
computed yet, it will be computed on the spot, and that will usually
require accessing the target's memory and registers.  If the remote
connection closes, while we're computing the frame ID, the remote
target exits its inferiors, unpushes itself, and throws a
TARGET_CLOSE_ERROR error.  Exiting the inferiors deletes the
inferior's threads.

scoped_restore_current_thread increments the current thread's refcount
to prevent the thread from being deleted from under its feet.
However, the code that does that isn't considering the case of the
thread being deleted from within get_frame_id.  It only increments the
refcount _after_ get_frame_id returns.  So if the current thread is
indeed deleted, the

     tp->incref ();

statement references a stale TP pointer.

Incrementing the refcounts earlier fixes it.

We should probably also let the TARGET_CLOSE_ERROR error propagate in
this case.  That alone would fix it, though it seems better to tweak
the refcount handling too.  And to avoid having to manually decref
before throwing, convert to use gdb::ref_ptr.

Unfortunately, we can't define inferior_ref in inferior.h and then use
it in scoped_restore_current_thread, because
scoped_restore_current_thread is defined before inferior is
(inferior.h includes gdbthread.h).  To break that dependency, we would
have to move scoped_restore_current_thread to its own header.  I'm not
doing that here.

gdb/ChangeLog:

        * gdbthread.h (inferior_ref): Define.
        (scoped_restore_current_thread) <m_thread>: Now a thread_info_ref.
        (scoped_restore_current_thread) <m_inf>: Now an inferior_ref.
        * thread.c
        (scoped_restore_current_thread::restore):
        Adjust to gdb::ref_ptr.
        (scoped_restore_current_thread::~scoped_restore_current_thread):
        Remove manual decref handling.
        (scoped_restore_current_thread::scoped_restore_current_thread):
        Adjust to use
        inferior_ref::new_reference/thread_info_ref::new_reference.
        Incref the thread before calling get_frame_id instead of after.
        Let TARGET_CLOSE_ERROR propagate.
---
 gdb/gdbthread.h | 14 ++++++++++----
 gdb/thread.c    | 25 ++++++++++---------------
 2 files changed, 20 insertions(+), 19 deletions(-)

diff --git a/gdb/gdbthread.h b/gdb/gdbthread.h
index 0166b2000f..ab5771fdb4 100644
--- a/gdb/gdbthread.h
+++ b/gdb/gdbthread.h
@@ -395,6 +395,13 @@ class thread_info : public refcounted_object
 using thread_info_ref
   = gdb::ref_ptr<struct thread_info, refcounted_object_ref_policy>;
 
+/* A gdb::ref_ptr pointer to an inferior.  This would ideally be in
+   inferior.h, but it can't due to header dependencies (inferior.h
+   includes gdbthread.h).  */
+
+using inferior_ref
+  = gdb::ref_ptr<struct inferior, refcounted_object_ref_policy>;
+
 /* Create an empty thread list, or empty the existing one.  */
 extern void init_thread_list (void);
 
@@ -660,10 +667,9 @@ class scoped_restore_current_thread
   void restore ();
 
   bool m_dont_restore = false;
-  /* Use the "class" keyword here, because of a clash with a "thread_info"
-     function in the Darwin API.  */
-  class thread_info *m_thread;
-  inferior *m_inf;
+  thread_info_ref m_thread;
+  inferior_ref m_inf;
+
   frame_id m_selected_frame_id;
   int m_selected_frame_level;
   bool m_was_stopped;
diff --git a/gdb/thread.c b/gdb/thread.c
index f0722d3588..4dce1ef82a 100644
--- a/gdb/thread.c
+++ b/gdb/thread.c
@@ -1396,9 +1396,9 @@ scoped_restore_current_thread::restore ()
  in the mean time exited (or killed, detached, etc.), then don't revert
  back to it, but instead simply drop back to no thread selected.  */
       && m_inf->pid != 0)
-    switch_to_thread (m_thread);
+    switch_to_thread (m_thread.get ());
   else
-    switch_to_inferior_no_thread (m_inf);
+    switch_to_inferior_no_thread (m_inf.get ());
 
   /* The running state of the originally selected thread may have
      changed, so we have to recheck it here.  */
@@ -1425,23 +1425,19 @@ scoped_restore_current_thread::~scoped_restore_current_thread ()
      but swallow the exception.  */
  }
     }
-
-  if (m_thread != NULL)
-    m_thread->decref ();
-  m_inf->decref ();
 }
 
 scoped_restore_current_thread::scoped_restore_current_thread ()
 {
-  m_thread = NULL;
-  m_inf = current_inferior ();
+  m_inf = inferior_ref::new_reference (current_inferior ());
 
   if (inferior_ptid != null_ptid)
     {
-      thread_info *tp = inferior_thread ();
+      m_thread = thread_info_ref::new_reference (inferior_thread ());
+
       struct frame_info *frame;
 
-      m_was_stopped = tp->state == THREAD_STOPPED;
+      m_was_stopped = m_thread->state == THREAD_STOPPED;
       if (m_was_stopped
   && target_has_registers
   && target_has_stack
@@ -1466,13 +1462,12 @@ scoped_restore_current_thread::scoped_restore_current_thread ()
  {
   m_selected_frame_id = null_frame_id;
   m_selected_frame_level = -1;
- }
 
-      tp->incref ();
-      m_thread = tp;
+  /* Better let this propagate.  */
+  if (ex.error == TARGET_CLOSE_ERROR)
+    throw;
+ }
     }
-
-  m_inf->incref ();
 }
 
 /* See gdbthread.h.  */

base-commit: ad8464f799a4c96c7ab8bdfec3f95846cf54f9b0
prerequisite-patch-id: 32ffdda7d7d774bc4df88bf848bcb796559b53ce
prerequisite-patch-id: 02021b74355b70debd344a6e445285c67dfef7d6
prerequisite-patch-id: c87fcf5a54f6805967cbf8ab107606c57d9ecf52
prerequisite-patch-id: ac7dee583d0ffa519c9d1cd89d27664bca68d8c1
prerequisite-patch-id: eac59ae2ea85d2d51e5be1b03e88a5641cc12c22
prerequisite-patch-id: 13da42ad04dc8e2e3bd6a556a0be0e17cf23669b
prerequisite-patch-id: fd3f09fdb58ddc1c595ea014716851f4c8fca48c
prerequisite-patch-id: 8230262cb24020a5b37e915843180e940807ba1f
--
2.14.5

Reply | Threaded
Open this post in threaded view
|

Re: [PATCH 2/3] Fix crash if connection drops in scoped_restore_current_thread's ctor, part 2

Simon Marchi-4
On 2020-07-09 1:23 p.m., Pedro Alves wrote:
> Unfortunately, we can't define inferior_ref in inferior.h and then use
> it in scoped_restore_current_thread, because
> scoped_restore_current_thread is defined before inferior is
> (inferior.h includes gdbthread.h).  To break that dependency, we would
> have to move scoped_restore_current_thread to its own header.  I'm not
> doing that here.

Cool, that's fine with me.

Simon

Reply | Threaded
Open this post in threaded view
|

Re: [PATCH 3/3] Make scoped_restore_current_thread's cdtors exception free (RFC)

Pedro Alves-2
In reply to this post by Simon Marchi-4
On 7/9/20 4:40 PM, Simon Marchi wrote:
>> I've also polished the patch some more.  I now renamed
>> the current restore_selected_frame to lookup_selected_frame,
>> to give space to the new save_selected_frame/restore_selected_frame
>> pair.  select_frame_lazy is now restore_selected_frame.
>> save_selected_frame/restore_selected_frame are now noexcept, and
>> their intro comments explain why.
>
> I noticed there is also a `restore_selected_frame` in infrun.c... I don't
> know if it's replicating features also implemented in frame.c?

Yeah, I saw that too, but didn't look too deeply, given I was more
looking at validating the whole idea.  I think that can be replaced,
yes.  Done now.

BTW, I don't know why I missed it before, but we _already_ don't
warn the user if we fail to restore frame #0:

  /* Warn the user.  */
  if (frame_level > 0 && !current_uiout->is_mi_like_p ())
      ^^^^^^^^^^^^^^^
    {

And that came in with:

 commit 6c95b8df7fef5273da71c34775918c554aae0ea8
 Author:     Pedro Alves <[hidden email]>
 AuthorDate: Mon Oct 19 09:51:43 2009 +0000

    2009-10-19  Pedro Alves  <[hidden email]>
                Stan Shebs  <[hidden email]>
   
            Add base multi-executable/process support to GDB.

I'm pretty sure I wrote that bit of the patch...

>
>> @@ -1641,10 +1639,51 @@ get_current_frame (void)
>>  }
>>  
>>  /* The "selected" stack frame is used by default for local and arg
>> -   access.  May be zero, for no selected frame.  */
>> -
>> +   access.  */
>> +
>> +/* The "single source of truth" for the selected frame is the
>> +   SELECTED_FRAME_ID / SELECTED_FRAME_LEVEL pair.  Frame IDs can be
>> +   saved/restored across reinitializing the frame cache, while
>> +   frame_info pointers can't (frame_info objects are invalidated).  If
>> +   we know the corresponding frame_info object, it is cached in
>> +   SELECTED_FRAME.  If SELECTED_FRAME_ID / SELECTED_FRAME_LEVEL are
>> +   null_ptid / -1, and the target has stack and is stopped, the
>> +   selected frame is the current (innermost) frame, otherwise there's
>> +   no selected frame.  */
>
> Pedantically, the `otherwise` could let the reader think that if
> SELECTED_FRAME_ID / SELECTED_FRAME_LEVEL are not null_ptid / -1, then there
> is no selected frame.  I'd suggest moving this out to its own sentence to be
> clear.
>
> I'd also make it more explicit here that when the innermost frame is selected,
> SELECTED_FRAME_ID / SELECTED_FRAME_LEVEL.  Currently, it says that if
> SELECTED_FRAME_ID / SELECTED_FRAME_LEVEL are null_ptid / -1, it means the
> innermost frame is selected, but it doesn't imply that if the innermost frame
> is selected, then SELECTED_FRAME_ID / SELECTED_FRAME_LEVEL are null_ptid / -1
> (difference between if and "if and only if").
>
> I think it would help to separate that in paragraphes to ease reading.
>
> Also, I just noticed that you comment uses `null_ptid` instead of `null_frame_id`,
> which just shows how much `null_ptid` is engraved in your brain :).  I also
> re-worked the comment for 15 minutes before noticing :).
>
> So:
>
> /* The "single source of truth" for the selected frame is the
>    SELECTED_FRAME_ID / SELECTED_FRAME_LEVEL pair.
>
>    Frame IDs can be saved/restored across reinitializing the frame cache, while
>    frame_info pointers can't (frame_info objects are invalidated).  If we know
>    the corresponding frame_info object, it is cached in SELECTED_FRAME.
>
>    If SELECTED_FRAME_ID / SELECTED_FRAME_LEVEL are null_frame_id / -1, and the
>    target has stack and is stopped, the selected frame is the current
>    (innermost) frame.  This means that SELECTED_FRAME_LEVEL is never 0 and
>    SELECTED_FRAME_ID is never the ID of the innermost frame.
>
>    If SELECTED_FRAME_ID / SELECTED_FRAME_LEVEL are null_frame_id / -1, and the
>    target has no stack or is executing, the selected, then there's no selected
>    frame.  */

I like it.

>
>> +static frame_id selected_frame_id = null_frame_id;
>> +static int selected_frame_level = -1;
>> +
>> +/* The cached frame_info object pointing to the selected frame.
>> +   Looked up on demand by get_selected_frame.  */
>>  static struct frame_info *selected_frame;
>
> Ah ok, in my previous reply I almost said: "so this essentially becomes
> a cache for selected_frame_id/selected_frame_level", but changed it because
> I *thought* I had understood that it wasn't the case when the frame_level
> was > 0.  But if we can word it this way, that makes it simpler to understand.
>
>>  
>> +/* See frame.h.  */
>> +
>> +void
>> +save_selected_frame (frame_id *frame_id, int *frame_level)
>> +  noexcept
>> +{
>> +  *frame_id = selected_frame_id;
>> +  *frame_level = selected_frame_level;
>> +}
>> +
>> +/* See frame.h.  */
>> +
>> +void
>> +restore_selected_frame (frame_id a_frame_id, int frame_level)
>> +  noexcept
>> +{
>> +  /* get_selected_frame_info never returns level == 0, so we shouldn't
>> +     see it here either.  */
>> +  gdb_assert (frame_level != 0);
>
> We could also assert that a_frame_id can be null_frame_id IFF frame_level is -1.

Done.

>
>> +
>> +  selected_frame_id = a_frame_id;
>> +  selected_frame_level = frame_level;
>> +
>> +  /* Will be looked up latter by get_seleted_frame.  */
>> +  selected_frame = nullptr;
>> +}
>> +
>>  int
>>  has_stack_frames (void)
>>  {
>> @@ -1682,24 +1721,14 @@ get_selected_frame (const char *message)
>>      {
>>        if (message != NULL && !has_stack_frames ())
>>   error (("%s"), message);
>> -      /* Hey!  Don't trust this.  It should really be re-finding the
>> - last selected frame of the currently selected thread.  This,
>> - though, is better than nothing.  */
>> -      select_frame (get_current_frame ());
>> +
>> +      lookup_selected_frame (selected_frame_id, selected_frame_level);
>
> Could you fix (in this patch or another one) the comment of `get_selected_frame`
> to be `/* See frame.h.  */` and make sure that the version in frame.h is the
> most up to date one?

Done.

>
>> @@ -1712,12 +1741,42 @@ deprecated_safe_get_selected_frame (void)
>>    return get_selected_frame (NULL);
>>  }
>>  
>> -/* Select frame FI (or NULL - to invalidate the current frame).  */
>> +/* Select frame FI (or NULL - to invalidate the selected frame).  */
>>  
>>  void
>>  select_frame (struct frame_info *fi)
>>  {
>>    selected_frame = fi;
>> +  selected_frame_level = frame_relative_level (fi);
>> +  if (selected_frame_level == 0)
>> +    {
>> +      /* Treat the current frame especially -- we want to always
>> + save/restore it without warning, even if the frame ID changes
>> + (see lookup_selected_frame).  E.g.:
>> +
>> +  // The current frame is selected, the target had just stopped.
>> +  {
>> +    scoped_restore_selected_frame restore_frame;
>> +    some_operation_that_changes_the_stack ();
>> +  }
>> +  // scoped_restore_selected_frame's dtor runs, but the
>> +  // original frame_id can't be found.  No matter whether it
>> +  // is found or not, we still end up with the now-current
>> +  // frame selected.  Warning in lookup_selected_frame in this
>> +  // case seems pointless.
>> +
>> + Also get_frame_id may access the target's registers/memory,
>> + and thus skipping get_frame_id optimizes the common case.
>> +
>> + Saving the selected frame this way makes get_selected_frame
>> + and restore_current_frame return/re-select whatever frame is
>
> restore_selected_frame

Fixed.

>
>> @@ -329,13 +335,33 @@ extern void reinit_frame_cache (void);
>>     and then return that thread's previously selected frame.  */
>>  extern struct frame_info *get_selected_frame (const char *message);
>>  
>> -/* If there is a selected frame, return it.  Otherwise, return NULL.  */
>> -extern struct frame_info *get_selected_frame_if_set (void);
>> -
>> -/* Select a specific frame.  NULL, apparently implies re-select the
>> -   inner most frame.  */
>> +/* Select a specific frame.  NULL implies re-select the inner most
>> +   frame.  */
>>  extern void select_frame (struct frame_info *);
>>  
>> +/* Save the frame ID and frame level of the selected frame in FRAME_ID
>> +   and FRAME_LEVEL, to be restored later with restore_selected_frame.
>> +   This is preferred over getting the same info out of
>> +   get_selected_frame directly because this function does not create
>> +   the selected-frame's frame_info object if it hasn't been created
>> +   yet, and thus doesn't throw.  */
>> +extern void save_selected_frame (frame_id *frame_id, int *frame_level)
>> +  noexcept;
>> +
>> +/* Restore selected frame as saved with save_selected_frame.  Does not
>> +   try to find the corresponding frame_info object.  Instead the next
>> +   call to get_selected_frame will look it up and cache the result.
>> +   This function does not throw, it is designed to be safe to called
>> +   from the destructors of RAII types.  */
>> +extern void restore_selected_frame (frame_id frame_id, int frame_level)
>> +  noexcept;
>> +
>> +/* Lookup the frame_info object for the selected frame FRAME_ID /
>> +   FRAME_LEVEL and cache the result.  If FRAME_LEVEL > 0 and the
>> +   originally selected frame isn't found, warn and select the
>> +   innermost (current) frame.  */
>> +extern void lookup_selected_frame (frame_id frame_id, int frame_level);
>
> As I mentioned above, I think the comments would be easier to read with
> a bit more newlines.  In general, I like to have one short sentence by
> itself first, that says what the function does at a very high level.  A
> bit like in man pages you have a short description next to the name:
>
>   rm - remove files or directories
>   sudo, sudoedit — execute a command as another user
>   ssh — OpenSSH remote login client
>
> Then, you can go in details about the behavior of the function in following
> paragraphs, making sure to split different ideas in different paragraphs.
>
> At first, I thought it would just be nitpicking to ask people to space
> out their comments and code a bit more, but now I think it really does
> make a difference in readability (at least, it helps me).
>

Done.

>> diff --git a/gdb/stack.c b/gdb/stack.c
>> index 265e764dc2..93de451a12 100644
>> --- a/gdb/stack.c
>> +++ b/gdb/stack.c
>> @@ -1836,9 +1836,9 @@ trailing_outermost_frame (int count)
>>  static void
>>  select_frame_command_core (struct frame_info *fi, bool ignored)
>>  {
>> -  struct frame_info *prev_frame = get_selected_frame_if_set ();
>> +  frame_info *prev_frame = get_selected_frame (NULL);
>>    select_frame (fi);
>> -  if (get_selected_frame_if_set () != prev_frame)
>> +  if (get_selected_frame (NULL) != prev_frame)
>
> I'm telling people that we try to use `nullptr` for new code.  I don't
> know what's your opinion on this.  I would just like to have a simple
> and clear so we don't have to wonder which of `NULL` and `nullptr` to
> use.

I just tend to use nullptr for new code, unless the surrounding code
is already using NULL.  In this case I just copy/pasted from elsewhere
and didn't even realize it.  NULL and nullptr are both cyan and thus
look the same on my emacs.  :-)

I do tend to stick with writing "NULL" in comments.

>
>>      gdb::observers::user_selected_context_changed.notify (USER_SELECTED_FRAME);
>>  }
>>  
>> @@ -1857,10 +1857,9 @@ select_frame_for_mi (struct frame_info *fi)
>>  static void
>>  frame_command_core (struct frame_info *fi, bool ignored)
>>  {
>> -  struct frame_info *prev_frame = get_selected_frame_if_set ();
>> -
>> +  frame_info *prev_frame = get_selected_frame (nullptr);
>>    select_frame (fi);
>> -  if (get_selected_frame_if_set () != prev_frame)
>> +  if (get_selected_frame (nullptr) != prev_frame)
>
> ... especially that you've used `nullptr` here :).
>
> Although maybe that in this case, get_selected_frame could have a default
> parameter value of `nullptr`.
>

Done.

>> diff --git a/gdb/thread.c b/gdb/thread.c
>> index a3c2be7dd0..e0b49abf0c 100644
>> --- a/gdb/thread.c
>> +++ b/gdb/thread.c
>> @@ -1325,20 +1325,25 @@ switch_to_thread (process_stratum_target *proc_target, ptid_t ptid)
>>    switch_to_thread (thr);
>>  }
>>  
>> -static void
>> -restore_selected_frame (struct frame_id a_frame_id, int frame_level)
>> +/* See frame.h.  */
>> +
>> +void
>> +lookup_selected_frame (struct frame_id a_frame_id, int frame_level)
>>  {
>>    struct frame_info *frame = NULL;
>>    int count;
>>  
>> -  /* This means there was no selected frame.  */
>> +  /* This either means there was no selected frame, or the selected
>> +     frame was the innermost (the current frame).  */
>
> It's a bit heavy to always precise that `innermost` means the `current`
> frame.  Let's choose one name and stick to it, it will be clear enough
> on its own.  I'd use `current`, since that's how the functions are named
> (e.g. get_current_frame).  I understand there might be a bit confusion
> between `selected` and `current` for newcomers, but once you know the
> distinction, it's pretty clear.

Sure.

>
> I'd also add to that comment, what is the action we take as a result.
>
> So, maybe:
>
>   /* This either means there was no selected frame, or the selected frame was
>      the current one.  In either case, try to select the current frame.  */

OK.  I'll remove the "try" though.

>
>
>> @@ -1409,22 +1414,14 @@ scoped_restore_current_thread::restore ()
>>        && target_has_stack
>>        && target_has_memory)
>>      restore_selected_frame (m_selected_frame_id, m_selected_frame_level);
>> +
>> +  set_language (m_lang);
>>  }
>>  
>>  scoped_restore_current_thread::~scoped_restore_current_thread ()
>>  {
>>    if (!m_dont_restore)
>> -    {
>> -      try
>> - {
>> -  restore ();
>> - }
>> -      catch (const gdb_exception &ex)
>> - {
>> -  /* We're in a dtor, there's really nothing else we can do
>> -     but swallow the exception.  */
>> - }
>> -    }
>> +    restore ();
>
> I don't know if it would be worth it, but I'd like if we could assert (abort
> GDB) if an exception does try to exit the destructor.  The `restore` method
> is non-trivial and calls into other non-trivial functions, so it would be
> possible for a change far far away to cause that to happen.

It will already abort.  Destructors are noexcept by default, so if an exception
escapes a destructor, std::terminate() is called, and that calls abort by default.

>
> What do you think of keeping the try/catch, but using `gdb_assert_not_reached`
> in it?

Not sure.  If we do that, we do get a nicer error message.  However if the user
says "n" to "Quit this debugging session" we still abort.

/home/pedro/brno/pedro/gdb/binutils-gdb-2/build/../src/gdb/thread.c:1441: internal-error: scoped_restore_current_thread::~scoped_restore_current_thread(): unexpected exception thrown from destructor: hello
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Quit this debugging session? (y or n) n

This is a bug, please report it.  For instructions, see:
<https://www.gnu.org/software/gdb/bugs/>.

/home/pedro/brno/pedro/gdb/binutils-gdb-2/build/../src/gdb/thread.c:1441: internal-error: scoped_restore_current_thread::~scoped_restore_current_thread(): unexpected exception thrown from destructor: hello
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Create a core file of GDB? (y or n) n
terminate called after throwing an instance of 'gdb_exception_quit'
Aborted (core dumped)

Maybe it would be interesting to add a variant of internal_error that did
not throw a quit, so the user could swallow the exception...  Maybe consider
wrapping that as a generic facility to add to all non-trivial RAII destructors
we have?  Like a function that takes a function_view as parameter, so
we would write:

foo::~foo ()
{
  safe_dtor (__FILE__, __LINE__, [&] ()
    {
      restore ();
    });
}

Even better, add a SAFE_DTOR macro using similar magic SCOPE_EXIT
macro uses to be able to write:

foo::~foo ()
{
  SAFE_DTOR { restore (); };
}

Here's the current version of the patch.

From 63310d189bb2516e0fec2b0922041388a5a96374 Mon Sep 17 00:00:00 2001
From: Pedro Alves <[hidden email]>
Date: Thu, 9 Jul 2020 18:26:15 +0100
Subject: [PATCH] Make scoped_restore_current_thread's cdtors exception free
 (RFC)

If the remote target closes while we're reading registers/memory for
restoring the selected frame in scoped_restore_current_thread's dtor,
the corresponding TARGET_CLOSE_ERROR error is swallowed by the
scoped_restore_current_thread's dtor, because letting exceptions
escape from a dtor is bad.  It isn't great to lose that errors like
that, though.  I've been thinking about how to avoid it, and I came up
with this patch.

The idea here is to make scoped_restore_current_thread's dtor do as
little as possible, to avoid any work that might throw in the first
place.  And to do that, instead of having the dtor call
restore_selected_frame, which re-finds the previously selected frame,
just record the frame_id/level of the desired selected frame, and have
get_selected_frame find the frame the next time it is called.  In
effect, this implements most of Cagney's suggestion, here:

  /* On demand, create the selected frame and then return it.  If the
     selected frame can not be created, this function prints then throws
     an error.  When MESSAGE is non-NULL, use it for the error message,
     otherwize use a generic error message.  */
  /* FIXME: cagney/2002-11-28: At present, when there is no selected
     frame, this function always returns the current (inner most) frame.
     It should instead, when a thread has previously had its frame
     selected (but not resumed) and the frame cache invalidated, find
     and then return that thread's previously selected frame.  */
  extern struct frame_info *get_selected_frame (const char *message);

The only thing missing to fully implement that would be to make
reinit_frame_cache just clear selected_frame instead of calling
select_frame(NULL), and the call select_frame(NULL) explicitly in the
places where we really wanted reinit_frame_cache to go back to the
current frame too.  That can done separately, though, I'm not
proposing to do that in this patch.

Note that this patch renames restore_selected_frame to
lookup_selected_frame, and adds a new restore_selected_frame function
that doesn't throw, to be paired with the also-new save_selected_frame
function.

There's a restore_selected_frame function in infrun.c that I think can
be replaced by the new one in frame.c.

Also done in this patch is make the get_selected_frame's parameter be
optional, so that we don't have to pass down nullptr explicitly all
over the place.

lookup_selected_frame should really move from thread.c to frame.c, but
I didn't do that here, just to avoid churn in the patch while it
collects comments.  I did make it extern and declared it in frame.h
already, preparing for the move.  I will do the move as a follow up
patch if people agree with this approach.

Incidentally, this patch alone would fix the crashes fixed by the
previous patches in the series, because with this,
scoped_restore_current_thread's constructor doesn't throw either.

gdb/ChangeLog:

        * blockframe.c (block_innermost_frame): Use get_selected_frame.
        * frame.c
        (scoped_restore_selected_frame::scoped_restore_selected_frame):
        Use save_selected_frame.  Save language as well.
        (scoped_restore_selected_frame::~scoped_restore_selected_frame):
        Use restore_selected_frame, and restore language as well.
        (selected_frame_id, selected_frame_level): New.
        (selected_frame): Update comments.
        (save_selected_frame, restore_selected_frame): New.
        (get_selected_frame): Use lookup_selected_frame.
        (get_selected_frame_if_set): Delete.
        (select_frame): Record selected_frame_level and selected_frame_id.
        * frame.h (scoped_restore_selected_frame) <m_level, m_lang>: New
        fields.
        (get_selected_frame): Make 'message' parameter optional.
        (get_selected_frame_if_set): Delete declaration.
        (select_frame): Update comments.
        (save_selected_frame, restore_selected_frame)
        (lookup_selected_frame): Declare.
        * gdbthread.h (scoped_restore_current_thread) <m_lang>: New field.
        * infrun.c (struct infcall_control_state) <selected_frame_level>:
        New field.
        (save_infcall_control_state): Use save_selected_frame.
        (restore_selected_frame): Delete.
        (restore_infcall_control_state): Use restore_selected_frame.
        * stack.c (select_frame_command_core, frame_command_core): Use
        get_selected_frame.
        * thread.c (restore_selected_frame): Rename to ...
        (lookup_selected_frame): ... this and make extern.  Select the
        current frame if the frame level is -1.
        (scoped_restore_current_thread::restore): Also restore the
        language.
        (scoped_restore_current_thread::~scoped_restore_current_thread):
        Don't try/catch.
        (scoped_restore_current_thread::scoped_restore_current_thread):
        Save the language as well.  Use save_selected_frame.
---
 gdb/blockframe.c |   6 +--
 gdb/frame.c      | 117 +++++++++++++++++++++++++++++++++++++++++++------------
 gdb/frame.h      |  53 +++++++++++++++++++------
 gdb/gdbthread.h  |   4 ++
 gdb/infrun.c     |  40 ++++---------------
 gdb/stack.c      |   9 ++---
 gdb/thread.c     |  64 ++++++++----------------------
 7 files changed, 168 insertions(+), 125 deletions(-)

diff --git a/gdb/blockframe.c b/gdb/blockframe.c
index 05c26bc2c2..0b868d8341 100644
--- a/gdb/blockframe.c
+++ b/gdb/blockframe.c
@@ -448,14 +448,10 @@ find_gnu_ifunc_target_type (CORE_ADDR resolver_funaddr)
 struct frame_info *
 block_innermost_frame (const struct block *block)
 {
-  struct frame_info *frame;
-
   if (block == NULL)
     return NULL;
 
-  frame = get_selected_frame_if_set ();
-  if (frame == NULL)
-    frame = get_current_frame ();
+  frame_info *frame = get_selected_frame ();
   while (frame != NULL)
     {
       const struct block *frame_block = get_frame_block (frame, NULL);
diff --git a/gdb/frame.c b/gdb/frame.c
index ff27b9f00e..86cf9e4b69 100644
--- a/gdb/frame.c
+++ b/gdb/frame.c
@@ -297,17 +297,15 @@ frame_stash_invalidate (void)
 /* See frame.h  */
 scoped_restore_selected_frame::scoped_restore_selected_frame ()
 {
-  m_fid = get_frame_id (get_selected_frame (NULL));
+  m_lang = current_language->la_language;
+  save_selected_frame (&m_fid, &m_level);
 }
 
 /* See frame.h  */
 scoped_restore_selected_frame::~scoped_restore_selected_frame ()
 {
-  frame_info *frame = frame_find_by_id (m_fid);
-  if (frame == NULL)
-    warning (_("Unable to restore previously selected frame."));
-  else
-    select_frame (frame);
+  restore_selected_frame (m_fid, m_level);
+  set_language (m_lang);
 }
 
 /* Flag to control debugging.  */
@@ -1641,10 +1639,63 @@ get_current_frame (void)
 }
 
 /* The "selected" stack frame is used by default for local and arg
-   access.  May be zero, for no selected frame.  */
-
+   access.
+
+   The "single source of truth" for the selected frame is the
+   SELECTED_FRAME_ID / SELECTED_FRAME_LEVEL pair.
+
+   Frame IDs can be saved/restored across reinitializing the frame
+   cache, while frame_info pointers can't (frame_info objects are
+   invalidated).  If we know the corresponding frame_info object, it
+   is cached in SELECTED_FRAME.
+
+   If SELECTED_FRAME_ID / SELECTED_FRAME_LEVEL are null_frame_id / -1,
+   and the target has stack and is stopped, the selected frame is the
+   current (innermost) frame.  This means that SELECTED_FRAME_LEVEL is
+   never 0 and SELECTED_FRAME_ID is never the ID of the innermost
+   frame.
+
+   If SELECTED_FRAME_ID / SELECTED_FRAME_LEVEL are null_frame_id / -1,
+   and the target has no stack or is executing, then there's no
+   selected frame.  */
+static frame_id selected_frame_id = null_frame_id;
+static int selected_frame_level = -1;
+
+/* The cached frame_info object pointing to the selected frame.
+   Looked up on demand by get_selected_frame.  */
 static struct frame_info *selected_frame;
 
+/* See frame.h.  */
+
+void
+save_selected_frame (frame_id *frame_id, int *frame_level)
+  noexcept
+{
+  *frame_id = selected_frame_id;
+  *frame_level = selected_frame_level;
+}
+
+/* See frame.h.  */
+
+void
+restore_selected_frame (frame_id frame_id, int frame_level)
+  noexcept
+{
+  /* save_selected_frame never returns level == 0, so we shouldn't see
+     it here either.  */
+  gdb_assert (frame_level != 0);
+
+  /* FRAME_ID can be null_frame_id only IFF frame_level is -1.  */
+  gdb_assert ((frame_level == -1 && !frame_id_p (frame_id))
+      || (frame_level != -1 && frame_id_p (frame_id)));
+
+  selected_frame_id = frame_id;
+  selected_frame_level = frame_level;
+
+  /* Will be looked up later by get_seleted_frame.  */
+  selected_frame = nullptr;
+}
+
 int
 has_stack_frames (void)
 {
@@ -1671,9 +1722,7 @@ has_stack_frames (void)
   return 1;
 }
 
-/* Return the selected frame.  Always non-NULL (unless there isn't an
-   inferior sufficient for creating a frame) in which case an error is
-   thrown.  */
+/* See frame.h.  */
 
 struct frame_info *
 get_selected_frame (const char *message)
@@ -1682,24 +1731,14 @@ get_selected_frame (const char *message)
     {
       if (message != NULL && !has_stack_frames ())
  error (("%s"), message);
-      /* Hey!  Don't trust this.  It should really be re-finding the
- last selected frame of the currently selected thread.  This,
- though, is better than nothing.  */
-      select_frame (get_current_frame ());
+
+      lookup_selected_frame (selected_frame_id, selected_frame_level);
     }
   /* There is always a frame.  */
   gdb_assert (selected_frame != NULL);
   return selected_frame;
 }
 
-/* If there is a selected frame, return it.  Otherwise, return NULL.  */
-
-struct frame_info *
-get_selected_frame_if_set (void)
-{
-  return selected_frame;
-}
-
 /* This is a variant of get_selected_frame() which can be called when
    the inferior does not have a frame; in that case it will return
    NULL instead of calling error().  */
@@ -1712,12 +1751,42 @@ deprecated_safe_get_selected_frame (void)
   return get_selected_frame (NULL);
 }
 
-/* Select frame FI (or NULL - to invalidate the current frame).  */
+/* Select frame FI (or NULL - to invalidate the selected frame).  */
 
 void
 select_frame (struct frame_info *fi)
 {
   selected_frame = fi;
+  selected_frame_level = frame_relative_level (fi);
+  if (selected_frame_level == 0)
+    {
+      /* Treat the current frame especially -- we want to always
+ save/restore it without warning, even if the frame ID changes
+ (see lookup_selected_frame).  E.g.:
+
+  // The current frame is selected, the target had just stopped.
+  {
+    scoped_restore_selected_frame restore_frame;
+    some_operation_that_changes_the_stack ();
+  }
+  // scoped_restore_selected_frame's dtor runs, but the
+  // original frame_id can't be found.  No matter whether it
+  // is found or not, we still end up with the now-current
+  // frame selected.  Warning in lookup_selected_frame in this
+  // case seems pointless.
+
+ Also get_frame_id may access the target's registers/memory,
+ and thus skipping get_frame_id optimizes the common case.
+
+ Saving the selected frame this way makes get_selected_frame
+ and restore_current_frame return/re-select whatever frame is
+ the innermost (current) then.  */
+      selected_frame_level = -1;
+      selected_frame_id = null_frame_id;
+    }
+  else
+    selected_frame_id = get_frame_id (fi);
+
   /* NOTE: cagney/2002-05-04: FI can be NULL.  This occurs when the
      frame is being invalidated.  */
 
diff --git a/gdb/frame.h b/gdb/frame.h
index e835d49f9c..4901de1bf5 100644
--- a/gdb/frame.h
+++ b/gdb/frame.h
@@ -181,8 +181,14 @@ class scoped_restore_selected_frame
 
 private:
 
-  /* The ID of the previously selected frame.  */
+  /* The ID and level of the previously selected frame.  */
   struct frame_id m_fid;
+  int m_level;
+
+  /* Save/restore the language as well, because selecting a frame
+     changes the current language to the frame's language if "set
+     language auto".  */
+  enum language m_lang;
 };
 
 /* Methods for constructing and comparing Frame IDs.  */
@@ -318,24 +324,49 @@ extern int has_stack_frames (void);
    modifies the target invalidating the frame cache).  */
 extern void reinit_frame_cache (void);
 
-/* On demand, create the selected frame and then return it.  If the
-   selected frame can not be created, this function prints then throws
-   an error.  When MESSAGE is non-NULL, use it for the error message,
-   otherwize use a generic error message.  */
+/* Return the selected frame.  Always returns non-NULL.  If there
+   isn't an inferior sufficient for creating a frame, an error is
+   thrown.  When MESSAGE is non-NULL, use it for the error message,
+   otherwise use a generic error message.  */
 /* FIXME: cagney/2002-11-28: At present, when there is no selected
    frame, this function always returns the current (inner most) frame.
    It should instead, when a thread has previously had its frame
    selected (but not resumed) and the frame cache invalidated, find
    and then return that thread's previously selected frame.  */
-extern struct frame_info *get_selected_frame (const char *message);
-
-/* If there is a selected frame, return it.  Otherwise, return NULL.  */
-extern struct frame_info *get_selected_frame_if_set (void);
+extern struct frame_info *get_selected_frame (const char *message = nullptr);
 
-/* Select a specific frame.  NULL, apparently implies re-select the
-   inner most frame.  */
+/* Select a specific frame.  NULL implies re-select the inner most
+   frame.  */
 extern void select_frame (struct frame_info *);
 
+/* Save the frame ID and frame level of the selected frame in FRAME_ID
+   and FRAME_LEVEL, to be restored later with restore_selected_frame.
+
+   This is preferred over getting the same info out of
+   get_selected_frame directly because this function does not create
+   the selected-frame's frame_info object if it hasn't been created
+   yet, and thus is more efficient and doesn't throw.  */
+extern void save_selected_frame (frame_id *frame_id, int *frame_level)
+  noexcept;
+
+/* Restore selected frame as saved with save_selected_frame.
+
+   Does not try to find the corresponding frame_info object.  Instead
+   the next call to get_selected_frame will look it up and cache the
+   result.
+
+   This function does not throw.  It is designed to be safe to called
+   from the destructors of RAII types.  */
+extern void restore_selected_frame (frame_id frame_id, int frame_level)
+  noexcept;
+
+/* Lookup the frame_info object for the selected frame FRAME_ID /
+   FRAME_LEVEL and cache the result.
+
+   If FRAME_LEVEL > 0 and the originally selected frame isn't found,
+   warn and select the innermost (current) frame.  */
+extern void lookup_selected_frame (frame_id frame_id, int frame_level);
+
 /* Given a FRAME, return the next (more inner, younger) or previous
    (more outer, older) frame.  */
 extern struct frame_info *get_prev_frame (struct frame_info *);
diff --git a/gdb/gdbthread.h b/gdb/gdbthread.h
index ab5771fdb4..da20dd4b4e 100644
--- a/gdb/gdbthread.h
+++ b/gdb/gdbthread.h
@@ -673,6 +673,10 @@ class scoped_restore_current_thread
   frame_id m_selected_frame_id;
   int m_selected_frame_level;
   bool m_was_stopped;
+  /* Save/restore the language as well, because selecting a frame
+     changes the current language to the frame's language if "set
+     language auto".  */
+  enum language m_lang;
 };
 
 /* Returns a pointer into the thread_info corresponding to
diff --git a/gdb/infrun.c b/gdb/infrun.c
index 31266109a6..ad16385029 100644
--- a/gdb/infrun.c
+++ b/gdb/infrun.c
@@ -9266,8 +9266,10 @@ struct infcall_control_state
   enum stop_stack_kind stop_stack_dummy = STOP_NONE;
   int stopped_by_random_signal = 0;
 
-  /* ID if the selected frame when the inferior function call was made.  */
+  /* ID and level of the selected frame when the inferior function
+     call was made.  */
   struct frame_id selected_frame_id {};
+  int selected_frame_level = -1;
 };
 
 /* Save all of the information associated with the inferior<==>gdb
@@ -9296,27 +9298,12 @@ save_infcall_control_state ()
   inf_status->stop_stack_dummy = stop_stack_dummy;
   inf_status->stopped_by_random_signal = stopped_by_random_signal;
 
-  inf_status->selected_frame_id = get_frame_id (get_selected_frame (NULL));
+  save_selected_frame (&inf_status->selected_frame_id,
+       &inf_status->selected_frame_level);
 
   return inf_status;
 }
 
-static void
-restore_selected_frame (const frame_id &fid)
-{
-  frame_info *frame = frame_find_by_id (fid);
-
-  /* If inf_status->selected_frame_id is NULL, there was no previously
-     selected frame.  */
-  if (frame == NULL)
-    {
-      warning (_("Unable to restore previously selected frame."));
-      return;
-    }
-
-  select_frame (frame);
-}
-
 /* Restore inferior session state to INF_STATUS.  */
 
 void
@@ -9344,21 +9331,8 @@ restore_infcall_control_state (struct infcall_control_state *inf_status)
 
   if (target_has_stack)
     {
-      /* The point of the try/catch is that if the stack is clobbered,
-         walking the stack might encounter a garbage pointer and
-         error() trying to dereference it.  */
-      try
- {
-  restore_selected_frame (inf_status->selected_frame_id);
- }
-      catch (const gdb_exception_error &ex)
- {
-  exception_fprintf (gdb_stderr, ex,
-     "Unable to restore previously selected frame:\n");
-  /* Error in restoring the selected frame.  Select the
-     innermost frame.  */
-  select_frame (get_current_frame ());
- }
+      restore_selected_frame (inf_status->selected_frame_id,
+      inf_status->selected_frame_level);
     }
 
   delete inf_status;
diff --git a/gdb/stack.c b/gdb/stack.c
index 265e764dc2..dbc02f5ff0 100644
--- a/gdb/stack.c
+++ b/gdb/stack.c
@@ -1836,9 +1836,9 @@ trailing_outermost_frame (int count)
 static void
 select_frame_command_core (struct frame_info *fi, bool ignored)
 {
-  struct frame_info *prev_frame = get_selected_frame_if_set ();
+  frame_info *prev_frame = get_selected_frame ();
   select_frame (fi);
-  if (get_selected_frame_if_set () != prev_frame)
+  if (get_selected_frame () != prev_frame)
     gdb::observers::user_selected_context_changed.notify (USER_SELECTED_FRAME);
 }
 
@@ -1857,10 +1857,9 @@ select_frame_for_mi (struct frame_info *fi)
 static void
 frame_command_core (struct frame_info *fi, bool ignored)
 {
-  struct frame_info *prev_frame = get_selected_frame_if_set ();
-
+  frame_info *prev_frame = get_selected_frame ();
   select_frame (fi);
-  if (get_selected_frame_if_set () != prev_frame)
+  if (get_selected_frame () != prev_frame)
     gdb::observers::user_selected_context_changed.notify (USER_SELECTED_FRAME);
   else
     print_selected_thread_frame (current_uiout, USER_SELECTED_FRAME);
diff --git a/gdb/thread.c b/gdb/thread.c
index 4dce1ef82a..32f2ccdbd6 100644
--- a/gdb/thread.c
+++ b/gdb/thread.c
@@ -1325,20 +1325,26 @@ switch_to_thread (process_stratum_target *proc_target, ptid_t ptid)
   switch_to_thread (thr);
 }
 
-static void
-restore_selected_frame (struct frame_id a_frame_id, int frame_level)
+/* See frame.h.  */
+
+void
+lookup_selected_frame (struct frame_id a_frame_id, int frame_level)
 {
   struct frame_info *frame = NULL;
   int count;
 
-  /* This means there was no selected frame.  */
+  /* This either means there was no selected frame, or the selected
+     frame was the current frame.  In either case, select the current
+     frame.  */
   if (frame_level == -1)
     {
-      select_frame (NULL);
+      select_frame (get_current_frame ());
       return;
     }
 
-  gdb_assert (frame_level >= 0);
+  /* select_frame never saves 0 in SELECTED_FRAME_LEVEL, so we
+     shouldn't see it here.  */
+  gdb_assert (frame_level > 0);
 
   /* Restore by level first, check if the frame id is the same as
      expected.  If that fails, try restoring by frame id.  If that
@@ -1409,64 +1415,28 @@ scoped_restore_current_thread::restore ()
       && target_has_stack
       && target_has_memory)
     restore_selected_frame (m_selected_frame_id, m_selected_frame_level);
+
+  set_language (m_lang);
 }
 
 scoped_restore_current_thread::~scoped_restore_current_thread ()
 {
   if (!m_dont_restore)
-    {
-      try
- {
-  restore ();
- }
-      catch (const gdb_exception &ex)
- {
-  /* We're in a dtor, there's really nothing else we can do
-     but swallow the exception.  */
- }
-    }
+    restore ();
 }
 
 scoped_restore_current_thread::scoped_restore_current_thread ()
 {
   m_inf = inferior_ref::new_reference (current_inferior ());
 
+  m_lang = current_language->la_language;
+
   if (inferior_ptid != null_ptid)
     {
       m_thread = thread_info_ref::new_reference (inferior_thread ());
 
-      struct frame_info *frame;
-
       m_was_stopped = m_thread->state == THREAD_STOPPED;
-      if (m_was_stopped
-  && target_has_registers
-  && target_has_stack
-  && target_has_memory)
- {
-  /* When processing internal events, there might not be a
-     selected frame.  If we naively call get_selected_frame
-     here, then we can end up reading debuginfo for the
-     current frame, but we don't generally need the debuginfo
-     at this point.  */
-  frame = get_selected_frame_if_set ();
- }
-      else
- frame = NULL;
-
-      try
- {
-  m_selected_frame_id = get_frame_id (frame);
-  m_selected_frame_level = frame_relative_level (frame);
- }
-      catch (const gdb_exception_error &ex)
- {
-  m_selected_frame_id = null_frame_id;
-  m_selected_frame_level = -1;
-
-  /* Better let this propagate.  */
-  if (ex.error == TARGET_CLOSE_ERROR)
-    throw;
- }
+      save_selected_frame (&m_selected_frame_id, &m_selected_frame_level);
     }
 }
 

base-commit: ad8464f799a4c96c7ab8bdfec3f95846cf54f9b0
prerequisite-patch-id: 32ffdda7d7d774bc4df88bf848bcb796559b53ce
prerequisite-patch-id: 02021b74355b70debd344a6e445285c67dfef7d6
prerequisite-patch-id: c87fcf5a54f6805967cbf8ab107606c57d9ecf52
prerequisite-patch-id: ac7dee583d0ffa519c9d1cd89d27664bca68d8c1
prerequisite-patch-id: eac59ae2ea85d2d51e5be1b03e88a5641cc12c22
prerequisite-patch-id: 13da42ad04dc8e2e3bd6a556a0be0e17cf23669b
prerequisite-patch-id: fd3f09fdb58ddc1c595ea014716851f4c8fca48c
prerequisite-patch-id: 8230262cb24020a5b37e915843180e940807ba1f
prerequisite-patch-id: b06932368faf983af6404ae711dfed17e5e32c6f
--
2.14.5

Reply | Threaded
Open this post in threaded view
|

Re: [PATCH 3/3] Make scoped_restore_current_thread's cdtors exception free (RFC)

Simon Marchi-4
>> I don't know if it would be worth it, but I'd like if we could assert (abort
>> GDB) if an exception does try to exit the destructor.  The `restore` method
>> is non-trivial and calls into other non-trivial functions, so it would be
>> possible for a change far far away to cause that to happen.
>
> It will already abort.  Destructors are noexcept by default, so if an exception
> escapes a destructor, std::terminate() is called, and that calls abort by default.

Oh, didn't know that!  I thought it was just "undefined behavior".

>> What do you think of keeping the try/catch, but using `gdb_assert_not_reached`
>> in it?
>
> Not sure.  If we do that, we do get a nicer error message.  However if the user
> says "n" to "Quit this debugging session" we still abort.
>
> /home/pedro/brno/pedro/gdb/binutils-gdb-2/build/../src/gdb/thread.c:1441: internal-error: scoped_restore_current_thread::~scoped_restore_current_thread(): unexpected exception thrown from destructor: hello
> A problem internal to GDB has been detected,
> further debugging may prove unreliable.
> Quit this debugging session? (y or n) n
>
> This is a bug, please report it.  For instructions, see:
> <https://www.gnu.org/software/gdb/bugs/>.
>
> /home/pedro/brno/pedro/gdb/binutils-gdb-2/build/../src/gdb/thread.c:1441: internal-error: scoped_restore_current_thread::~scoped_restore_current_thread(): unexpected exception thrown from destructor: hello
> A problem internal to GDB has been detected,
> further debugging may prove unreliable.
> Create a core file of GDB? (y or n) n
> terminate called after throwing an instance of 'gdb_exception_quit'
> Aborted (core dumped)
>
> Maybe it would be interesting to add a variant of internal_error that did
> not throw a quit, so the user could swallow the exception...  Maybe consider
> wrapping that as a generic facility to add to all non-trivial RAII destructors
> we have?  Like a function that takes a function_view as parameter, so
> we would write:
>
> foo::~foo ()
> {
>   safe_dtor (__FILE__, __LINE__, [&] ()
>     {
>       restore ();
>     });
> }
>
> Even better, add a SAFE_DTOR macro using similar magic SCOPE_EXIT
> macro uses to be able to write:
>
> foo::~foo ()
> {
>   SAFE_DTOR { restore (); };
> }

That's fancier than what I hoped for :).  My goal was just to make sure
we catch it if we ever make a change that causes an exception to escape.
Although I wouldn't be against what you proposed.

> Here's the current version of the patch.

That looks fine to me.

Simon
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH 0/3] Fix crash if connection drops in scoped_restore_current_thread's ctor

Pedro Alves-2
In reply to this post by Pedro Alves-2
On 7/9/20 12:31 AM, Pedro Alves wrote:

> (I have internet again: found a sim card of a different operator that
> works.  This will do until the communications tower near me is
> repaired and get I fiber back...)
>
> This series fixes the crashes exposed by the
> gdb.multi/multi-target.exp testcase when run against an Asan-enabled
> GDB build, initially reported by Simon here:
>
>   https://sourceware.org/pipermail/gdb-patches/2020-July/170222.html
>
> The first two patches fix the crashes, and we should probably put them
> in GDB 10.
>
> The last patch is a follow up that avoids swallowing exceptions in
> scoped_restore_current_thread's dtor that I'm thinking would be a bit
> too invasive to put in GDB 10, I think it could do with a longer
> baking period in master.
>
> Pedro Alves (3):
>   Fix crash if connection drops in scoped_restore_current_thread's ctor,
>     part 1
>   Fix crash if connection drops in scoped_restore_current_thread's ctor,
>     part 2
>   Make scoped_restore_current_thread's cdtors exception free (RFC)

I've now merged patches 1 and 2.  Patch 3 will wait until after the branch
is cut.
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH 0/3] Fix crash if connection drops in scoped_restore_current_thread's ctor

Simon Marchi-4
On 2020-07-10 7:02 p.m., Pedro Alves wrote:

> On 7/9/20 12:31 AM, Pedro Alves wrote:
>> (I have internet again: found a sim card of a different operator that
>> works.  This will do until the communications tower near me is
>> repaired and get I fiber back...)
>>
>> This series fixes the crashes exposed by the
>> gdb.multi/multi-target.exp testcase when run against an Asan-enabled
>> GDB build, initially reported by Simon here:
>>
>>   https://sourceware.org/pipermail/gdb-patches/2020-July/170222.html
>>
>> The first two patches fix the crashes, and we should probably put them
>> in GDB 10.
>>
>> The last patch is a follow up that avoids swallowing exceptions in
>> scoped_restore_current_thread's dtor that I'm thinking would be a bit
>> too invasive to put in GDB 10, I think it could do with a longer
>> baking period in master.
>>
>> Pedro Alves (3):
>>   Fix crash if connection drops in scoped_restore_current_thread's ctor,
>>     part 1
>>   Fix crash if connection drops in scoped_restore_current_thread's ctor,
>>     part 2
>>   Make scoped_restore_current_thread's cdtors exception free (RFC)
>
> I've now merged patches 1 and 2.  Patch 3 will wait until after the branch
> is cut.
>
I now see this other ASan failure when running gdb.multi/multi-target.exp, it's in the
attached asan.log.  There are colors, so it's easier to read if you "cat" it in your
terminal.  It looks familiar, because it happens in scoped_restore_current_thread's dtor
(not ctor), but maybe it just happens to be there but could happen at any other point.

It happens when starting test_continue with non-stop on, just after having completed
test_continue with non-stop off.  It's when GDB does "monitor exit".

Unfortunately, the "freed by thread T0 here" stack trace is again truncated, probably
because the stack is too deep for the portion of the stack ASan captures.  But I managed
to attach to GDB with GDB using gdb_interact and capture it (I broke on unpush_and_perror),
here's the equivalent GDB backtrace:

#0  xfree<void> (ptr=0x621004a5d900) at /home/smarchi/src/binutils-gdb/gdb/../gdbsupport/common-utils.h:63
#1  0x0000000001626260 in call_freefun (h=0x20f8da0 <frame_cache_obstack>, old_chunk=0x621004a5d900) at /home/smarchi/src/binutils-gdb/libiberty/obstack.c:103
#2  0x0000000001626c87 in _obstack_free (h=0x20f8da0 <frame_cache_obstack>, obj=0x0) at /home/smarchi/src/binutils-gdb/libiberty/obstack.c:280
#3  0x000000000098ae26 in reinit_frame_cache () at /home/smarchi/src/binutils-gdb/gdb/frame.c:1856
#4  0x0000000001098adf in switch_to_no_thread () at /home/smarchi/src/binutils-gdb/gdb/thread.c:1301
#5  0x0000000000acf544 in switch_to_inferior_no_thread (inf=0x615000244d00) at /home/smarchi/src/binutils-gdb/gdb/inferior.c:626
#6  0x0000000000e7c38c in remote_unpush_target (target=0x6170000c0c00) at /home/smarchi/src/binutils-gdb/gdb/remote.c:5521
#7  0x0000000000e92db6 in unpush_and_perror (target=0x6170000c0c00, string=0x191d400 "Remote communication error.  Target disconnected.") at /home/smarchi/src/binutils-gdb/gdb/remote.c:9101
#8  0x0000000000e930c7 in remote_target::readchar (this=0x6170000c0c00, timeout=2) at /home/smarchi/src/binutils-gdb/gdb/remote.c:9141
#9  0x0000000000e9576f in remote_target::getpkt_or_notif_sane_1 (this=0x6170000c0c00, buf=0x6170000c0c18, forever=0, expecting_notif=0, is_notif=0x0) at /home/smarchi/src/binutils-gdb/gdb/remote.c:9683
#10 0x0000000000e961c9 in remote_target::getpkt_sane (this=0x6170000c0c00, buf=0x6170000c0c18, forever=0) at /home/smarchi/src/binutils-gdb/gdb/remote.c:9790
#11 0x0000000000e95545 in remote_target::getpkt (this=0x6170000c0c00, buf=0x6170000c0c18, forever=0) at /home/smarchi/src/binutils-gdb/gdb/remote.c:9623
#12 0x0000000000e91ba3 in remote_target::remote_read_bytes_1 (this=0x6170000c0c00, memaddr=0x7ffff78bc38d, myaddr=0x7fff7dca59a0 "", len_units=1, unit_size=1, xfered_len_units=0x7fff7dca58b0) at /home/smarchi/src/binutils-gdb/gdb/remote.c:8860
#13 0x0000000000e9240c in remote_target::remote_read_bytes (this=0x6170000c0c00, memaddr=0x7ffff78bc38d, myaddr=0x7fff7dca59a0 "", len=1, unit_size=1, xfered_len=0x7fff7dca58b0) at /home/smarchi/src/binutils-gdb/gdb/remote.c:8987
#14 0x0000000000e9b821 in remote_target::xfer_partial (this=0x6170000c0c00, object=TARGET_OBJECT_MEMORY, annex=0x0, readbuf=0x7fff7dca59a0 "", writebuf=0x0, offset=140737346519949, len=1, xfered_len=0x7fff7dca58b0) at /home/smarchi/src/binutils-gdb/gdb/remote.c:10987
#15 0x000000000104fd3a in raw_memory_xfer_partial (ops=0x6170000c0c00, readbuf=0x7fff7dca59a0 "", writebuf=0x0, memaddr=140737346519949, len=1, xfered_len=0x7fff7dca58b0) at /home/smarchi/src/binutils-gdb/gdb/target.c:918
#16 0x0000000001050425 in memory_xfer_partial_1 (ops=0x6170000c0c00, object=TARGET_OBJECT_MEMORY, readbuf=0x7fff7dca59a0 "", writebuf=0x0, memaddr=140737346519949, len=1, xfered_len=0x7fff7dca58b0) at /home/smarchi/src/binutils-gdb/gdb/target.c:1047
#17 0x0000000001050608 in memory_xfer_partial (ops=0x6170000c0c00, object=TARGET_OBJECT_MEMORY, readbuf=0x7fff7dca59a0 "", writebuf=0x0, memaddr=140737346519949, len=1, xfered_len=0x7fff7dca58b0) at /home/smarchi/src/binutils-gdb/gdb/target.c:1076
#18 0x0000000001050b92 in target_xfer_partial (ops=0x6170000c0c00, object=TARGET_OBJECT_MEMORY, annex=0x0, readbuf=0x7fff7dca59a0 "", writebuf=0x0, offset=140737346519949, len=1, xfered_len=0x7fff7dca58b0) at /home/smarchi/src/binutils-gdb/gdb/target.c:1133
#19 0x0000000001051a7b in target_read_partial (ops=0x6170000c0c00, object=TARGET_OBJECT_MEMORY, annex=0x0, buf=0x7fff7dca59a0 "", offset=140737346519949, len=1, xfered_len=0x7fff7dca58b0) at /home/smarchi/src/binutils-gdb/gdb/target.c:1379
#20 0x0000000001051c59 in target_read (ops=0x6170000c0c00, object=TARGET_OBJECT_MEMORY, annex=0x0, buf=0x7fff7dca59a0 "", offset=140737346519949, len=1) at /home/smarchi/src/binutils-gdb/gdb/target.c:1419
#21 0x0000000001051178 in target_read_memory (memaddr=0x7ffff78bc38d, myaddr=0x7fff7dca59a0 "", len=1) at /home/smarchi/src/binutils-gdb/gdb/target.c:1222
#22 0x00000000004b4731 in amd64_stack_frame_destroyed_p (gdbarch=0x6210027e8510, pc=0x7ffff78bc38d) at /home/smarchi/src/binutils-gdb/gdb/amd64-tdep.c:2909
#23 0x00000000004b4822 in amd64_epilogue_frame_sniffer (self=0x169df00 <amd64_epilogue_frame_unwind>, this_frame=0x621004a5d9e0, this_prologue_cache=0x621004a5d9f8) at /home/smarchi/src/binutils-gdb/gdb/amd64-tdep.c:2924
#24 0x0000000000981048 in frame_unwind_try_unwinder (this_frame=0x621004a5d9e0, this_cache=0x621004a5d9f8, unwinder=0x169df00 <amd64_epilogue_frame_unwind>) at /home/smarchi/src/binutils-gdb/gdb/frame-unwind.c:128
#25 0x000000000098126d in frame_unwind_find_by_frame (this_frame=0x621004a5d9e0, this_cache=0x621004a5d9f8) at /home/smarchi/src/binutils-gdb/gdb/frame-unwind.c:186
#26 0x0000000000983c9d in compute_frame_id (fi=0x621004a5d9e0) at /home/smarchi/src/binutils-gdb/gdb/frame.c:546
#27 0x0000000000984167 in get_frame_id (fi=0x621004a5d9e0) at /home/smarchi/src/binutils-gdb/gdb/frame.c:582
#28 0x0000000001098eef in restore_selected_frame (a_frame_id=..., frame_level=0) at /home/smarchi/src/binutils-gdb/gdb/thread.c:1355
#29 0x00000000010992f8 in scoped_restore_current_thread::restore (this=0x7fff7dca5f30) at /home/smarchi/src/binutils-gdb/gdb/thread.c:1411
#30 0x0000000001099355 in scoped_restore_current_thread::~scoped_restore_current_thread (this=0x7fff7dca5f30, __in_chrg=<optimized out>) at /home/smarchi/src/binutils-gdb/gdb/thread.c:1420
#31 0x0000000000aeab84 in do_target_wait (wait_ptid=..., ecs=0x7fff7dca6290, options=1) at /home/smarchi/src/binutils-gdb/gdb/infrun.c:3670
#32 0x0000000000aecbe3 in fetch_inferior_event () at /home/smarchi/src/binutils-gdb/gdb/infrun.c:3965
#33 0x0000000000aa8097 in inferior_event_handler (event_type=INF_REG_EVENT) at /home/smarchi/src/binutils-gdb/gdb/inf-loop.c:42
#34 0x0000000000eab8b7 in remote_async_inferior_event_handler (data=0x6170000d6a00) at /home/smarchi/src/binutils-gdb/gdb/remote.c:14166
#35 0x00000000004ca110 in check_async_event_handlers () at /home/smarchi/src/binutils-gdb/gdb/async-event.c:295
#36 0x00000000015bef41 in gdb_do_one_event () at /home/smarchi/src/binutils-gdb/gdbsupport/event-loop.cc:194
#37 0x0000000000bfd50e in start_event_loop () at /home/smarchi/src/binutils-gdb/gdb/main.c:356
#38 0x0000000000bfd816 in captured_command_loop () at /home/smarchi/src/binutils-gdb/gdb/main.c:416
#39 0x0000000000c00c25 in captured_main (data=0x7fff7dca65d0) at /home/smarchi/src/binutils-gdb/gdb/main.c:1253
#40 0x0000000000c00cb5 in gdb_main (args=0x7fff7dca65d0) at /home/smarchi/src/binutils-gdb/gdb/main.c:1268
#41 0x0000000000414d9e in main (argc=5, argv=0x7fff7dca6738) at /home/smarchi/src/binutils-gdb/gdb/gdb.c:32


The problem seems to be:

- We create a new frame_info object in restore_selected_frame (by calling find_relative_frame)
- The frame is allocated on the frame_cache_obstack
- In frame_unwind_try_unwinder, we try to find an unwinder for that frame
- While trying unwinders, memory read fails because the remote target closes, because of "monitor exit"
- That calls reinit_frame_cache (as shown above), which resets frame_cache_obstack
- When handling the exception in frame_unwind_try_unwinder, we try to set some things on the frame_info
  object (like *this_cache, which in fact tries to write into frame_info::prologue_cache), but the
  frame_info object is no more, it went away with the obstack.

Simon

asan.log (12K) Download Attachment
12