[RFC][PATCH] fix gdb segv when objfile can't be opened

classic Classic list List threaded Threaded
20 messages Options
Reply | Threaded
Open this post in threaded view
|

[RFC][PATCH] fix gdb segv when objfile can't be opened

Mike Gulick
This fixes PR 16577.

This patch changes gdb_bfd_map_section to issue a warning rather than an
error if it is unable to read the object file, and sets the size of the
section/frame that it attempted to read to 0 on error.

The description of gdb_bfd_map_section states that it will try to read
or map the contents of the section SECT, and if successful, the section
data is returned and *size is set to the size of the section data.  This
function was throwing an error and leaving *size as-is.  Setting the
section size to 0 indicates to dwarf2_build_frame_info that there is no
data to read, otherwise it will try to read from an invalid frame
pointer.

Changing the error to a warning allows this to be handled gracefully.
Additionally, the error was clobbering the breakpoint output indicating
the current frame (function name, arguments, source file, and line number).
E.g.

Thread 3 "foo" hit Breakpoint 1, BFD: reopening
/tmp/jna-1013829440/jna2973250704389291330.tmp: No such file or directory

BFD: reopening /tmp/jna-1013829440/jna2973250704389291330.tmp: No such
file or directory

(gdb)

While the "BFD: reopening ..." messages will still appear interspersed
in the
breakpoint output, the current frame info is now displayed:

Thread 3 "foo" hit Breakpoint 1, BFD: reopening
/tmp/jna-1013829440/jna1875755897659885075.tmp: No such file or directory

BFD: reopening /tmp/jna-1013829440/jna1875755897659885075.tmp: No such
file or directory

warning: Can't read data for section '.eh_frame' in file
'/tmp/jna-1013829440/jna1875755897659885075.tmp'
do_something () at file.cpp:80
80    {
(gdb)
---
 gdb/gdb_bfd.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/gdb/gdb_bfd.c b/gdb/gdb_bfd.c
index 29080b8..229f5ae 100644
--- a/gdb/gdb_bfd.c
+++ b/gdb/gdb_bfd.c
@@ -705,9 +705,15 @@ gdb_bfd_map_section (asection *sectp, bfd_size_type
*size)
 
   data = NULL;
   if (!bfd_get_full_section_contents (abfd, sectp, &data))
-    error (_("Can't read data for section '%s' in file '%s'"),
-       bfd_get_section_name (abfd, sectp),
-       bfd_get_filename (abfd));
+    {
+      warning (_("Can't read data for section '%s' in file '%s'"),
+           bfd_get_section_name (abfd, sectp),
+           bfd_get_filename (abfd));
+      /* Section is invalid -- set size to 0 and return NULL */
+      descriptor->size = 0;
+      *size = descriptor->size;
+      return (const gdb_byte *) NULL;
+    }
   descriptor->data = data;
 
  done:
--
2.1.4

Reply | Threaded
Open this post in threaded view
|

Re: [RFC][PATCH] fix gdb segv when objfile can't be opened

Mike Gulick
I apologize for the improperly formatted patch -- I'm really struggling
to get thunderbird to behave as I want.

Here is an updated patch.  I would have sent it with git send-email, but
I could not figure out the proper way to add this preface before the
patch (without it looking like part of the commit message).

---
From 5dee04076518554e4baae864569d6f4faee9b685 Mon Sep 17 00:00:00 2001
From: Mike Gulick <[hidden email]>
Date: Wed, 18 Oct 2017 16:04:27 -0400
Subject: [PATCH] fix gdb segv when objfile can't be opened

This fixes PR 16577.

This patch changes gdb_bfd_map_section to issue a warning rather than an
error if it is unable to read the object file, and sets the size of the
section/frame that it attempted to read to 0 on error.

The description of gdb_bfd_map_section states that it will try to read
or map the contents of the section SECT, and if successful, the section
data is returned and *SIZE is set to the size of the section data.  This
function was throwing an error and leaving *size as-is.  Setting the
section size to 0 indicates to dwarf2_build_frame_info that there is no
data to read, otherwise it will try to read from an invalid frame
pointer.

Changing the error to a warning allows this to be handled gracefully.
Additionally, the error was clobbering the breakpoint output indicating
the current frame (function name, arguments, source file, and line number).
E.g.

Thread 3 "foo" hit Breakpoint 1, BFD: reopening /tmp/jna-1013829440/jna2973250704389291330.tmp: No such file or directory

BFD: reopening /tmp/jna-1013829440/jna2973250704389291330.tmp: No such file or directory

(gdb)

While the "BFD: reopening ..." messages will still appear interspersed in the
breakpoint output, the current frame info is now displayed:

Thread 3 "foo" hit Breakpoint 1, BFD: reopening /tmp/jna-1013829440/jna1875755897659885075.tmp: No such file or directory

BFD: reopening /tmp/jna-1013829440/jna1875755897659885075.tmp: No such file or directory

warning: Can't read data for section '.eh_frame' in file '/tmp/jna-1013829440/jna1875755897659885075.tmp'
do_something () at file.cpp:80
80 {
(gdb)
---
 gdb/gdb_bfd.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/gdb/gdb_bfd.c b/gdb/gdb_bfd.c
index 29080b8..229f5ae 100644
--- a/gdb/gdb_bfd.c
+++ b/gdb/gdb_bfd.c
@@ -705,9 +705,15 @@ gdb_bfd_map_section (asection *sectp, bfd_size_type *size)
 
   data = NULL;
   if (!bfd_get_full_section_contents (abfd, sectp, &data))
-    error (_("Can't read data for section '%s' in file '%s'"),
-   bfd_get_section_name (abfd, sectp),
-   bfd_get_filename (abfd));
+    {
+      warning (_("Can't read data for section '%s' in file '%s'"),
+       bfd_get_section_name (abfd, sectp),
+       bfd_get_filename (abfd));
+      /* Section is invalid -- set size to 0 and return NULL */
+      descriptor->size = 0;
+      *size = descriptor->size;
+      return (const gdb_byte *) NULL;
+    }
   descriptor->data = data;
 
  done:
--
2.1.4

Reply | Threaded
Open this post in threaded view
|

Re: [RFC][PATCH] fix gdb segv when objfile can't be opened

Simon Marchi
On 2017-10-19 11:59, Mike Gulick wrote:
> I apologize for the improperly formatted patch -- I'm really struggling
> to get thunderbird to behave as I want.
>
> Here is an updated patch.  I would have sent it with git send-email,
> but
> I could not figure out the proper way to add this preface before the
> patch (without it looking like part of the commit message).

Hi Mike,

Thanks, I was able to apply this version correctly.

If I have a short comment that's not meant to be in the commit message,
I usually
include it in brackets like this:

   [Re-sending this patch because the first try was not formatted
correctly.]

If it's longer you can always end it with a line "Actual commit
message:".  Either way, it's not really a big deal, as long as it's
clear.  You can use the --annotate option of git-send-email to edit the
message before sending it.

> ---
> From 5dee04076518554e4baae864569d6f4faee9b685 Mon Sep 17 00:00:00 2001
> From: Mike Gulick <[hidden email]>
> Date: Wed, 18 Oct 2017 16:04:27 -0400
> Subject: [PATCH] fix gdb segv when objfile can't be opened
>
> This fixes PR 16577.
>
> This patch changes gdb_bfd_map_section to issue a warning rather than
> an
> error if it is unable to read the object file, and sets the size of the
> section/frame that it attempted to read to 0 on error.
>
> The description of gdb_bfd_map_section states that it will try to read
> or map the contents of the section SECT, and if successful, the section
> data is returned and *SIZE is set to the size of the section data.  
> This
> function was throwing an error and leaving *size as-is.  Setting the
> section size to 0 indicates to dwarf2_build_frame_info that there is no
> data to read, otherwise it will try to read from an invalid frame
> pointer.
>
> Changing the error to a warning allows this to be handled gracefully.
> Additionally, the error was clobbering the breakpoint output indicating
> the current frame (function name, arguments, source file, and line
> number).
> E.g.
>
> Thread 3 "foo" hit Breakpoint 1, BFD: reopening
> /tmp/jna-1013829440/jna2973250704389291330.tmp: No such file or
> directory
>
> BFD: reopening /tmp/jna-1013829440/jna2973250704389291330.tmp: No such
> file or directory

For some reason, I am not able to reproduce the crash using the
instructions in the bug report, and gdb master.

(gdb) up
#1  0x00007ffff78d525a in sleep () from /lib/x86_64-linux-gnu/libc.so.6
(gdb)
BFD: reopening ./badlib.so: No such file or directory

BFD: reopening ./badlib.so: No such file or directory

Can't read data for section '.eh_frame' in file './badlib.so'
(gdb)
Initial frame selected; you cannot go up.
(gdb)
Initial frame selected; you cannot go up.
(gdb)
Initial frame selected; you cannot go up.
(gdb) bt
#0  0x00007ffff78d52f0 in nanosleep () from
/lib/x86_64-linux-gnu/libc.so.6
#1  0x00007ffff78d525a in sleep () from /lib/x86_64-linux-gnu/libc.so.6


Would you be able to create a test case to reproduce it?  We would need
one to go in with the fix in the end anyway, and it's easier for
reviewers if they can just run a test file rather than try to reproduce
by hand.  You can start by copying an existing solib test, like
gdb.base/solib-display.exp.  See here for more details about tests:

http://sourceware.org/gdb/wiki/TestingGDB
http://sourceware.org/gdb/wiki/GDBTestcaseCookbook

Don't hesitate to ask here or on IRC if you need assistance.

> (gdb)
>
> While the "BFD: reopening ..." messages will still appear interspersed
> in the
> breakpoint output, the current frame info is now displayed:
>
> Thread 3 "foo" hit Breakpoint 1, BFD: reopening
> /tmp/jna-1013829440/jna1875755897659885075.tmp: No such file or
> directory
>
> BFD: reopening /tmp/jna-1013829440/jna1875755897659885075.tmp: No such
> file or directory
>
> warning: Can't read data for section '.eh_frame' in file
> '/tmp/jna-1013829440/jna1875755897659885075.tmp'
> do_something () at file.cpp:80
> 80 {
> (gdb)
> ---
>  gdb/gdb_bfd.c | 12 +++++++++---
>  1 file changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/gdb/gdb_bfd.c b/gdb/gdb_bfd.c
> index 29080b8..229f5ae 100644
> --- a/gdb/gdb_bfd.c
> +++ b/gdb/gdb_bfd.c
> @@ -705,9 +705,15 @@ gdb_bfd_map_section (asection *sectp,
> bfd_size_type *size)
>
>    data = NULL;
>    if (!bfd_get_full_section_contents (abfd, sectp, &data))
> -    error (_("Can't read data for section '%s' in file '%s'"),
> -   bfd_get_section_name (abfd, sectp),
> -   bfd_get_filename (abfd));
> +    {
> +      warning (_("Can't read data for section '%s' in file '%s'"),
> +       bfd_get_section_name (abfd, sectp),
> +       bfd_get_filename (abfd));
> +      /* Section is invalid -- set size to 0 and return NULL */
> +      descriptor->size = 0;
> +      *size = descriptor->size;
> +      return (const gdb_byte *) NULL;
> +    }
>    descriptor->data = data;
>
>   done:

I don't know if it is really this function's responsibility to clear
*size in case of error, or it would be the callers responsibility to
properly check for errors.  But if the function doesn't throw anymore,
the comment in gdb_bfd.h should be updated accordingly.

Thanks,

Simon

Reply | Threaded
Open this post in threaded view
|

Re: [RFC][PATCH] fix gdb segv when objfile can't be opened

Mike Gulick
On 10/19/2017 01:54 PM, Simon Marchi wrote:

> On 2017-10-19 11:59, Mike Gulick wrote:
>> I apologize for the improperly formatted patch -- I'm really struggling
>> to get thunderbird to behave as I want.
>>
>> Here is an updated patch.  I would have sent it with git send-email, but
>> I could not figure out the proper way to add this preface before the
>> patch (without it looking like part of the commit message).
>
> Hi Mike,
>
> Thanks, I was able to apply this version correctly.
>
> If I have a short comment that's not meant to be in the commit message, I usually
> include it in brackets like this:
>
>   [Re-sending this patch because the first try was not formatted correctly.]
>
> If it's longer you can always end it with a line "Actual commit message:".  Either way, it's not really a big deal, as long as it's clear.  You can use the --annotate option of git-send-email to edit the message before sending it.
>
Thanks.

>> ---
>> From 5dee04076518554e4baae864569d6f4faee9b685 Mon Sep 17 00:00:00 2001
>> From: Mike Gulick <[hidden email]>
>> Date: Wed, 18 Oct 2017 16:04:27 -0400
>> Subject: [PATCH] fix gdb segv when objfile can't be opened
>>
>> This fixes PR 16577.
>>
>> This patch changes gdb_bfd_map_section to issue a warning rather than an
>> error if it is unable to read the object file, and sets the size of the
>> section/frame that it attempted to read to 0 on error.
>>
>> The description of gdb_bfd_map_section states that it will try to read
>> or map the contents of the section SECT, and if successful, the section
>> data is returned and *SIZE is set to the size of the section data.  This
>> function was throwing an error and leaving *size as-is.  Setting the
>> section size to 0 indicates to dwarf2_build_frame_info that there is no
>> data to read, otherwise it will try to read from an invalid frame
>> pointer.
>>
>> Changing the error to a warning allows this to be handled gracefully.
>> Additionally, the error was clobbering the breakpoint output indicating
>> the current frame (function name, arguments, source file, and line number).
>> E.g.
>>
>> Thread 3 "foo" hit Breakpoint 1, BFD: reopening
>> /tmp/jna-1013829440/jna2973250704389291330.tmp: No such file or
>> directory
>>
>> BFD: reopening /tmp/jna-1013829440/jna2973250704389291330.tmp: No such
>> file or directory
>
> For some reason, I am not able to reproduce the crash using the instructions in the bug report, and gdb master.
>
> (gdb) up
> #1  0x00007ffff78d525a in sleep () from /lib/x86_64-linux-gnu/libc.so.6
> (gdb)
> BFD: reopening ./badlib.so: No such file or directory
>
> BFD: reopening ./badlib.so: No such file or directory
>
> Can't read data for section '.eh_frame' in file './badlib.so'
> (gdb)
> Initial frame selected; you cannot go up.
> (gdb)
> Initial frame selected; you cannot go up.
> (gdb)
> Initial frame selected; you cannot go up.
> (gdb) bt
> #0  0x00007ffff78d52f0 in nanosleep () from /lib/x86_64-linux-gnu/libc.so.6
> #1  0x00007ffff78d525a in sleep () from /lib/x86_64-linux-gnu/libc.so.6
>
>
> Would you be able to create a test case to reproduce it?  We would need one to go in with the fix in the end anyway, and it's easier for reviewers if they can just run a test file rather than try to reproduce by hand.  You can start by copying an existing solib test, like gdb.base/solib-display.exp.  See here for more details about tests:
>
> http://sourceware.org/gdb/wiki/TestingGDB
> http://sourceware.org/gdb/wiki/GDBTestcaseCookbook
>
> Don't hesitate to ask here or on IRC if you need assistance.
>

I attached a new reproducer in the bug report that reproduces this
problem in the current git master (and more closely follows the problem
I was seeing in our large C++/Java application).  This causes the error
messages to be emitted when the breakpoint is hit, and the stepping
forward triggers the segfault.

I'll take a look at building a test case.

>> (gdb)
>>
>> While the "BFD: reopening ..." messages will still appear interspersed in the
>> breakpoint output, the current frame info is now displayed:
>>
>> Thread 3 "foo" hit Breakpoint 1, BFD: reopening
>> /tmp/jna-1013829440/jna1875755897659885075.tmp: No such file or
>> directory
>>
>> BFD: reopening /tmp/jna-1013829440/jna1875755897659885075.tmp: No such
>> file or directory
>>
>> warning: Can't read data for section '.eh_frame' in file
>> '/tmp/jna-1013829440/jna1875755897659885075.tmp'
>> do_something () at file.cpp:80
>> 80    {
>> (gdb)
>> ---
>>  gdb/gdb_bfd.c | 12 +++++++++---
>>  1 file changed, 9 insertions(+), 3 deletions(-)
>>
>> diff --git a/gdb/gdb_bfd.c b/gdb/gdb_bfd.c
>> index 29080b8..229f5ae 100644
>> --- a/gdb/gdb_bfd.c
>> +++ b/gdb/gdb_bfd.c
>> @@ -705,9 +705,15 @@ gdb_bfd_map_section (asection *sectp, bfd_size_type *size)
>>
>>    data = NULL;
>>    if (!bfd_get_full_section_contents (abfd, sectp, &data))
>> -    error (_("Can't read data for section '%s' in file '%s'"),
>> -       bfd_get_section_name (abfd, sectp),
>> -       bfd_get_filename (abfd));
>> +    {
>> +      warning (_("Can't read data for section '%s' in file '%s'"),
>> +           bfd_get_section_name (abfd, sectp),
>> +           bfd_get_filename (abfd));
>> +      /* Section is invalid -- set size to 0 and return NULL */
>> +      descriptor->size = 0;
>> +      *size = descriptor->size;
>> +      return (const gdb_byte *) NULL;
>> +    }
>>    descriptor->data = data;
>>
>>   done:
>
> I don't know if it is really this function's responsibility to clear *size in case of error, or it would be the callers responsibility to properly check for errors.  But if the function doesn't throw anymore, the comment in gdb_bfd.h should be updated accordingly.

I had trouble figuring out what the 'error' function actually does (I
couldn't find where it was defined).  When I'm stepping through GDB in
the debugger, the lines past 'error' never seem to get called.  It's
like 'error' throws an exception that is caught elsewhere.  I was also
unable to figure out why the error message isn't displayed.  The new
reproducer shows this issue.  I wasn't sure if setting *size or even
descriptor->size was the right thing to do, but it seemed reasonable to
me since the comment in gdb_bfd.h states that this function updates
*size.  There's currently only one caller for 'gdb_bfd_map_section', so
I have no problem updating *size there if that is preferred.

Thanks,
Mike
Reply | Threaded
Open this post in threaded view
|

Re: [RFC][PATCH] fix gdb segv when objfile can't be opened

Simon Marchi
On 2017-10-19 15:39, Mike Gulick wrote:
> I had trouble figuring out what the 'error' function actually does (I
> couldn't find where it was defined).  When I'm stepping through GDB in
> the debugger, the lines past 'error' never seem to get called.  It's
> like 'error' throws an exception that is caught elsewhere.

Indeed, "error" throws an exception.  You should be able at least to
step into the error function (although it's not particularly useful nor
interesting).  It is defined in common/errors.c.

> I was also
> unable to figure out why the error message isn't displayed.  The new
> reproducer shows this issue.  I wasn't sure if setting *size or even
> descriptor->size was the right thing to do, but it seemed reasonable to
> me since the comment in gdb_bfd.h states that this function updates
> *size.  There's currently only one caller for 'gdb_bfd_map_section', so
> I have no problem updating *size there if that is preferred.

Actually it says "If successful, the section data is returned and *SIZE
is set to the size of the section data;".  And this is what I would
generally expect from functions.  Unless stated otherwise, the value of
output parameters should be considered undefined if the function failed.
  So I would lean towards blaming the caller for not taking enough
precautions.  It trusts that gdb_bfd_map_section won't fail.

Simon
Reply | Threaded
Open this post in threaded view
|

Re: [RFC][PATCH] fix gdb segv when objfile can't be opened

Mike Gulick
On 10/19/2017 04:09 PM, Simon Marchi wrote:
> Indeed, "error" throws an exception.  You should be able at least to step into the error function (although it's not particularly useful nor interesting).  It is defined in common/errors.c.
>

Thanks.  I looked at the stack when stopped on 'error', and I see that
there is a TRY/CATCH in 'frame_unwind_try_unwinder'.  That doesn't
handle the exception, and just re-throws it.  It looks like the
exception is ultimately caught in 'print_stack_frame', which explains
why the stack frame is never printed when the exception is thrown
(although it's still unclear why the message passed to 'error' isn't
printed).

Here is the top half of the stack:

#0  gdb_bfd_map_section (sectp=0x31e4eb8, size=0x31ddc58) at gdb_bfd.c:708
#1  0x0000000000628b15 in dwarf2_read_section (objfile=0x31c0be0, info=0x31ddc48) at dwarf2read.c:2553
#2  0x0000000000628e8b in dwarf2_get_section_info (objfile=0x31c0be0, sect=DWARF2_EH_FRAME, sectp=0x31ddf20, bufp=0x31ddf10, sizep=0x31ddf18) at dwarf2read.c:2634
#3  0x0000000000611616 in dwarf2_build_frame_info (objfile=0x31c0be0) at dwarf2-frame.c:2222
#4  0x0000000000610262 in dwarf2_frame_find_fde (pc=0x7ffd1fc37ad0, out_offset=0x0) at dwarf2-frame.c:1707
#5  0x000000000060f874 in dwarf2_frame_sniffer (self=0xa2c480 <dwarf2_frame_unwind>, this_frame=0x2fceb60, this_cache=0x2fceb78) at dwarf2-frame.c:1337
#6  0x00000000006913cc in frame_unwind_try_unwinder (this_frame=0x2fceb60, this_cache=0x2fceb78, unwinder=0xa2c480 <dwarf2_frame_unwind>) at frame-unwind.c:106
#7  0x0000000000691598 in frame_unwind_find_by_frame (this_frame=0x2fceb60, this_cache=0x2fceb78) at frame-unwind.c:164
#8  0x000000000068fe3c in get_frame_type (frame=0x2fceb60) at frame.c:2625
#9  0x000000000077404e in print_frame_info (frame=0x2fceb60, print_level=0, print_what=SRC_AND_LOC, print_args=1, set_current_sal=1) at stack.c:795
#10 0x00000000007729b0 in print_stack_frame (frame=0x2fceb60, print_level=0, print_what=SRC_AND_LOC, set_current_sal=1) at stack.c:177
#11 0x00000000006d3ee5 in print_stop_location (ws=0x7ffd1fc37db0) at infrun.c:8041
#12 0x00000000006d3f5b in print_stop_event (uiout=0x30f1900) at infrun.c:8058
...

>> I was also
>> unable to figure out why the error message isn't displayed.  The new
>> reproducer shows this issue.  I wasn't sure if setting *size or even
>> descriptor->size was the right thing to do, but it seemed reasonable to
>> me since the comment in gdb_bfd.h states that this function updates
>> *size.  There's currently only one caller for 'gdb_bfd_map_section', so
>> I have no problem updating *size there if that is preferred.
>
> Actually it says "If successful, the section data is returned and *SIZE is set to the size of the section data;".  And this is what I would generally expect from functions.  Unless stated otherwise, the value of output parameters should be considered undefined if the function failed.  So I would lean towards blaming the caller for not taking enough precautions.  It trusts that gdb_bfd_map_section won't fail.
>

I'm not sure how the gdb_bfd_map_section caller can pre-determine that
it will fail.  It looks like there may be situations where
gdb_bfd_map_section doesn't actually need to read the file, so that
would mean that simply checking if the object file is readable before
calling gdb_bfd_map_section might not be a good way to address this.
Outside of that, I don't see what pre-conditions I can check to
determine if gdb_bfd_map_section is going to fail in this case.  Do you
have any suggestions?

To add a little more info, the segfault happens after trying to run
'next' in the debugger after this error is thrown.  Here is a snippet of
the stack:

#0  0x000000000083b3e1 in bfd_getl32 (p=0x0) at libbfd.c:557
#1  0x000000000060fb3e in read_initial_length (abfd=0x31805e0, buf=0x0, bytes_read_ptr=0x7ffd1fc3797c) at dwarf2-frame.c:1482
#2  0x0000000000610592 in decode_frame_entry_1 (unit=0x31ddf40, start=0x0, eh_frame_p=1, cie_table=0x7ffd1fc37b00, fde_table=0x7ffd1fc37af0, entry_type=EH_CIE_OR_FDE_TYPE_ID) at dwarf2-frame.c:1792
#3  0x00000000006111b5 in decode_frame_entry (unit=0x31ddf40, start=0x0, eh_frame_p=1, cie_table=0x7ffd1fc37b00, fde_table=0x7ffd1fc37af0, entry_type=EH_CIE_OR_FDE_TYPE_ID) at dwarf2-frame.c:2090
#4  0x00000000006116d1 in dwarf2_build_frame_info (objfile=0x31c0be0) at dwarf2-frame.c:2247
#5  0x0000000000610262 in dwarf2_frame_find_fde (pc=0x7ffd1fc37d30, out_offset=0x2fcec58) at dwarf2-frame.c:1707
#6  0x000000000060ead3 in dwarf2_frame_cache (this_frame=0x2fceb60, this_cache=0x2fceb78) at dwarf2-frame.c:1000
#7  0x000000000060f2d7 in dwarf2_frame_this_id (this_frame=0x2fceb60, this_cache=0x2fceb78, this_id=0x2fcebc0) at dwarf2-frame.c:1198
#8  0x000000000068baab in compute_frame_id (fi=0x2fceb60) at frame.c:505
#9  0x000000000068bbf7 in get_frame_id (fi=0x2fceb60) at frame.c:537
#10 0x00000000006bee4e in step_command_fsm_prepare (sm=0x3175e10, skip_subroutines=1, single_inst=0, count=1, thread=0x3162ac0) at infcmd.c:1056
#11 0x00000000006bef73 in step_1 (skip_subroutines=1, single_inst=0, count_string=0x0) at infcmd.c:1096
#12 0x00000000006bed3f in next_command (count_string=0x0, from_tty=1) at infcmd.c:969
...

dwarf2_build_frame_info first checks if unit->dwarf_frame_size is
non-zero (it is), then it calls decode_frame_entry on
unit->dwarf_frame_buffer.  unit->dwarf_frame_buffer is null, which
triggers the segfault.  Adding checks in dwarf2_build_frame_info to
check that dwarf_frame_buffer is non-zero (which needs to be done in two
places) also prevents the crash.  However, this doesn't address the
issue that the stack frame info isn't printed when the initial
breakpoint is hit (because gdb_bfd_map_section throws an error).  It
would be nice if this could be gracefully handled so that the current
frame and source code line are printed instead of just dumping you back
to a (gdb) prompt without knowing where you are.  Since my patch changes
the error to a warning, and updates the size pointer, it fixes both of
these issues.

I can imagine two alternatives:

- Change the error to a warning, and make the caller of
  gdb_bfd_map_section check if the returned pointer is NULL, and if so
  set *size to 0.  This requires the ugly-ish early return that I had to
  add in gdb_bfd_map_section.
 
- Add a TRY/CATCH around dwarf2_read_section in dwarf2_get_section_info,
  that sets *sectp, *bufp, and *sizep to NULL.  I'm not sure if I should
  then re-throw the exception.

Thanks,
Mike
Reply | Threaded
Open this post in threaded view
|

Re: [RFC][PATCH] fix gdb segv when objfile can't be opened

Mike Gulick
In reply to this post by Simon Marchi
Here is a new version of the patch.  It changes the following from the original
patch:

1) It adds a testsuite which locks down:
  a) The stack frame is printed when the breakpoint is hit
  b) 'next' runs without causing a segv
2) No longer set descriptor->size in gdb_bfd_map_section when an error is hit.
   It doesn't look like this is needed as gdb_bfd_map_section is the only place
   that consumes that descriptor->size field.
3) Update the description in gdb_bfd.h to indicate that if there is an error
   mapping the section, a warning is generated, *SIZE is set to 0, and NULL is
   returned.

I considered catching the error in the caller of gdb_bfd_map_section, which is
dwarf2_read_section in dwarf2read.c.  It didn't seem to make any more sense to
catch the error here than it would directly in gdb_bfd_map_section.  If
anything, it would make more sense to catch the error in the caller of
dwarf2_read_section, which in this case is dwarf2_get_section_info.  However
there are many callers of dwarf2_read_section.  It doesn't make sense to force
all of them to add a TRY/CATCH around calling dwarf2_read_section.

Please let me know thoughts/feedback.

Also see the note in the test case about not being able to make 'shlib=...'
work when compiling a shared library that links against another shared
library.  The workaround I used is kind of ugly.

Thanks,
Mike
---

From d59bb69db0f10f8cb7d91568907b505f05424df7 Mon Sep 17 00:00:00 2001
From: Mike Gulick <[hidden email]>
Date: Wed, 18 Oct 2017 16:04:27 -0400
Subject: [PATCH v2] fix gdb segv when objfile can't be opened

This fixes PR 16577.

This patch changes gdb_bfd_map_section to issue a warning rather than an
error if it is unable to read the object file, and sets the size of the
section/frame that it attempted to read to 0 on error.

The description of gdb_bfd_map_section states that it will try to read
or map the contents of the section SECT, and if successful, the section
data is returned and *SIZE is set to the size of the section data.  This
function was throwing an error and leaving *SIZE as-is.  Setting the
section size to 0 indicates to dwarf2_build_frame_info that there is no
data to read, otherwise it will try to read from an invalid frame
pointer.

Changing the error to a warning allows this to be handled gracefully.
Additionally, the error was clobbering the breakpoint output indicating
the current frame (function name, arguments, source file, and line number).
E.g.

Thread 3 "foo" hit Breakpoint 1, BFD: reopening /tmp/jna-1013829440/jna2973250704389291330.tmp: No such file or directory

BFD: reopening /tmp/jna-1013829440/jna2973250704389291330.tmp: No such file or directory

(gdb)

While the "BFD: reopening ..." messages will still appear interspersed in the
breakpoint output, the current frame info is now displayed:

Thread 3 "foo" hit Breakpoint 1, BFD: reopening /tmp/jna-1013829440/jna1875755897659885075.tmp: No such file or directory

BFD: reopening /tmp/jna-1013829440/jna1875755897659885075.tmp: No such file or directory

warning: Can't read data for section '.eh_frame' in file '/tmp/jna-1013829440/jna1875755897659885075.tmp'
do_something () at file.cpp:80
80 {
(gdb)
---
 gdb/gdb_bfd.c                              |  11 ++-
 gdb/gdb_bfd.h                              |  17 +++--
 gdb/testsuite/gdb.base/solib-vanish-lib1.c |  22 ++++++
 gdb/testsuite/gdb.base/solib-vanish-lib2.c |  20 ++++++
 gdb/testsuite/gdb.base/solib-vanish-main.c |  62 +++++++++++++++++
 gdb/testsuite/gdb.base/solib-vanish.exp    | 108 +++++++++++++++++++++++++++++
 6 files changed, 228 insertions(+), 12 deletions(-)
 create mode 100644 gdb/testsuite/gdb.base/solib-vanish-lib1.c
 create mode 100644 gdb/testsuite/gdb.base/solib-vanish-lib2.c
 create mode 100644 gdb/testsuite/gdb.base/solib-vanish-main.c
 create mode 100644 gdb/testsuite/gdb.base/solib-vanish.exp

diff --git a/gdb/gdb_bfd.c b/gdb/gdb_bfd.c
index cc02740..b33de07 100644
--- a/gdb/gdb_bfd.c
+++ b/gdb/gdb_bfd.c
@@ -702,9 +702,14 @@ gdb_bfd_map_section (asection *sectp, bfd_size_type *size)
 
   data = NULL;
   if (!bfd_get_full_section_contents (abfd, sectp, &data))
-    error (_("Can't read data for section '%s' in file '%s'"),
-   bfd_get_section_name (abfd, sectp),
-   bfd_get_filename (abfd));
+    {
+      warning (_("Can't read data for section '%s' in file '%s'"),
+       bfd_get_section_name (abfd, sectp),
+       bfd_get_filename (abfd));
+      /* Section is invalid -- set size to 0 and return NULL */
+      *size = 0;
+      return (const gdb_byte *) NULL;
+    }
   descriptor->data = data;
 
  done:
diff --git a/gdb/gdb_bfd.h b/gdb/gdb_bfd.h
index b1ff857..e427781 100644
--- a/gdb/gdb_bfd.h
+++ b/gdb/gdb_bfd.h
@@ -115,15 +115,14 @@ void gdb_bfd_mark_parent (bfd *child, bfd *parent);
 
 void gdb_bfd_record_inclusion (bfd *includer, bfd *includee);
 
-/* Try to read or map the contents of the section SECT.  If
-   successful, the section data is returned and *SIZE is set to the
-   size of the section data; this may not be the same as the size
-   according to bfd_get_section_size if the section was compressed.
-   The returned section data is associated with the BFD and will be
-   destroyed when the BFD is destroyed.  There is no other way to free
-   it; for temporary uses of section data, see
-   bfd_malloc_and_get_section.  SECT may not have relocations.  This
-   function will throw on error.  */
+/* Try to read or map the contents of the section SECT.  If successful, the
+   section data is returned and *SIZE is set to the size of the section data;
+   this may not be the same as the size according to bfd_get_section_size if the
+   section was compressed.  The returned section data is associated with the BFD
+   and will be destroyed when the BFD is destroyed.  There is no other way to
+   free it; for temporary uses of section data, see bfd_malloc_and_get_section.
+   SECT may not have relocations.  If there is an error reading the section,
+   this issues a warning, sets *SIZE to 0, and returns NULL.  */
 
 const gdb_byte *gdb_bfd_map_section (asection *section, bfd_size_type *size);
 
diff --git a/gdb/testsuite/gdb.base/solib-vanish-lib1.c b/gdb/testsuite/gdb.base/solib-vanish-lib1.c
new file mode 100644
index 0000000..f8e8182
--- /dev/null
+++ b/gdb/testsuite/gdb.base/solib-vanish-lib1.c
@@ -0,0 +1,22 @@
+/* This testcase is part of GDB, the GNU debugger.
+
+   Copyright 2017 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+extern int bar (int);
+
+int foo (int x) {
+  return bar (x);
+}
diff --git a/gdb/testsuite/gdb.base/solib-vanish-lib2.c b/gdb/testsuite/gdb.base/solib-vanish-lib2.c
new file mode 100644
index 0000000..7799d92
--- /dev/null
+++ b/gdb/testsuite/gdb.base/solib-vanish-lib2.c
@@ -0,0 +1,20 @@
+/* This testcase is part of GDB, the GNU debugger.
+
+   Copyright 2017 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+int bar (int y) {
+  return y + 1; /* break here */
+}
diff --git a/gdb/testsuite/gdb.base/solib-vanish-main.c b/gdb/testsuite/gdb.base/solib-vanish-main.c
new file mode 100644
index 0000000..0e21e9c
--- /dev/null
+++ b/gdb/testsuite/gdb.base/solib-vanish-main.c
@@ -0,0 +1,62 @@
+/* This testcase is part of GDB, the GNU debugger.
+
+   Copyright 2017 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include <stdio.h>
+#include <dlfcn.h>
+#include <stdlib.h>
+
+#ifndef VANISH_LIB
+#define VANISH_LIB "./solib-vanish-lib1.so"
+#endif
+
+int main()
+{
+  void *handle;
+  int (*foo)(int);
+  char *error;
+
+  /* Open library */
+  handle = dlopen(VANISH_LIB, RTLD_NOW);
+  if (!handle) {
+    fprintf(stderr, "%s\n", dlerror());
+    exit(EXIT_FAILURE);
+  }
+
+  /* Clear any existing error */
+  dlerror();
+
+  /* Remove library */
+  if (remove(VANISH_LIB) == -1) {
+    perror("remove " VANISH_LIB);
+    exit(EXIT_FAILURE);
+  }
+
+  /* Get function pointer */
+  foo = dlsym(handle, "foo");
+  error = dlerror();
+  if (error != NULL) {
+    fprintf(stderr, "%s\n", error);
+    exit(EXIT_FAILURE);
+  }
+
+  /* Call function */
+  (*foo)(1);
+
+  /* Close and exit */
+  dlclose(handle);
+  exit(EXIT_SUCCESS);
+}
diff --git a/gdb/testsuite/gdb.base/solib-vanish.exp b/gdb/testsuite/gdb.base/solib-vanish.exp
new file mode 100644
index 0000000..de92d8d
--- /dev/null
+++ b/gdb/testsuite/gdb.base/solib-vanish.exp
@@ -0,0 +1,108 @@
+# Copyright 2017 Free Software Foundation, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+
+# This test case verifies that GDB gracefully handles a shared library file
+# vanishing after being dlopen'ed.  This consists of three nested function calls:
+#
+# main() -> foo() -> bar()
+#
+# where:
+# - foo exists in solib-vanish-lib1.so, which is dlopen'ed by main()
+# - bar exists in solib-vanish-lib2.so, which is dynamically linked into
+#   solib-vanish-lib1.so
+#
+# Immediately after dlopen'ing solib-vanish-lib1.so, the so file is deleted.
+# The main executable and solib-vanish-lib2.so are still accessible.
+#
+# If a breakpoint is set on bar(), gdb throws an error when this breakpoint is
+# hit:
+#
+#   (gdb) r
+#   Starting program: /local/gdb/git/pr_16577_repro/simple/solib-vanish-main
+#
+#   Breakpoint 1, BFD: reopening ./solib-vanish-lib1.so: No such file or directory
+#
+#   BFD: reopening ./solib-vanish-lib1.so: No such file or directory
+#
+#   (gdb)
+#
+# Notice that this does not print the current frame, i.e.:
+#   bar (y=1) at solib-vanish-lib2.c:19
+#   19  return y + 1; /* break here */
+#   (gdb)
+#
+# The current gdb git tip segfaults if we then try to step:
+#   (gdb) n
+#   Segmentation fault
+
+# This test verifies that:
+# 1) GDB does not segfault when stepping
+# 2) The stack frame is printed
+
+if { [skip_shlib_tests] } {
+    return 0
+}
+
+# Library 2
+set lib2name "solib-vanish-lib2"
+set srcfile_lib2 ${srcdir}/${subdir}/${lib2name}.c
+set binfile_lib2 [standard_output_file ${lib2name}.so]
+set lib2_flags {debug}
+
+# Library 1
+set lib1name "solib-vanish-lib1"
+set srcfile_lib1 ${srcdir}/${subdir}/${lib1name}.c
+set binfile_lib1 [standard_output_file ${lib1name}.so]
+# I can't make this work for some reason:
+#set lib1_flags [list debug shlib=${binfile_lib2}]
+set lib1_flags [list debug ldflags=-l:${lib2name}.so ldflags=-Wl,-rpath=[standard_output_file ""] libdir=[standard_output_file ""]]
+
+# Main program
+set testfile "solib-vanish-main"
+set srcfile ${srcdir}/${subdir}/${testfile}.c
+set executable ${testfile}
+set binfile [standard_output_file ${executable}]
+set bin_flags [list debug shlib_load additional_flags=-DVANISH_LIB=\"${binfile_lib1}\"]
+
+if { [gdb_compile_shlib ${srcfile_lib2} ${binfile_lib2} $lib2_flags] != ""
+     || [gdb_compile_shlib ${srcfile_lib1} ${binfile_lib1} $lib1_flags] != ""
+     || [gdb_compile ${srcfile} ${binfile} executable $bin_flags] != "" } {
+    untested "failed to compile"
+    return -1
+}
+
+clean_restart $testfile
+
+if { ![runto_main] } {
+    fail "can't run to main"
+    return
+}
+
+delete_breakpoints
+
+set lib2_lineno [gdb_get_line_number "break here" ${srcfile_lib2}]
+
+gdb_breakpoint "${lib2name}.c:${lib2_lineno}" {allow-pending}
+
+gdb_test "continue" \
+    "Breakpoint .*at .*/${lib2name}.c:${lib2_lineno}.*break here.*" \
+    "breakpoint prints location"
+
+# This should not segfault
+gdb_test "next" \
+    "" \
+    "next succeeds"
+
--
2.1.4

Reply | Threaded
Open this post in threaded view
|

Re: [RFC][PATCH] fix gdb segv when objfile can't be opened

Simon Marchi-2
On 2017-10-23 07:19 PM, Mike Gulick wrote:

> Here is a new version of the patch.  It changes the following from the original
> patch:
>
> 1) It adds a testsuite which locks down:
>   a) The stack frame is printed when the breakpoint is hit
>   b) 'next' runs without causing a segv
> 2) No longer set descriptor->size in gdb_bfd_map_section when an error is hit.
>    It doesn't look like this is needed as gdb_bfd_map_section is the only place
>    that consumes that descriptor->size field.
> 3) Update the description in gdb_bfd.h to indicate that if there is an error
>    mapping the section, a warning is generated, *SIZE is set to 0, and NULL is
>    returned.
>
> I considered catching the error in the caller of gdb_bfd_map_section, which is
> dwarf2_read_section in dwarf2read.c.  It didn't seem to make any more sense to
> catch the error here than it would directly in gdb_bfd_map_section.  If
> anything, it would make more sense to catch the error in the caller of
> dwarf2_read_section, which in this case is dwarf2_get_section_info.  However
> there are many callers of dwarf2_read_section.  It doesn't make sense to force
> all of them to add a TRY/CATCH around calling dwarf2_read_section.
>
> Please let me know thoughts/feedback.
>
> Also see the note in the test case about not being able to make 'shlib=...'
> work when compiling a shared library that links against another shared
> library.  The workaround I used is kind of ugly.

Hi Mike,

I started looking at your patch and thought I would take a look at this problem.
The problem seems to be that gdb_compile_shlib builds the shared library in two
steps:

  .c -> .o -> .so

because it's not possible to do it in one pass.  As of today, if you pass an
shlib= option to gdb_compile_shlib, it will be passed to both compilation steps,
which means Dejagnu will pass the path to lib1.so while compiling the .c to a .o,
which makes gcc complain like this:

  warning: .../solib-vanish-lib2.so: linker input file unused because linking not done

I think the solution would be not to pass shlib= for this step, only for the final
linking.  I'll try to write a patch about this.

Simon
Reply | Threaded
Open this post in threaded view
|

Re: [RFC][PATCH] fix gdb segv when objfile can't be opened

Simon Marchi-2
In reply to this post by Mike Gulick
On 2017-10-23 07:19 PM, Mike Gulick wrote:

> Here is a new version of the patch.  It changes the following from the original
> patch:
>
> 1) It adds a testsuite which locks down:
>   a) The stack frame is printed when the breakpoint is hit
>   b) 'next' runs without causing a segv
> 2) No longer set descriptor->size in gdb_bfd_map_section when an error is hit.
>    It doesn't look like this is needed as gdb_bfd_map_section is the only place
>    that consumes that descriptor->size field.
> 3) Update the description in gdb_bfd.h to indicate that if there is an error
>    mapping the section, a warning is generated, *SIZE is set to 0, and NULL is
>    returned.
>
> I considered catching the error in the caller of gdb_bfd_map_section, which is
> dwarf2_read_section in dwarf2read.c.  It didn't seem to make any more sense to
> catch the error here than it would directly in gdb_bfd_map_section.  If
> anything, it would make more sense to catch the error in the caller of
> dwarf2_read_section, which in this case is dwarf2_get_section_info.  However
> there are many callers of dwarf2_read_section.  It doesn't make sense to force
> all of them to add a TRY/CATCH around calling dwarf2_read_section.
>
> Please let me know thoughts/feedback.
>
> Also see the note in the test case about not being able to make 'shlib=...'
> work when compiling a shared library that links against another shared
> library.  The workaround I used is kind of ugly.

Thanks a lot.  Could you also write the corresponding ChangeLog entries?  See
here for details: http://sourceware.org/gdb/wiki/ContributionChecklist
Note that we usually put that as part of the commit log, and only move it
to the actual ChangeLog files before pushing to git.  The changes to the
source files will go in gdb/ChangeLog and the changes to the testsuite in
gdb/testsuite/ChangeLog.

You can also put the PR number as part of the ChangeLog (see wiki and
previous CL entries for examples).

I don't think you have a copyright assignment for GDB yet?  Since this is
more than a few lines, you will need one if you want to contribute (I'll
contact you off-list for that).

I noted a a few comments below (mostly minor/formatting stuff, overall I
think this is a good quality submission).

> Thanks,
> Mike
> ---
>
> From d59bb69db0f10f8cb7d91568907b505f05424df7 Mon Sep 17 00:00:00 2001
> From: Mike Gulick <[hidden email]>
> Date: Wed, 18 Oct 2017 16:04:27 -0400
> Subject: [PATCH v2] fix gdb segv when objfile can't be opened
>
> This fixes PR 16577.
>
> This patch changes gdb_bfd_map_section to issue a warning rather than an
> error if it is unable to read the object file, and sets the size of the
> section/frame that it attempted to read to 0 on error.
>
> The description of gdb_bfd_map_section states that it will try to read
> or map the contents of the section SECT, and if successful, the section
> data is returned and *SIZE is set to the size of the section data.  This
> function was throwing an error and leaving *SIZE as-is.  Setting the
> section size to 0 indicates to dwarf2_build_frame_info that there is no
> data to read, otherwise it will try to read from an invalid frame
> pointer.
>
> Changing the error to a warning allows this to be handled gracefully.
> Additionally, the error was clobbering the breakpoint output indicating
> the current frame (function name, arguments, source file, and line number).
> E.g.
>
> Thread 3 "foo" hit Breakpoint 1, BFD: reopening /tmp/jna-1013829440/jna2973250704389291330.tmp: No such file or directory
>
> BFD: reopening /tmp/jna-1013829440/jna2973250704389291330.tmp: No such file or directory
>
> (gdb)
>
> While the "BFD: reopening ..." messages will still appear interspersed in the
> breakpoint output, the current frame info is now displayed:
>
> Thread 3 "foo" hit Breakpoint 1, BFD: reopening /tmp/jna-1013829440/jna1875755897659885075.tmp: No such file or directory
>
> BFD: reopening /tmp/jna-1013829440/jna1875755897659885075.tmp: No such file or directory
>
> warning: Can't read data for section '.eh_frame' in file '/tmp/jna-1013829440/jna1875755897659885075.tmp'
> do_something () at file.cpp:80
> 80 {
> (gdb)
> ---
>  gdb/gdb_bfd.c                              |  11 ++-
>  gdb/gdb_bfd.h                              |  17 +++--
>  gdb/testsuite/gdb.base/solib-vanish-lib1.c |  22 ++++++
>  gdb/testsuite/gdb.base/solib-vanish-lib2.c |  20 ++++++
>  gdb/testsuite/gdb.base/solib-vanish-main.c |  62 +++++++++++++++++
>  gdb/testsuite/gdb.base/solib-vanish.exp    | 108 +++++++++++++++++++++++++++++
>  6 files changed, 228 insertions(+), 12 deletions(-)
>  create mode 100644 gdb/testsuite/gdb.base/solib-vanish-lib1.c
>  create mode 100644 gdb/testsuite/gdb.base/solib-vanish-lib2.c
>  create mode 100644 gdb/testsuite/gdb.base/solib-vanish-main.c
>  create mode 100644 gdb/testsuite/gdb.base/solib-vanish.exp
>
> diff --git a/gdb/gdb_bfd.c b/gdb/gdb_bfd.c
> index cc02740..b33de07 100644
> --- a/gdb/gdb_bfd.c
> +++ b/gdb/gdb_bfd.c
> @@ -702,9 +702,14 @@ gdb_bfd_map_section (asection *sectp, bfd_size_type *size)
>  
>    data = NULL;
>    if (!bfd_get_full_section_contents (abfd, sectp, &data))
> -    error (_("Can't read data for section '%s' in file '%s'"),
> -   bfd_get_section_name (abfd, sectp),
> -   bfd_get_filename (abfd));
> +    {
> +      warning (_("Can't read data for section '%s' in file '%s'"),
> +       bfd_get_section_name (abfd, sectp),
> +       bfd_get_filename (abfd));
> +      /* Section is invalid -- set size to 0 and return NULL */

Finish the comment with a dot and two spaces.

> +      *size = 0;
> +      return (const gdb_byte *) NULL;
> +    }
>    descriptor->data = data;
>  
>   done:
> diff --git a/gdb/gdb_bfd.h b/gdb/gdb_bfd.h
> index b1ff857..e427781 100644
> --- a/gdb/gdb_bfd.h
> +++ b/gdb/gdb_bfd.h
> @@ -115,15 +115,14 @@ void gdb_bfd_mark_parent (bfd *child, bfd *parent);
>  
>  void gdb_bfd_record_inclusion (bfd *includer, bfd *includee);
>  
> -/* Try to read or map the contents of the section SECT.  If
> -   successful, the section data is returned and *SIZE is set to the
> -   size of the section data; this may not be the same as the size
> -   according to bfd_get_section_size if the section was compressed.
> -   The returned section data is associated with the BFD and will be
> -   destroyed when the BFD is destroyed.  There is no other way to free
> -   it; for temporary uses of section data, see
> -   bfd_malloc_and_get_section.  SECT may not have relocations.  This
> -   function will throw on error.  */
> +/* Try to read or map the contents of the section SECT.  If successful, the
> +   section data is returned and *SIZE is set to the size of the section data;
> +   this may not be the same as the size according to bfd_get_section_size if the
> +   section was compressed.  The returned section data is associated with the BFD
> +   and will be destroyed when the BFD is destroyed.  There is no other way to
> +   free it; for temporary uses of section data, see bfd_malloc_and_get_section.
> +   SECT may not have relocations.  If there is an error reading the section,
> +   this issues a warning, sets *SIZE to 0, and returns NULL.  */
>  
>  const gdb_byte *gdb_bfd_map_section (asection *section, bfd_size_type *size);
>  
> diff --git a/gdb/testsuite/gdb.base/solib-vanish-lib1.c b/gdb/testsuite/gdb.base/solib-vanish-lib1.c
> new file mode 100644
> index 0000000..f8e8182
> --- /dev/null
> +++ b/gdb/testsuite/gdb.base/solib-vanish-lib1.c
> @@ -0,0 +1,22 @@
> +/* This testcase is part of GDB, the GNU debugger.
> +
> +   Copyright 2017 Free Software Foundation, Inc.
> +
> +   This program is free software; you can redistribute it and/or modify
> +   it under the terms of the GNU General Public License as published by
> +   the Free Software Foundation; either version 3 of the License, or
> +   (at your option) any later version.
> +
> +   This program is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +   GNU General Public License for more details.
> +
> +   You should have received a copy of the GNU General Public License
> +   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
> +
> +extern int bar (int);
> +
> +int foo (int x) {
> +  return bar (x);
> +}

Unless the purpose of the test is about formatting, we try to follow GNU
style (same as for source code) in tests.  For example, here it would be:

int
foo (int x)
{
  return bar (x);
}

> diff --git a/gdb/testsuite/gdb.base/solib-vanish-lib2.c b/gdb/testsuite/gdb.base/solib-vanish-lib2.c
> new file mode 100644
> index 0000000..7799d92
> --- /dev/null
> +++ b/gdb/testsuite/gdb.base/solib-vanish-lib2.c
> @@ -0,0 +1,20 @@
> +/* This testcase is part of GDB, the GNU debugger.
> +
> +   Copyright 2017 Free Software Foundation, Inc.
> +
> +   This program is free software; you can redistribute it and/or modify
> +   it under the terms of the GNU General Public License as published by
> +   the Free Software Foundation; either version 3 of the License, or
> +   (at your option) any later version.
> +
> +   This program is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +   GNU General Public License for more details.
> +
> +   You should have received a copy of the GNU General Public License
> +   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
> +
> +int bar (int y) {
> +  return y + 1; /* break here */
> +}

Same here.

> diff --git a/gdb/testsuite/gdb.base/solib-vanish-main.c b/gdb/testsuite/gdb.base/solib-vanish-main.c
> new file mode 100644
> index 0000000..0e21e9c
> --- /dev/null
> +++ b/gdb/testsuite/gdb.base/solib-vanish-main.c
> @@ -0,0 +1,62 @@
> +/* This testcase is part of GDB, the GNU debugger.
> +
> +   Copyright 2017 Free Software Foundation, Inc.
> +
> +   This program is free software; you can redistribute it and/or modify
> +   it under the terms of the GNU General Public License as published by
> +   the Free Software Foundation; either version 3 of the License, or
> +   (at your option) any later version.
> +
> +   This program is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +   GNU General Public License for more details.
> +
> +   You should have received a copy of the GNU General Public License
> +   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
> +
> +#include <stdio.h>
> +#include <dlfcn.h>
> +#include <stdlib.h>
> +
> +#ifndef VANISH_LIB
> +#define VANISH_LIB "./solib-vanish-lib1.so"
> +#endif

I think we can remove this, it could only obfuscate a problem
if we remove the -DVANISH_LIB in the .exp file.

> +
> +int main()

Put "int" on its own line.

> +{
> +  void *handle;
> +  int (*foo)(int);
> +  char *error;
> +
> +  /* Open library */
> +  handle = dlopen(VANISH_LIB, RTLD_NOW);
> +  if (!handle) {
> +    fprintf(stderr, "%s\n", dlerror());

Please make this file follow GNU style as well (space before parenthesis,
curly braces position, etc).

> +    exit(EXIT_FAILURE);
> +  }
> +
> +  /* Clear any existing error */
> +  dlerror();
> +
> +  /* Remove library */
> +  if (remove(VANISH_LIB) == -1) {

Can we rename the lib instead of deleting it?  It's always preferable to keep
test artifacts in case we need to inspect them.

> +    perror("remove " VANISH_LIB);
> +    exit(EXIT_FAILURE);
> +  }
> +
> +  /* Get function pointer */
> +  foo = dlsym(handle, "foo");
> +  error = dlerror();
> +  if (error != NULL) {
> +    fprintf(stderr, "%s\n", error);
> +    exit(EXIT_FAILURE);
> +  }
> +
> +  /* Call function */
> +  (*foo)(1);
> +
> +  /* Close and exit */
> +  dlclose(handle);
> +  exit(EXIT_SUCCESS);
> +}
> diff --git a/gdb/testsuite/gdb.base/solib-vanish.exp b/gdb/testsuite/gdb.base/solib-vanish.exp
> new file mode 100644
> index 0000000..de92d8d
> --- /dev/null
> +++ b/gdb/testsuite/gdb.base/solib-vanish.exp
> @@ -0,0 +1,108 @@
> +# Copyright 2017 Free Software Foundation, Inc.
> +#
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 3 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program.  If not, see <http://www.gnu.org/licenses/>.
> +#
> +
> +# This test case verifies that GDB gracefully handles a shared library file
> +# vanishing after being dlopen'ed.  This consists of three nested function calls:
> +#
> +# main() -> foo() -> bar()
> +#
> +# where:
> +# - foo exists in solib-vanish-lib1.so, which is dlopen'ed by main()
> +# - bar exists in solib-vanish-lib2.so, which is dynamically linked into
> +#   solib-vanish-lib1.so
> +#
> +# Immediately after dlopen'ing solib-vanish-lib1.so, the so file is deleted.
> +# The main executable and solib-vanish-lib2.so are still accessible.
> +#
> +# If a breakpoint is set on bar(), gdb throws an error when this breakpoint is
> +# hit:
> +#
> +#   (gdb) r
> +#   Starting program: /local/gdb/git/pr_16577_repro/simple/solib-vanish-main
> +#
> +#   Breakpoint 1, BFD: reopening ./solib-vanish-lib1.so: No such file or directory
> +#
> +#   BFD: reopening ./solib-vanish-lib1.so: No such file or directory
> +#
> +#   (gdb)
> +#
> +# Notice that this does not print the current frame, i.e.:
> +#   bar (y=1) at solib-vanish-lib2.c:19
> +#   19  return y + 1; /* break here */
> +#   (gdb)
> +#
> +# The current gdb git tip segfaults if we then try to step:
> +#   (gdb) n
> +#   Segmentation fault

Thanks for documenting the test case, I always like when test cases explain what they do!

> +# This test verifies that:
> +# 1) GDB does not segfault when stepping
> +# 2) The stack frame is printed
> +
> +if { [skip_shlib_tests] } {
> +    return 0
> +}
> +
> +# Library 2
> +set lib2name "solib-vanish-lib2"
> +set srcfile_lib2 ${srcdir}/${subdir}/${lib2name}.c
> +set binfile_lib2 [standard_output_file ${lib2name}.so]
> +set lib2_flags {debug}
> +
> +# Library 1
> +set lib1name "solib-vanish-lib1"
> +set srcfile_lib1 ${srcdir}/${subdir}/${lib1name}.c
> +set binfile_lib1 [standard_output_file ${lib1name}.so]
> +# I can't make this work for some reason:
> +#set lib1_flags [list debug shlib=${binfile_lib2}]
> +set lib1_flags [list debug ldflags=-l:${lib2name}.so ldflags=-Wl,-rpath=[standard_output_file ""] libdir=[standard_output_file ""]]

With

  https://sourceware.org/ml/gdb-patches/2017-10/msg00856.html

the commented out line should work.

> +# Main program
> +set testfile "solib-vanish-main"
> +set srcfile ${srcdir}/${subdir}/${testfile}.c
> +set executable ${testfile}
> +set binfile [standard_output_file ${executable}]
> +set bin_flags [list debug shlib_load additional_flags=-DVANISH_LIB=\"${binfile_lib1}\"]
> +
> +if { [gdb_compile_shlib ${srcfile_lib2} ${binfile_lib2} $lib2_flags] != ""
> +     || [gdb_compile_shlib ${srcfile_lib1} ${binfile_lib1} $lib1_flags] != ""
> +     || [gdb_compile ${srcfile} ${binfile} executable $bin_flags] != "" } {
> +    untested "failed to compile"
> +    return -1
> +}
> +
> +clean_restart $testfile
> +
> +if { ![runto_main] } {
> +    fail "can't run to main"
> +    return
> +}
> +
> +delete_breakpoints
> +
> +set lib2_lineno [gdb_get_line_number "break here" ${srcfile_lib2}]
> +
> +gdb_breakpoint "${lib2name}.c:${lib2_lineno}" {allow-pending}
> +
> +gdb_test "continue" \
> +    "Breakpoint .*at .*/${lib2name}.c:${lib2_lineno}.*break here.*" \
> +    "breakpoint prints location"

I think you could use the "gdb_continue_to_breakpoint" proc.

> +
> +# This should not segfault
> +gdb_test "next" \
> +    "" \
> +    "next succeeds"
> +
>

Thanks,

Simon
Reply | Threaded
Open this post in threaded view
|

Re: [RFC][PATCH] fix gdb segv when objfile can't be opened

Mike Gulick
On 10/27/2017 09:17 PM, Simon Marchi wrote:

> On 2017-10-23 07:19 PM, Mike Gulick wrote:
>
> ...
>
> Thanks a lot.  Could you also write the corresponding ChangeLog entries?  See
> here for details: http://sourceware.org/gdb/wiki/ContributionChecklist
> Note that we usually put that as part of the commit log, and only move it
> to the actual ChangeLog files before pushing to git.  The changes to the
> source files will go in gdb/ChangeLog and the changes to the testsuite in
> gdb/testsuite/ChangeLog.
>
> You can also put the PR number as part of the ChangeLog (see wiki and
> previous CL entries for examples).
>
Done.
 
> I don't think you have a copyright assignment for GDB yet?  Since this is
> more than a few lines, you will need one if you want to contribute (I'll
> contact you off-list for that).
>
I'm working on this.  Hopefully it will be sooner rather than later.

> I noted a a few comments below (mostly minor/formatting stuff, overall I
> think this is a good quality submission).
> ...
>>
>> diff --git a/gdb/gdb_bfd.c b/gdb/gdb_bfd.c
>> index cc02740..b33de07 100644
>> --- a/gdb/gdb_bfd.c
>> +++ b/gdb/gdb_bfd.c
>> @@ -702,9 +702,14 @@ gdb_bfd_map_section (asection *sectp, bfd_size_type *size)
>>  
>>    data = NULL;
>>    if (!bfd_get_full_section_contents (abfd, sectp, &data))
>> -    error (_("Can't read data for section '%s' in file '%s'"),
>> -   bfd_get_section_name (abfd, sectp),
>> -   bfd_get_filename (abfd));
>> +    {
>> +      warning (_("Can't read data for section '%s' in file '%s'"),
>> +       bfd_get_section_name (abfd, sectp),
>> +       bfd_get_filename (abfd));
>> +      /* Section is invalid -- set size to 0 and return NULL */
>
> Finish the comment with a dot and two spaces.
>
I rewrote the comment to be a full sentence

>> ...
>> +extern int bar (int);
>> +
>> +int foo (int x) {
>> +  return bar (x);
>> +}
>
> Unless the purpose of the test is about formatting, we try to follow GNU
> style (same as for source code) in tests.  For example, here it would be:
>
> int
> foo (int x)
> {
>   return bar (x);
> }
>
I reformatted all of the helper files for the test using 'indent'.

>> ...
>> +int bar (int y) {
>> +  return y + 1; /* break here */
>> +}
>
> Same here.
>
>> ...
>> +#ifndef VANISH_LIB
>> +#define VANISH_LIB "./solib-vanish-lib1.so"
>> +#endif
>
> I think we can remove this, it could only obfuscate a problem
> if we remove the -DVANISH_LIB in the .exp file.
>
Done.

>> +
>> +int main()
>
> Put "int" on its own line.
>
>> +{
>> +  void *handle;
>> +  int (*foo)(int);
>> +  char *error;
>> +
>> +  /* Open library */
>> +  handle = dlopen(VANISH_LIB, RTLD_NOW);
>> +  if (!handle) {
>> +    fprintf(stderr, "%s\n", dlerror());
>
> Please make this file follow GNU style as well (space before parenthesis,
> curly braces position, etc).
>
>> +    exit(EXIT_FAILURE);
>> +  }
>> +
>> +  /* Clear any existing error */
>> +  dlerror();
>> +
>> +  /* Remove library */
>> +  if (remove(VANISH_LIB) == -1) {
>
> Can we rename the lib instead of deleting it?  It's always preferable to keep
> test artifacts in case we need to inspect them.
>
I updated the test to rename the library instead of removing it.  It also now renames it
back before exiting.  This *should* make the test idempotent, unless it fails before it
is able to rename the .so back to its original name.

>> ...
>> +
>> +# Library 1
>> +set lib1name "solib-vanish-lib1"
>> +set srcfile_lib1 ${srcdir}/${subdir}/${lib1name}.c
>> +set binfile_lib1 [standard_output_file ${lib1name}.so]
>> +# I can't make this work for some reason:
>> +#set lib1_flags [list debug shlib=${binfile_lib2}]
>> +set lib1_flags [list debug ldflags=-l:${lib2name}.so ldflags=-Wl,-rpath=[standard_output_file ""] libdir=[standard_output_file ""]]
>
> With
>
>   https://sourceware.org/ml/gdb-patches/2017-10/msg00856.html
>
> the commented out line should work.
>
This did indeed work.  Thank you!

>> ...
>> +
>> +gdb_test "continue" \
>> +    "Breakpoint .*at .*/${lib2name}.c:${lib2_lineno}.*break here.*" \
>> +    "breakpoint prints location"
>
> I think you could use the "gdb_continue_to_breakpoint" proc.
>
This works as well, so the new patch uses this instead.

Here's an updated patch which hopefully addresses all of these items.

---

From 1e1d408184ca0d204276665e317dc0c90db61f26 Mon Sep 17 00:00:00 2001
From: Mike Gulick <[hidden email]>
Date: Mon, 30 Oct 2017 18:12:31 -0400
Subject: [PATCH v3] fix gdb segv when objfile can't be opened

This fixes PR 16577.

This patch changes gdb_bfd_map_section to issue a warning rather than an
error if it is unable to read the object file, and sets the size of the
section/frame that it attempted to read to 0 on error.

The description of gdb_bfd_map_section states that it will try to read
or map the contents of the section SECT, and if successful, the section
data is returned and *SIZE is set to the size of the section data.  This
function was throwing an error and leaving *SIZE as-is.  Setting the
section size to 0 indicates to dwarf2_build_frame_info that there is no
data to read, otherwise it will try to read from an invalid frame
pointer.

Changing the error to a warning allows this to be handled gracefully.
Additionally, the error was clobbering the breakpoint output indicating
the current frame (function name, arguments, source file, and line number).
E.g.

Thread 3 "foo" hit Breakpoint 1, BFD: reopening /tmp/jna-1013829440/jna2973250704389291330.tmp: No such file or directory

BFD: reopening /tmp/jna-1013829440/jna2973250704389291330.tmp: No such file or directory

(gdb)

While the "BFD: reopening ..." messages will still appear interspersed in the
breakpoint output, the current frame info is now displayed:

Thread 3 "foo" hit Breakpoint 1, BFD: reopening /tmp/jna-1013829440/jna1875755897659885075.tmp: No such file or directory

BFD: reopening /tmp/jna-1013829440/jna1875755897659885075.tmp: No such file or directory

warning: Can't read data for section '.eh_frame' in file '/tmp/jna-1013829440/jna1875755897659885075.tmp'
do_something () at file.cpp:80
80 {
(gdb)

gdb/ChangeLog:

PR gdb/16577
* gdb_bfd.c (gdb_bfd_map_section): If unable to read object file, issue
a warning instead of throwing an error, set section size to 0 and return
NULL.
* gdb_bfd.h (gdb_bfd_map_section): Update description.

gdb/testsuite/ChangeLog:

PR gdb/16577
* gdb.base/solib-vanish.exp: New testsuite to test breakpoint
handling and stepping when the shared library for one of the stack
frames is not accessible.
* gdb.base/solib-vanish-main.c: New.
* gdb.base/solib-vanish-lib1.c: New.
* gdb.base/solib-vanish-lib2.c: New.
---
 gdb/ChangeLog                              |   8 +++
 gdb/gdb_bfd.c                              |  12 +++-
 gdb/gdb_bfd.h                              |  17 +++--
 gdb/testsuite/ChangeLog                    |  10 +++
 gdb/testsuite/gdb.base/solib-vanish-lib1.c |  24 +++++++
 gdb/testsuite/gdb.base/solib-vanish-lib2.c |  22 ++++++
 gdb/testsuite/gdb.base/solib-vanish-main.c |  75 ++++++++++++++++++++
 gdb/testsuite/gdb.base/solib-vanish.exp    | 107 +++++++++++++++++++++++++++++
 8 files changed, 263 insertions(+), 12 deletions(-)
 create mode 100644 gdb/testsuite/gdb.base/solib-vanish-lib1.c
 create mode 100644 gdb/testsuite/gdb.base/solib-vanish-lib2.c
 create mode 100644 gdb/testsuite/gdb.base/solib-vanish-main.c
 create mode 100644 gdb/testsuite/gdb.base/solib-vanish.exp

diff --git a/gdb/ChangeLog b/gdb/ChangeLog
index 5a680ed..64fed04 100644
--- a/gdb/ChangeLog
+++ b/gdb/ChangeLog
@@ -1,3 +1,11 @@
+2017-10-30  Mike Gulick  <[hidden email]>
+
+ PR gdb/16577
+ * gdb_bfd.c (gdb_bfd_map_section): If unable to read object file, issue
+ a warning instead of throwing an error, set section size to 0 and return
+ NULL.
+ * gdb_bfd.h (gdb_bfd_map_section): Update description.
+
 2017-10-30  Simon Marchi  <[hidden email]>
 
  * common/common-utils.h (in_inclusive_range): New function.
diff --git a/gdb/gdb_bfd.c b/gdb/gdb_bfd.c
index cc02740..9b30ecd 100644
--- a/gdb/gdb_bfd.c
+++ b/gdb/gdb_bfd.c
@@ -702,9 +702,15 @@ gdb_bfd_map_section (asection *sectp, bfd_size_type *size)
 
   data = NULL;
   if (!bfd_get_full_section_contents (abfd, sectp, &data))
-    error (_("Can't read data for section '%s' in file '%s'"),
-   bfd_get_section_name (abfd, sectp),
-   bfd_get_filename (abfd));
+    {
+      warning (_("Can't read data for section '%s' in file '%s'"),
+       bfd_get_section_name (abfd, sectp),
+       bfd_get_filename (abfd));
+      /* Set size to 0 to prevent further attempts to read the invalid
+ section.  */
+      *size = 0;
+      return (const gdb_byte *) NULL;
+    }
   descriptor->data = data;
 
  done:
diff --git a/gdb/gdb_bfd.h b/gdb/gdb_bfd.h
index b1ff857..e427781 100644
--- a/gdb/gdb_bfd.h
+++ b/gdb/gdb_bfd.h
@@ -115,15 +115,14 @@ void gdb_bfd_mark_parent (bfd *child, bfd *parent);
 
 void gdb_bfd_record_inclusion (bfd *includer, bfd *includee);
 
-/* Try to read or map the contents of the section SECT.  If
-   successful, the section data is returned and *SIZE is set to the
-   size of the section data; this may not be the same as the size
-   according to bfd_get_section_size if the section was compressed.
-   The returned section data is associated with the BFD and will be
-   destroyed when the BFD is destroyed.  There is no other way to free
-   it; for temporary uses of section data, see
-   bfd_malloc_and_get_section.  SECT may not have relocations.  This
-   function will throw on error.  */
+/* Try to read or map the contents of the section SECT.  If successful, the
+   section data is returned and *SIZE is set to the size of the section data;
+   this may not be the same as the size according to bfd_get_section_size if the
+   section was compressed.  The returned section data is associated with the BFD
+   and will be destroyed when the BFD is destroyed.  There is no other way to
+   free it; for temporary uses of section data, see bfd_malloc_and_get_section.
+   SECT may not have relocations.  If there is an error reading the section,
+   this issues a warning, sets *SIZE to 0, and returns NULL.  */
 
 const gdb_byte *gdb_bfd_map_section (asection *section, bfd_size_type *size);
 
diff --git a/gdb/testsuite/ChangeLog b/gdb/testsuite/ChangeLog
index 9d0f813..6552835 100644
--- a/gdb/testsuite/ChangeLog
+++ b/gdb/testsuite/ChangeLog
@@ -1,3 +1,13 @@
+2017-10-30  Mike Gulick  <[hidden email]>
+
+ PR gdb/16577
+ * gdb.base/solib-vanish.exp: New testsuite to test breakpoint
+ handling and stepping when the shared library for one of the stack
+ frames is not accessible.
+ * gdb.base/solib-vanish-main.c: New.
+ * gdb.base/solib-vanish-lib1.c: New.
+ * gdb.base/solib-vanish-lib2.c: New.
+
 2017-10-28  Maksim Dzabraev  <[hidden email]>
 
  PR python/21213
diff --git a/gdb/testsuite/gdb.base/solib-vanish-lib1.c b/gdb/testsuite/gdb.base/solib-vanish-lib1.c
new file mode 100644
index 0000000..ac04104
--- /dev/null
+++ b/gdb/testsuite/gdb.base/solib-vanish-lib1.c
@@ -0,0 +1,24 @@
+/* This testcase is part of GDB, the GNU debugger.
+
+   Copyright 2017 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+extern int bar (int);
+
+int
+foo (int x)
+{
+  return bar (x);
+}
diff --git a/gdb/testsuite/gdb.base/solib-vanish-lib2.c b/gdb/testsuite/gdb.base/solib-vanish-lib2.c
new file mode 100644
index 0000000..0cf5395
--- /dev/null
+++ b/gdb/testsuite/gdb.base/solib-vanish-lib2.c
@@ -0,0 +1,22 @@
+/* This testcase is part of GDB, the GNU debugger.
+
+   Copyright 2017 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+int
+bar (int y)
+{
+  return y + 1; /* break here */
+}
diff --git a/gdb/testsuite/gdb.base/solib-vanish-main.c b/gdb/testsuite/gdb.base/solib-vanish-main.c
new file mode 100644
index 0000000..5c5af9a
--- /dev/null
+++ b/gdb/testsuite/gdb.base/solib-vanish-main.c
@@ -0,0 +1,75 @@
+/* This testcase is part of GDB, the GNU debugger.
+
+   Copyright 2017 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include <dlfcn.h>
+#include <errno.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+
+int
+main ()
+{
+  void *handle;
+  int (*foo) (int);
+  char *error;
+  char *dest = VANISH_LIB ".renamed";
+
+  /* Open library.  */
+  handle = dlopen (VANISH_LIB, RTLD_NOW);
+  if (!handle)
+    {
+      fprintf (stderr, "%s\n", dlerror ());
+      exit (EXIT_FAILURE);
+    }
+
+  /* Clear any existing error */
+  dlerror ();
+
+  /* Simulate deleting file by renaming it.  */
+  if (rename (VANISH_LIB, dest) == -1)
+    {
+      error = strerror (errno);
+      fprintf (stderr, "rename %s -> %s: %s\n", VANISH_LIB, dest, error);
+      exit (EXIT_FAILURE);
+    }
+
+  /* Get function pointer.  */
+  foo = dlsym (handle, "foo");
+  error = dlerror ();
+  if (error != NULL)
+    {
+      fprintf (stderr, "%s\n", error);
+      exit (EXIT_FAILURE);
+    }
+
+  /* Call function.  */
+  (*foo) (1);
+
+  /* Close and exit.  */
+  dlclose (handle);
+
+  /* Put VANISH_LIB back where we found it.  */
+  if (rename (dest, VANISH_LIB) == -1)
+    {
+      error = strerror (errno);
+      fprintf (stderr, "rename %s -> %s: %s\n", dest, VANISH_LIB, error);
+      exit (EXIT_FAILURE);
+    }
+
+  exit (EXIT_SUCCESS);
+}
diff --git a/gdb/testsuite/gdb.base/solib-vanish.exp b/gdb/testsuite/gdb.base/solib-vanish.exp
new file mode 100644
index 0000000..5214e50
--- /dev/null
+++ b/gdb/testsuite/gdb.base/solib-vanish.exp
@@ -0,0 +1,107 @@
+# Copyright 2017 Free Software Foundation, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+
+# This test case verifies that GDB gracefully handles a shared library file
+# vanishing after being dlopen'ed.  This consists of three nested function calls:
+#
+# main() -> foo() -> bar()
+#
+# where:
+# - foo exists in solib-vanish-lib1.so, which is dlopen'ed by main()
+# - bar exists in solib-vanish-lib2.so, which is dynamically linked into
+#   solib-vanish-lib1.so
+#
+# Immediately after dlopen'ing solib-vanish-lib1.so, the so file is moved aside
+# by renaming.  The main executable and solib-vanish-lib2.so are still
+# accessible.
+#
+# If a breakpoint is set on bar(), gdb throws an error when this breakpoint is
+# hit:
+#
+#   (gdb) r
+#   Starting program: /local/gdb/git/pr_16577_repro/simple/solib-vanish-main
+#
+#   Breakpoint 1, BFD: reopening ./solib-vanish-lib1.so: No such file or directory
+#
+#   BFD: reopening ./solib-vanish-lib1.so: No such file or directory
+#
+#   (gdb)
+#
+# Notice that this does not print the current frame, i.e.:
+#   bar (y=1) at solib-vanish-lib2.c:19
+#   19  return y + 1; /* break here */
+#   (gdb)
+#
+# The current gdb git tip segfaults if we then try to step:
+#   (gdb) n
+#   Segmentation fault
+
+# This test verifies that:
+# 1) GDB does not segfault when stepping
+# 2) The stack frame is printed
+
+if { [skip_shlib_tests] } {
+    return 0
+}
+
+# Library 2
+set lib2name "solib-vanish-lib2"
+set srcfile_lib2 ${srcdir}/${subdir}/${lib2name}.c
+set binfile_lib2 [standard_output_file ${lib2name}.so]
+set lib2_flags {debug}
+
+# Library 1
+set lib1name "solib-vanish-lib1"
+set srcfile_lib1 ${srcdir}/${subdir}/${lib1name}.c
+set binfile_lib1 [standard_output_file ${lib1name}.so]
+set lib1_flags [list debug shlib=${binfile_lib2}]
+
+# Main program
+set testfile "solib-vanish-main"
+set srcfile ${srcdir}/${subdir}/${testfile}.c
+set executable ${testfile}
+set binfile [standard_output_file ${executable}]
+set bin_flags [list debug shlib_load additional_flags=-DVANISH_LIB=\"${binfile_lib1}\"]
+
+if { [gdb_compile_shlib ${srcfile_lib2} ${binfile_lib2} $lib2_flags] != ""
+     || [gdb_compile_shlib ${srcfile_lib1} ${binfile_lib1} $lib1_flags] != ""
+     || [gdb_compile ${srcfile} ${binfile} executable $bin_flags] != "" } {
+    untested "failed to compile"
+    return -1
+}
+
+clean_restart $testfile
+
+if { ![runto_main] } {
+    fail "can't run to main"
+    return
+}
+
+delete_breakpoints
+
+set lib2_lineno [gdb_get_line_number "break here" ${srcfile_lib2}]
+
+gdb_breakpoint "${lib2name}.c:${lib2_lineno}" {allow-pending}
+
+# Verify that both the location and source code are displayed
+gdb_continue_to_breakpoint "bar" \
+    ".*/${lib2name}.c:${lib2_lineno}.*break here.*"
+
+# This should not segfault
+gdb_test "next" \
+    "" \
+    "next succeeds"
+
--
2.1.4

Reply | Threaded
Open this post in threaded view
|

Re: [RFC][PATCH] fix gdb segv when objfile can't be opened

Simon Marchi
On 2017-10-30 18:13, Mike Gulick wrote:

> On 10/27/2017 09:17 PM, Simon Marchi wrote:
>> On 2017-10-23 07:19 PM, Mike Gulick wrote:
>>
>> ...
>>
>> Thanks a lot.  Could you also write the corresponding ChangeLog
>> entries?  See
>> here for details: http://sourceware.org/gdb/wiki/ContributionChecklist
>> Note that we usually put that as part of the commit log, and only move
>> it
>> to the actual ChangeLog files before pushing to git.  The changes to
>> the
>> source files will go in gdb/ChangeLog and the changes to the testsuite
>> in
>> gdb/testsuite/ChangeLog.
>>
>> You can also put the PR number as part of the ChangeLog (see wiki and
>> previous CL entries for examples).
>>
> Done.
>
>> I don't think you have a copyright assignment for GDB yet?  Since this
>> is
>> more than a few lines, you will need one if you want to contribute
>> (I'll
>> contact you off-list for that).
>>
> I'm working on this.  Hopefully it will be sooner rather than later.
>
>> I noted a a few comments below (mostly minor/formatting stuff, overall
>> I
>> think this is a good quality submission).
>> ...
>>>
>>> diff --git a/gdb/gdb_bfd.c b/gdb/gdb_bfd.c
>>> index cc02740..b33de07 100644
>>> --- a/gdb/gdb_bfd.c
>>> +++ b/gdb/gdb_bfd.c
>>> @@ -702,9 +702,14 @@ gdb_bfd_map_section (asection *sectp,
>>> bfd_size_type *size)
>>>
>>>    data = NULL;
>>>    if (!bfd_get_full_section_contents (abfd, sectp, &data))
>>> -    error (_("Can't read data for section '%s' in file '%s'"),
>>> -   bfd_get_section_name (abfd, sectp),
>>> -   bfd_get_filename (abfd));
>>> +    {
>>> +      warning (_("Can't read data for section '%s' in file '%s'"),
>>> +       bfd_get_section_name (abfd, sectp),
>>> +       bfd_get_filename (abfd));
>>> +      /* Section is invalid -- set size to 0 and return NULL */
>>
>> Finish the comment with a dot and two spaces.
>>
> I rewrote the comment to be a full sentence

Ok.

>>> ...
>>> +extern int bar (int);
>>> +
>>> +int foo (int x) {
>>> +  return bar (x);
>>> +}
>>
>> Unless the purpose of the test is about formatting, we try to follow
>> GNU
>> style (same as for source code) in tests.  For example, here it would
>> be:
>>
>> int
>> foo (int x)
>> {
>>   return bar (x);
>> }
>>
> I reformatted all of the helper files for the test using 'indent'.

Great, thanks.

>>> ...
>>> +int bar (int y) {
>>> +  return y + 1; /* break here */
>>> +}
>>
>> Same here.
>>
>>> ...
>>> +#ifndef VANISH_LIB
>>> +#define VANISH_LIB "./solib-vanish-lib1.so"
>>> +#endif
>>
>> I think we can remove this, it could only obfuscate a problem
>> if we remove the -DVANISH_LIB in the .exp file.
>>
> Done.
>
>>> +
>>> +int main()
>>
>> Put "int" on its own line.
>>
>>> +{
>>> +  void *handle;
>>> +  int (*foo)(int);
>>> +  char *error;
>>> +
>>> +  /* Open library */
>>> +  handle = dlopen(VANISH_LIB, RTLD_NOW);
>>> +  if (!handle) {
>>> +    fprintf(stderr, "%s\n", dlerror());
>>
>> Please make this file follow GNU style as well (space before
>> parenthesis,
>> curly braces position, etc).
>>
>>> +    exit(EXIT_FAILURE);
>>> +  }
>>> +
>>> +  /* Clear any existing error */
>>> +  dlerror();
>>> +
>>> +  /* Remove library */
>>> +  if (remove(VANISH_LIB) == -1) {
>>
>> Can we rename the lib instead of deleting it?  It's always preferable
>> to keep
>> test artifacts in case we need to inspect them.
>>
> I updated the test to rename the library instead of removing it.  It
> also now renames it
> back before exiting.  This *should* make the test idempotent, unless
> it fails before it
> is able to rename the .so back to its original name.

The test (the .exp) rebuilds the shared library every time anyway, so
even if you didn't move it back, it would work fine.

>>> ...
>>> +
>>> +# Library 1
>>> +set lib1name "solib-vanish-lib1"
>>> +set srcfile_lib1 ${srcdir}/${subdir}/${lib1name}.c
>>> +set binfile_lib1 [standard_output_file ${lib1name}.so]
>>> +# I can't make this work for some reason:
>>> +#set lib1_flags [list debug shlib=${binfile_lib2}]
>>> +set lib1_flags [list debug ldflags=-l:${lib2name}.so
>>> ldflags=-Wl,-rpath=[standard_output_file ""]
>>> libdir=[standard_output_file ""]]
>>
>> With
>>
>>   https://sourceware.org/ml/gdb-patches/2017-10/msg00856.html
>>
>> the commented out line should work.
>>
> This did indeed work.  Thank you!
>
>>> ...
>>> +
>>> +gdb_test "continue" \
>>> +    "Breakpoint .*at .*/${lib2name}.c:${lib2_lineno}.*break here.*"
>>> \
>>> +    "breakpoint prints location"
>>
>> I think you could use the "gdb_continue_to_breakpoint" proc.
>>
> This works as well, so the new patch uses this instead.
>
> Here's an updated patch which hopefully addresses all of these items.
>
> ...

Thanks for the update, this LGTM.  Please ping when your hear back about
your copyright assignment.

Simon
Reply | Threaded
Open this post in threaded view
|

Re: [RFC][PATCH] fix gdb segv when objfile can't be opened

Simon Marchi-2
On 2017-10-30 07:37 PM, Simon Marchi wrote:
> Thanks for the update, this LGTM.  Please ping when your hear back about your copyright assignment.

Hi Mike,

Do you have any news about your copyright assignment?

Simon
Reply | Threaded
Open this post in threaded view
|

Re: [RFC][PATCH] fix gdb segv when objfile can't be opened

Mike Gulick-2
On 01/07/2018 09:08 AM, Simon Marchi wrote:
> On 2017-10-30 07:37 PM, Simon Marchi wrote:
>> Thanks for the update, this LGTM.  Please ping when your hear back about your copyright assignment.
>
> Hi Mike,
>
> Do you have any news about your copyright assignment?
>
> Simon
>
Hi Simon,

I submitted the assignment paperwork to the FSF on December 19th.  I have not
heard back yet, but I will send a follow up email to make sure there are no
issues.  Sorry for the delay.

-Mike
Reply | Threaded
Open this post in threaded view
|

Re: [RFC][PATCH] fix gdb segv when objfile can't be opened

Simon Marchi
On 2018-01-07 19:44, Mike Gulick wrote:
> Hi Simon,
>
> I submitted the assignment paperwork to the FSF on December 19th.  I
> have not
> heard back yet, but I will send a follow up email to make sure there
> are no
> issues.  Sorry for the delay.
>
> -Mike

Ok, no problem.  I still have the shlib patch in queue for when you'll
be ready to push.

Simon
Reply | Threaded
Open this post in threaded view
|

Re: [RFC][PATCH] fix gdb segv when objfile can't be opened

Simon Marchi
On 2018-01-07 21:50, Simon Marchi wrote:

> On 2018-01-07 19:44, Mike Gulick wrote:
>> Hi Simon,
>>
>> I submitted the assignment paperwork to the FSF on December 19th.  I
>> have not
>> heard back yet, but I will send a follow up email to make sure there
>> are no
>> issues.  Sorry for the delay.
>>
>> -Mike
>
> Ok, no problem.  I still have the shlib patch in queue for when you'll
> be ready to push.
>
> Simon

Sorry that wasn't clear, I meant the testsuite shlib= patch.

Simon
Reply | Threaded
Open this post in threaded view
|

Re: [RFC][PATCH] fix gdb segv when objfile can't be opened

Mike Gulick-2


On 01/07/2018 09:51 PM, Simon Marchi wrote:
> On 2018-01-07 21:50, Simon Marchi wrote:
>>
>> Ok, no problem.  I still have the shlib patch in queue for when you'll
>> be ready to push.
>>
>> Simon

Hi Simon,

I received the signed assignment agreement back from the FSF yesterday.  I
rebased my patches on the current gdb tip and re-ran the test suite, with no
unexpected regressions.  Should I re-post the patches?

Thanks,
Mike

Reply | Threaded
Open this post in threaded view
|

Re: [RFC][PATCH] fix gdb segv when objfile can't be opened

Simon Marchi
On 2018-01-10 15:33, Mike Gulick wrote:

> Hi Simon,
>
> I received the signed assignment agreement back from the FSF yesterday.
>  I
> rebased my patches on the current gdb tip and re-ran the test suite,
> with no
> unexpected regressions.  Should I re-post the patches?
>
> Thanks,
> Mike

Hi Mike,

I don't think reposting the patch is necessary, unless it changed
substantially when rebasing.

I suppose you don't have push access to Sourceware yet?  If you plan on
contributing to GDB regularly, it would be a good idea to get it, see
[1] for details.  Otherwise I can push the patch for you, it's as you
wish.

FYI, I just pushed the patch that fixes the shlib= problem you had:
https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=commit;h=6181e9c2c5ba252ac016f51903dc35d7bfbbca71

Simon

[1] https://sourceware.org/cgi-bin/pdw/ps_form.cgi
Reply | Threaded
Open this post in threaded view
|

Re: [RFC][PATCH] fix gdb segv when objfile can't be opened

Mike Gulick-2


On 01/11/2018 09:44 PM, Simon Marchi wrote:

> On 2018-01-10 15:33, Mike Gulick wrote:
>> Hi Simon,
>>
>> I received the signed assignment agreement back from the FSF yesterday.  I
>> rebased my patches on the current gdb tip and re-ran the test suite, with no
>> unexpected regressions.  Should I re-post the patches?
>>
>> Thanks,
>> Mike
>
> Hi Mike,
>
> I don't think reposting the patch is necessary, unless it changed substantially when rebasing.
>
> I suppose you don't have push access to Sourceware yet?  If you plan on contributing to GDB regularly, it would be a good idea to get it, see [1] for details.  Otherwise I can push the patch for you, it's as you wish.
>
> FYI, I just pushed the patch that fixes the shlib= problem you had:
> https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=commit;h=6181e9c2c5ba252ac016f51903dc35d7bfbbca71
>
> Simon
>
> [1] https://sourceware.org/cgi-bin/pdw/ps_form.cgi

Hi Simon,

There weren't any code changes when rebasing.  The only changes I have locally
are just updating the changelog dates, and adding the missing tabs in the
changelog section of the commit message.  Feel free to push the change already
posted, with any fixups you see as necessary.

Next time I have a change to be submitted, I'll look into getting a Sourceware
account.

Thanks,
Mike

Reply | Threaded
Open this post in threaded view
|

Re: [RFC][PATCH] fix gdb segv when objfile can't be opened

Simon Marchi-2
On 2018-01-12 10:04 AM, Mike Gulick wrote:

> Hi Simon,
>
> There weren't any code changes when rebasing.  The only changes I have locally
> are just updating the changelog dates, and adding the missing tabs in the
> changelog section of the commit message.  Feel free to push the change already
> posted, with any fixups you see as necessary.
>
> Next time I have a change to be submitted, I'll look into getting a Sourceware
> account.
>
> Thanks,
> Mike
>

Hi Mike,

I finally got around to push this.

Thanks again for the patch!

Simon
Reply | Threaded
Open this post in threaded view
|

Re: [RFC][PATCH] fix gdb segv when objfile can't be opened

Mike Gulick-2
On 01/17/2018 01:00 PM, Simon Marchi wrote:

> On 2018-01-12 10:04 AM, Mike Gulick wrote:
>> Hi Simon,
>>
>> There weren't any code changes when rebasing.  The only changes I have locally
>> are just updating the changelog dates, and adding the missing tabs in the
>> changelog section of the commit message.  Feel free to push the change already
>> posted, with any fixups you see as necessary.
>>
>> Next time I have a change to be submitted, I'll look into getting a Sourceware
>> account.
>>
>> Thanks,
>> Mike
>>
>
> Hi Mike,
>
> I finally got around to push this.
>
> Thanks again for the patch!
>
> Simon
>

Thank you Simon for the review, and for your help and advice.  I'm pleased to
have been able to make a contribution to GDB!

Mike