Jump visualization feature for objdump.

Thomas Troeger
Dear all,

I have written a program that adds visualization of jumps inside a function to the output of objdump in the form of a post-processor:

$ objdump -wzSCD binary | postprocessor

Is that feature interesting enough to include it into objdump, for example behind a command-line switch like `--visualize-jumps'? If yes, what is the workflow to add this feature? I could of course port it from my tool, which is written in C++14, but there is the question who will review a patch for inclusion when I have it finished, and what are other prerequisites (source code formatting, test cases ...)?

Please enlighten me with your answers.

Regards,
Thomas.

P.S.: Example output from a running `/bin/bash' process (the program does other stuff besides the visualization):

000055edb4520380 <unset_bash_input@@Base>:
    55edb4520380:          8b 05 9a 53 0e 00     mov    0xe539a(%rip),%eax        # 115720 <default_buffered_input@@Base> -> 55edb451d03d
    55edb4520386:          85 ff                 test   %edi,%edi
    55edb4520388: /-------- 75 3e                 jne    303c8 <unset_bash_input@@Base+0x48> -> 55edb45203c8
    55edb452038a: |         85 c0                 test   %eax,%eax
    55edb452038c: |     /-- 7e 32                 jle    303c0 <unset_bash_input@@Base+0x40> -> 55edb45203c0
    55edb452038e: |  /--|-> 48 83 ec 08           sub    $0x8,%rsp
    55edb4520392: |  |  |   89 c7                 mov    %eax,%edi
    55edb4520394: |  |  |   e8 f7 47 04 00       callq  74b90 <close_buffered_fd@@Base> -> 55edb4564b90
    55edb4520399: |  |  |   c7 05 ad 25 0f 00 ff ff ff ff movl   $0xffffffff,0xf25ad(%rip)        # 122950 <bash_input@@Base+0x10> -> 55edb45ca79d
    55edb45203a3: |  |  |   c7 05 73 53 0e 00 ff ff ff ff movl   $0xffffffff,0xe5373(%rip)        # 115720 <default_buffered_input@@Base> -> 55edb451d03d
    55edb45203ad: |  |  |   c7 05 89 25 0f 00 00 00 00 00 movl   $0x0,0xf2589(%rip)        # 122940 <bash_input@@Base> -> 55edb45ca78d
    55edb45203b7: |  |  |   48 83 c4 08           add    $0x8,%rsp
    55edb45203bb: |  |  |   c3                   retq
    55edb45203bc: |  |  |   0f 1f 40 00           nopl   0x0(%rax)
    55edb45203c0: |  |  \-> c3                   retq
    55edb45203c1: |  |      0f 1f 80 00 00 00 00 nopl   0x0(%rax)
    55edb45203c8: \--|----> 85 c0                 test   %eax,%eax
    55edb45203ca:   \----- 79 c2                 jns    3038e <unset_bash_input@@Base+0xe> -> 55edb452038e
    55edb45203cc:          c3                   retq
    55edb45203cd:          0f 1f 00             nopl   (%rax)

[...]

000055edb45218a0 <with_input_from_stdin@@Base>:
    55edb45218a0:                   83 3d 99 10 0f 00 01 cmpl   $0x1,0xf1099(%rip)        # 122940 <bash_input@@Base> -> 55edb45ca78d
    55edb45218a7: /----------------- 74 4f                 je     318f8 <with_input_from_stdin@@Base+0x58> -> 55edb45218f8
    55edb45218a9: |                  48 8b 05 d8 c5 0e 00 mov    0xec5d8(%rip),%rax        # 11de88 <stream_list@@Base> -> 55edb451d017
    55edb45218b0: |                  48 85 c0             test   %rax,%rax
    55edb45218b3: |        /-------- 74 19                 je     318ce <with_input_from_stdin@@Base+0x2e> -> 55edb45218ce
    55edb45218b5: |        |         83 78 08 01           cmpl   $0x1,0x8(%rax)
    55edb45218b9: |        |  /----- 75 0b                 jne    318c6 <with_input_from_stdin@@Base+0x26> -> 55edb45218c6
    55edb45218bb: |  /-----|--|----- eb 3c                 jmp    318f9 <with_input_from_stdin@@Base+0x59> -> 55edb45218f9
    55edb45218bd: |  |     |  |      0f 1f 00             nopl   (%rax)
    55edb45218c0: |  |     |  |  /-> 83 78 08 01           cmpl   $0x1,0x8(%rax)
    55edb45218c4: |  |  /--|--|--|-- 74 32                 je     318f8 <with_input_from_stdin@@Base+0x58> -> 55edb45218f8
    55edb45218c6: |  |  |  |  \--|-> 48 8b 00             mov    (%rax),%rax
    55edb45218c9: |  |  |  |     |   48 85 c0             test   %rax,%rax
    55edb45218cc: |  |  |  |     \-- 75 f2                 jne    318c0 <with_input_from_stdin@@Base+0x20> -> 55edb45218c0
    55edb45218ce: |  |  |  \-------> 4c 8b 05 cb c5 0e 00 mov    0xec5cb(%rip),%r8        # 11dea0 <current_readline_line@@Base> -> 55edb451d017
    55edb45218d5: |  |  |            48 8d 0d 85 9d 0a 00 lea    0xa9d85(%rip),%rcx        # db661 <_IO_stdin_used@@Base+0x661>
    55edb45218dc: |  |  |            ba 01 00 00 00       mov    $0x1,%edx
    55edb45218e1: |  |  |            48 8d 35 a8 f1 ff ff lea    -0xe58(%rip),%rsi        # 30a90 <pretty_print_loop@@Base+0xd0> -> 55edb4520a90
    55edb45218e8: |  |  |            48 8d 3d 51 f4 ff ff lea    -0xbaf(%rip),%rdi        # 30d40 <pretty_print_loop@@Base+0x380> -> 55edb4520d40
    55edb45218ef: |  |  |            e9 0c ff ff ff       jmpq   31800 <init_yy_io@@Base> -> 55edb4521800
    55edb45218f4: |  |  |            0f 1f 40 00           nopl   0x0(%rax)
    55edb45218f8: \--|--\----------> c3                   retq
    55edb45218f9:   \-------------> c3                   retq
    55edb45218fa:                   66 0f 1f 44 00 00     nopw   0x0(%rax,%rax,1)
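
For readers who want to experiment with the layout shown in the listings above, here is a minimal sketch of how such a gutter could be drawn. It is not the code of the tool discussed in this thread: the function name `render_jumps`, the row-index representation of jumps, and the rule "longest jump gets the leftmost column" are all assumptions made for illustration. It gives every jump its own column and does not merge jumps that share a target, which the second listing appears to do.

```python
def render_jumps(num_rows, edges):
    """Return one gutter string per instruction row.

    `edges` is a list of (source_row, target_row) pairs with
    source_row != target_row (hypothetical representation).
    """
    # Assumption: longer jumps are placed further left so that
    # shorter jumps nest inside them.
    ordered = sorted(edges, key=lambda e: abs(e[0] - e[1]), reverse=True)
    width = 3 * len(ordered)
    grid = [[" "] * width for _ in range(num_rows)]

    # Pass 1: horizontal strokes from each jump's column to the right edge.
    for col, (src, dst) in enumerate(ordered):
        for row in (src, dst):
            for x in range(3 * col + 1, width):
                grid[row][x] = "-"

    # Pass 2: vertical strokes; they overwrite crossing horizontals.
    for col, (src, dst) in enumerate(ordered):
        for row in range(min(src, dst) + 1, max(src, dst)):
            grid[row][3 * col] = "|"

    # Pass 3: corners, then an arrow head on every target row.
    for col, (src, dst) in enumerate(ordered):
        grid[min(src, dst)][3 * col] = "/"
        grid[max(src, dst)][3 * col] = "\\"
    for _, dst in ordered:
        grid[dst][width - 1] = ">"

    return ["".join(row) for row in grid]


# Row indices of the three jumps in the unset_bash_input listing
# (jne at row 2, jle at row 4, jns at row 17), counted from 0.
EXAMPLE_EDGES = [(2, 16), (4, 14), (17, 5)]
```

Calling `render_jumps(20, EXAMPLE_EDGES)` and printing the result line by line gives a gutter that can be compared with the first listing.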

Re: Jump visualization feature for objdump.

Carlos O'Donell-6
On 11/6/19 2:09 PM, Thomas Troeger wrote:

> I have written a program that adds visualization of jumps inside a
> function to the output of objdump in the form of a post-processor
>
> $ objdump -wzSCD binary | postprocessor
>
> Is that feature interesting enough to include it into objdump, for
> example behind a command-line switch like `--visualize-jumps'? If
> yes, what is the workflow to add this feature? I could of course port
> it from my tool, which is written in C++14, but there is the question
> who will review a patch for inclusion when I have it finished, and
> what are other prerequisites (source code formatting, test cases
> ...)?
>
> Please enlighten me with your answers.

This is awesome. I do this all the time with a pencil!

We should allow using C++14 in binutils! :)

--
Cheers,
Carlos.


Re: Jump visualization feature for objdump.

Fangrui Song-2
In reply to this post by Thomas Troeger
On 2019-11-06, Thomas Troeger wrote:

>Dear all,
>
>I have written a program that adds visualization of jumps inside a function to the output of objdump in the form of a post-processor
>
>$ objdump -wzSCD binary | postprocessor
>
>Is that feature interesting enough to include it into objdump, for example behind a command-line switch like `--visualize-jumps'? If yes, what is the workflow to add this feature? I could of course port it from my tool, which is written in C++14, but there is the question who will review a patch for inclusion when I have it finished, and what are other prerequisites (source code formatting, test cases ...)?
>
>Please enlighten me with your answers.
>
>Regards,
>Thomas.

radare2[1] can draw such edges and even control flow graphs[2] in the terminal.
I wonder what people think of doing more "UI" work in the standard
disassembly utility, objdump.

(radare2 uses capstone as its default disassembler backend.  capstone was
 created by rewriting part of the 2014 LLVM MC C++ code in C.  IIRC,
 upgrading it to a newer LLVM is more difficult than rewriting it.)

[1]: https://rada.re/n/
[2]: https://monosource.gitbooks.io/radare2-explorations/content/intro/visual_graphs.html

Re: Jump visualization feature for objdump.

Nick Clifton
In reply to this post by Thomas Troeger
Hi Thomas,

> I have written a program that adds visualization of jumps inside a function to the output of objdump in the form of a post-processor
>
> $ objdump -wzSCD binary | postprocessor
>
> Is that feature interesting enough to include it into objdump,

Given that you already have the tool working, and that there appears to be
another tool which provides a similar function, what would be the benefit
of incorporating your code into objdump ?  I.e., if the stand-alone tool is
already working, why make things more complicated by moving it into
another code base, written in another language ?

As an alternative, how about adding the tool to the binutils as a separate
utility ?  There are already quite a few tools in the binutils collection,
so adding one more would not be a problem.  Well except for the following:

  * Is the tool x86/x86_64 specific ?

  The binutils cover a lot of different architectures, and the utilities
  they provide ought to be able to work with more than just one or two.

  (Actually multi-architecture support might be a good argument for
  integrating the code into objdump, since the target specific
  disassembly functions in the opcodes library are the places where
  the knowledge of branch and jump instructions can be found).

  * Are you willing to contribute the code ?

  I guess this was implied by your original posting, but just to be clear
  contributing this code would mean filling out a copyright assignment to
  the FSF, and agreeing to have the code licenced under the GPLv3.

  * Are you willing to maintain the code ?

  The long term usability of any tool depends upon having maintainers
  to look after it, otherwise it just bit-rots away.  Being a maintainer
  for the code should not be an onerous job, but it does mean being
  willing to look at bug reports (specific to the feature) when they
  come in.


> but there is the question who will review a patch

We will.  If you post a patch, or a patch series, to this list, it will be
reviewed.  Sometimes you may have to prod us a bit, but it will be reviewed
in the end.  Just please remember that we will need a copyright assignment
in place before we can accept any code.


> (source code formatting, test cases ...)?

We (try to) follow the GNU Coding Standards:

  https://www.gnu.org/prep/standards/standards.html

Test cases are always a good idea, especially when adding a new feature.
When running the tests, please make sure that you try more than just
a native toolchain.  Checking with a binutils configured with
"--enable-targets=all", for example, is always worth doing.

For new features, adding an entry to the appropriate NEWS file
(binutils/NEWS in this case) is also required.

Patches should be accompanied by proposed entries for the ChangeLogs.
Ideally these would just be in plain text as context diffs almost
never apply cleanly.


> Please enlighten me with your answers.

Be at one with the universe.  

Oh, sorry, you meant enlighten you about the binutils patch submission
process.  For that, see above.

Cheers
  Nick


Re: Jump visualization feature for objdump.

Tom Tromey-2
>>>>> "Nick" == Nick Clifton <[hidden email]> writes:

Nick> Given that you already have the tool working, and that there appears to be
Nick> another tool which provides a similar function, what would be the benefit
Nick> of incorporating your code into objdump ?

FWIW, if it's in a reusable form in the tree, it could be enabled in gdb
as well.

thanks,
Tom

Re: Jump visualization feature for objdump.

Thomas Troeger
In reply to this post by Nick Clifton
> Given that you already have the tool working, and that there appears to be
> another tool which provides a similar function, what would be the benefit
> of incorporating your code into objdump ?  I.e., if the stand-alone tool is
> already working, why make things more complicated by moving it into
> another code base, written in another language ?

In the current form the tool only checks if the target address is inside
the same function and then adds a line between the two addresses. So
there is still work to do because I only want a line for instructions
that branch or jump. I figure that inside objdump itself that information
is available in some form, so it feels like a natural choice to integrate
it. I might need some help on how to check for jump instructions for a
given ISA, but I will try to figure it out on my own first.
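
The heuristic described above (any address reference whose target lies inside the same function becomes a line) could be sketched roughly as follows. This is an illustration, not the actual post-processor: the regular expressions assume the usual `objdump -d` line shapes, and the names `split_functions`, `find_edges` and `SAMPLE` are made up for this example. The sample function is also invented, and it contains a `lea` that refers back into the function to show the false positive mentioned above.

```python
import re

# Assumed line shapes of `objdump -d` output; real output may differ.
FUNC_RE = re.compile(r"^([0-9a-f]+) <(.+)>:\s*$")   # "0000... <name>:"
INSN_RE = re.compile(r"^\s*([0-9a-f]+):\s")         # "    1002:<tab>..."
REF_RE = re.compile(r"\s([0-9a-f]+) <[^>]+>")       # " 1010 <name+0x10>"


def split_functions(lines):
    """Group instruction lines under the function header preceding them."""
    functions = []
    for line in lines:
        header = FUNC_RE.match(line)
        if header:
            functions.append((header.group(2), []))
            continue
        insn = INSN_RE.match(line)
        if insn and functions:
            functions[-1][1].append((int(insn.group(1), 16), line))
    return functions


def find_edges(insns):
    """Return (source, target) address pairs that stay inside the function.

    No check is made that the instruction is a branch, so any in-function
    address reference produces an edge.
    """
    if not insns:
        return []
    low, high = insns[0][0], insns[-1][0]
    edges = []
    for addr, line in insns:
        ref = REF_RE.search(line)
        if ref is None:
            continue
        target = int(ref.group(1), 16)
        if low <= target <= high and target != addr:
            edges.append((addr, target))
    return edges


# Invented sample in the assumed format (not taken from a real binary).
SAMPLE = """\
0000000000001000 <demo>:
    1000:\t85 ff                \ttest   %edi,%edi
    1002:\t75 0c                \tjne    1010 <demo+0x10>
    1004:\te8 f7 0f 00 00       \tcallq  2000 <other>
    1009:\t48 8d 05 f0 ff ff ff \tlea    -0x10(%rip),%rax        # 1000 <demo>
    1010:\tc3                   \tretq
"""
```

Running `find_edges` on the instructions of `demo` reports the `jne` and also the `lea`, while the `callq` to another function is dropped. Telling the first two apart is exactly the per-ISA knowledge of branch instructions that is still missing.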

I checked the code for objdump and think it can be done with a
reasonable amount of work. The main drawback is that I need to port it
to C, but the code is not that complicated or large and already works
with both std::list or std::vector, so I could use the list variant of
the algorithm and replace this container with some pointer to next
inside the appropriate struct to get similar functionality.

The other motivation is that it may be useful to the users of
objdump; in the end, that's why I've written it. It might also work as a
separate tool (after all, it does for me), but I would still have
to port it to C, right? Or is C++ an option for binutils?

> As an alternative, how about adding the tool to the binutils as a separate
> utility ?  There are already quite a few tools in the binutils collection,
> so adding one more would not be a problem.  Well except for the following:
>
>   * Is the tool x86/x86_64 specific ?
>
>   The binutils cover a lot of different architectures, and the utilities
>   they provide ought to be able to work with more than just one or two.
>
>   (Actually multi-architecture support might be a good argument for
>   integrating the code into objdump, since the target specific
>   disassembly functions in the opcodes library are the places where
>   the knowledge of branch and jump instructions can be found).

See above. In the current form it only extracts address references, so it
*should* work on all architectures. I tested it on both x86-64 and
ARM64.

>
>   * Are you willing to contribute the code ?

Of course.

>   * Are you willing to maintain the code ?
>
>   The long term usability of any tool depends upon having maintainers
>   to look after it, otherwise it just bit-rots away.  Being a maintainer
>   for the code should not be an onerous job, but it does mean being
>   willing to look at bug reports (specific to the feature) when they
>   come in.

As long as it does not eat a substantial amount of time I see no problem there.

Cheers,
Thomas.

Re: Jump visualization feature for objdump.

Nick Clifton
Hi Thomas,

> The other motivation is that it may be useful to the users of
> objdump, in the end that's why I've written it. It might work as a
> separate tool also (after all it does for me), but I would still have
> to port it to C, right? Or is C++ an option for binutils?

OK, so just to be clear, we are interested in having such a feature
inside objdump and if you are willing to create a patch or two we
would be very happy to review them.

Cheers
  Nick


Re: Jump visualization feature for objdump.

Thomas Troeger
> Hi Thomas,

Hi there!

> OK, so just to be clear, we are interested in having such a feature
> inside objdump and if you are willing to create a patch or two we
> would be very happy to review them.

I have attached the current version of my patches.
It's possible to create the visualization using

objdump -wzd --visualize-jumps --extended-color /bin/bash

There are still some FIXMEs, and I guess the code still needs tweaks. I
tried to use the same formatting as the code around it, though I cannot
guarantee that it is consistent. Here you go.

Cheers!
Thomas.

Attachment: binutils-2.33.1-a6d9a35.patch (30K)