V3 [PATCH 0/4] i386: Optimize for Jump Conditional Code Erratum

H.J. Lu-30
Microcode update for Jump Conditional Code Erratum may cause performance
loss for some workloads:

https://www.intel.com/content/www/us/en/support/articles/000055650.html

Here is the set of assembler patches to mitigate the performance impact by
aligning branches within a 32-byte boundary.  The impacted instructions
are:

  a. Conditional jump.
  b. Fused conditional jump.
  c. Unconditional jump.
  d. Call.
  e. Ret.
  f. Indirect jump and call.

The new -mbranches-within-32B-boundaries command-line option aligns
conditional jumps, fused conditional jumps and unconditional jumps within
a 32-byte boundary.
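
To make "within a 32-byte boundary" concrete, here is a minimal sketch in C.
It assumes the rule is that a branch must neither cross a boundary nor end
exactly on one.  The function names are hypothetical and are not taken from
the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: return true if an instruction occupying
   [addr, addr + size) crosses a 2**power-byte boundary or ends
   exactly on one.  */
static bool
branch_needs_alignment (uint64_t addr, unsigned int size,
			unsigned int power)
{
  uint64_t boundary = (uint64_t) 1 << power;
  uint64_t offset = addr & (boundary - 1);

  return offset + size >= boundary;
}

/* Hypothetical helper: number of padding bytes that would move the
   instruction to the start of the next boundary, or 0 if it already
   sits within the boundary.  */
static unsigned int
branch_padding_size (uint64_t addr, unsigned int size,
		     unsigned int power)
{
  uint64_t boundary = (uint64_t) 1 << power;
  uint64_t offset = addr & (boundary - 1);

  if (offset + size >= boundary)
    return (unsigned int) (boundary - offset);
  return 0;
}
```

With power 5 (32 bytes), a 2-byte jump at offset 0x1e ends on the boundary
and would get 2 bytes of padding, while the same jump at offset 0x1d is left
alone.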

md_generic_table_relax_frag, which allows a backend to extend relax_frag,
is added to implement this new feature.


H.J. Lu (4):
  gas: Add md_generic_table_relax_frag
  i386: Align branches within a fixed boundary
  i386: Add -mbranches-within-32B-boundaries
  i386: Add tests for -malign-branch-boundary and -malign-branch

 gas/config/tc-i386.c                          | 1059 ++++++++++++++++-
 gas/config/tc-i386.h                          |   31 +
 gas/doc/c-i386.texi                           |   37 +
 gas/doc/internals.texi                        |    5 +
 gas/testsuite/gas/i386/align-branch-1.s       |   72 ++
 gas/testsuite/gas/i386/align-branch-1a.d      |   77 ++
 gas/testsuite/gas/i386/align-branch-1b.d      |   77 ++
 gas/testsuite/gas/i386/align-branch-1c.d      |   77 ++
 gas/testsuite/gas/i386/align-branch-1d.d      |   76 ++
 gas/testsuite/gas/i386/align-branch-1e.d      |   77 ++
 gas/testsuite/gas/i386/align-branch-1f.d      |   77 ++
 gas/testsuite/gas/i386/align-branch-1g.d      |   77 ++
 gas/testsuite/gas/i386/align-branch-1h.d      |   76 ++
 gas/testsuite/gas/i386/align-branch-1i.d      |   80 ++
 gas/testsuite/gas/i386/align-branch-2.s       |   49 +
 gas/testsuite/gas/i386/align-branch-2a.d      |   55 +
 gas/testsuite/gas/i386/align-branch-2b.d      |   55 +
 gas/testsuite/gas/i386/align-branch-2c.d      |   55 +
 gas/testsuite/gas/i386/align-branch-3.d       |   33 +
 gas/testsuite/gas/i386/align-branch-3.s       |   28 +
 gas/testsuite/gas/i386/align-branch-4.s       |   30 +
 gas/testsuite/gas/i386/align-branch-4a.d      |   36 +
 gas/testsuite/gas/i386/align-branch-4b.d      |   36 +
 gas/testsuite/gas/i386/align-branch-5.d       |   36 +
 gas/testsuite/gas/i386/align-branch-5.s       |   32 +
 gas/testsuite/gas/i386/align-branch-6.d       |   22 +
 gas/testsuite/gas/i386/align-branch-6.e       |    2 +
 gas/testsuite/gas/i386/align-branch-6.s       |    7 +
 gas/testsuite/gas/i386/align-branch-7.d       |   18 +
 gas/testsuite/gas/i386/align-branch-7.s       |   14 +
 gas/testsuite/gas/i386/align-branch-8.d       |   18 +
 gas/testsuite/gas/i386/align-branch-8.s       |   14 +
 gas/testsuite/gas/i386/i386.exp               |   45 +
 .../gas/i386/x86-64-align-branch-1.s          |   70 ++
 .../gas/i386/x86-64-align-branch-1a.d         |   75 ++
 .../gas/i386/x86-64-align-branch-1b.d         |   75 ++
 .../gas/i386/x86-64-align-branch-1c.d         |   75 ++
 .../gas/i386/x86-64-align-branch-1d.d         |   74 ++
 .../gas/i386/x86-64-align-branch-1e.d         |   74 ++
 .../gas/i386/x86-64-align-branch-1f.d         |   75 ++
 .../gas/i386/x86-64-align-branch-1g.d         |   75 ++
 .../gas/i386/x86-64-align-branch-1h.d         |   74 ++
 .../gas/i386/x86-64-align-branch-1i.d         |   78 ++
 .../gas/i386/x86-64-align-branch-2.s          |   44 +
 .../gas/i386/x86-64-align-branch-2a.d         |   50 +
 .../gas/i386/x86-64-align-branch-2b.d         |   50 +
 .../gas/i386/x86-64-align-branch-2c.d         |   50 +
 .../gas/i386/x86-64-align-branch-3.d          |   32 +
 .../gas/i386/x86-64-align-branch-3.s          |   27 +
 .../gas/i386/x86-64-align-branch-4.s          |   27 +
 .../gas/i386/x86-64-align-branch-4a.d         |   33 +
 .../gas/i386/x86-64-align-branch-4b.d         |   33 +
 .../gas/i386/x86-64-align-branch-5.d          |   37 +
 .../gas/i386/x86-64-align-branch-6.d          |   19 +
 .../gas/i386/x86-64-align-branch-7.d          |   18 +
 .../gas/i386/x86-64-align-branch-7.s          |   14 +
 .../gas/i386/x86-64-align-branch-8.d          |   18 +
 .../gas/i386/x86-64-align-branch-8.s          |   14 +
 gas/write.c                                   |    7 +-
 ld/testsuite/ld-i386/align-branch-1.d         |   25 +
 ld/testsuite/ld-i386/align-branch-1.s         |   19 +
 ld/testsuite/ld-i386/i386.exp                 |    1 +
 ld/testsuite/ld-x86-64/align-branch-1.d       |   21 +
 ld/testsuite/ld-x86-64/align-branch-1.s       |   17 +
 ld/testsuite/ld-x86-64/x86-64.exp             |    1 +
 65 files changed, 3781 insertions(+), 4 deletions(-)
 create mode 100644 gas/testsuite/gas/i386/align-branch-1.s
 create mode 100644 gas/testsuite/gas/i386/align-branch-1a.d
 create mode 100644 gas/testsuite/gas/i386/align-branch-1b.d
 create mode 100644 gas/testsuite/gas/i386/align-branch-1c.d
 create mode 100644 gas/testsuite/gas/i386/align-branch-1d.d
 create mode 100644 gas/testsuite/gas/i386/align-branch-1e.d
 create mode 100644 gas/testsuite/gas/i386/align-branch-1f.d
 create mode 100644 gas/testsuite/gas/i386/align-branch-1g.d
 create mode 100644 gas/testsuite/gas/i386/align-branch-1h.d
 create mode 100644 gas/testsuite/gas/i386/align-branch-1i.d
 create mode 100644 gas/testsuite/gas/i386/align-branch-2.s
 create mode 100644 gas/testsuite/gas/i386/align-branch-2a.d
 create mode 100644 gas/testsuite/gas/i386/align-branch-2b.d
 create mode 100644 gas/testsuite/gas/i386/align-branch-2c.d
 create mode 100644 gas/testsuite/gas/i386/align-branch-3.d
 create mode 100644 gas/testsuite/gas/i386/align-branch-3.s
 create mode 100644 gas/testsuite/gas/i386/align-branch-4.s
 create mode 100644 gas/testsuite/gas/i386/align-branch-4a.d
 create mode 100644 gas/testsuite/gas/i386/align-branch-4b.d
 create mode 100644 gas/testsuite/gas/i386/align-branch-5.d
 create mode 100644 gas/testsuite/gas/i386/align-branch-5.s
 create mode 100644 gas/testsuite/gas/i386/align-branch-6.d
 create mode 100644 gas/testsuite/gas/i386/align-branch-6.e
 create mode 100644 gas/testsuite/gas/i386/align-branch-6.s
 create mode 100644 gas/testsuite/gas/i386/align-branch-7.d
 create mode 100644 gas/testsuite/gas/i386/align-branch-7.s
 create mode 100644 gas/testsuite/gas/i386/align-branch-8.d
 create mode 100644 gas/testsuite/gas/i386/align-branch-8.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-align-branch-1.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-align-branch-1a.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-align-branch-1b.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-align-branch-1c.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-align-branch-1d.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-align-branch-1e.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-align-branch-1f.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-align-branch-1g.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-align-branch-1h.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-align-branch-1i.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-align-branch-2.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-align-branch-2a.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-align-branch-2b.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-align-branch-2c.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-align-branch-3.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-align-branch-3.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-align-branch-4.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-align-branch-4a.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-align-branch-4b.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-align-branch-5.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-align-branch-6.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-align-branch-7.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-align-branch-7.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-align-branch-8.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-align-branch-8.s
 create mode 100644 ld/testsuite/ld-i386/align-branch-1.d
 create mode 100644 ld/testsuite/ld-i386/align-branch-1.s
 create mode 100644 ld/testsuite/ld-x86-64/align-branch-1.d
 create mode 100644 ld/testsuite/ld-x86-64/align-branch-1.s

--
2.21.0

V3 [PATCH 1/4] gas: Add md_generic_table_relax_frag

H.J. Lu-30
Add md_generic_table_relax_frag for TC_GENERIC_RELAX_TABLE targets so
that a backend can extend relax_frag beyond TC_GENERIC_RELAX_TABLE.
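
The override mechanism is the usual "default unless the backend defines it"
macro pattern.  A minimal self-contained sketch follows.  The struct and the
two-argument signature are simplified stand-ins, not the real gas
definitions, and generic_relax/relax_one are hypothetical names.

```c
/* Hypothetical stand-in for the gas frag type.  */
struct frag
{
  long size;
};

/* Generic implementation, playing the role of relax_frag.  */
static long
generic_relax (struct frag *f, long stretch)
{
  (void) f;
  return stretch;
}

/* A backend header would define the hook before this point, e.g.
   "#define md_generic_table_relax_frag my_backend_relax".  Without
   such a definition the generic implementation is used.  */
#ifndef md_generic_table_relax_frag
#define md_generic_table_relax_frag generic_relax
#endif

/* The caller always goes through the hook name, so it does not need
   to know whether a backend replaced the default.  */
static long
relax_one (struct frag *f, long stretch)
{
  return md_generic_table_relax_frag (f, stretch);
}
```

Targets that do not define the hook see no change in behavior, since the
hook then expands to the generic function.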

        * write.c (relax_segment): Call md_generic_table_relax_frag
        instead of relax_frag if defined.
---
 gas/doc/internals.texi | 5 +++++
 gas/write.c            | 7 ++++++-
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/gas/doc/internals.texi b/gas/doc/internals.texi
index a50880d635..cb86b5b362 100644
--- a/gas/doc/internals.texi
+++ b/gas/doc/internals.texi
@@ -1210,6 +1210,11 @@ If you do not define @code{md_relax_frag}, you may define
 machine independent code knows how to use such a table to relax PC relative
 references.  See @file{tc-m68k.c} for an example.  @xref{Relaxation}.
 
+@item md_generic_table_relax_frag
+@cindex md_generic_table_relax_frag
+If defined, it is a C statement that is invoked, instead of
+the default implementation, to scan @code{TC_GENERIC_RELAX_TABLE}.
+
 @item md_prepare_relax_scan
 @cindex md_prepare_relax_scan
 If defined, it is a C statement that is invoked prior to scanning
diff --git a/gas/write.c b/gas/write.c
index d5da41850c..d2bdb7acdf 100644
--- a/gas/write.c
+++ b/gas/write.c
@@ -2481,6 +2481,10 @@ write_object_file (void)
 }
 
 #ifdef TC_GENERIC_RELAX_TABLE
+#ifndef md_generic_table_relax_frag
+#define md_generic_table_relax_frag relax_frag
+#endif
+
 /* Relax a fragment by scanning TC_GENERIC_RELAX_TABLE.  */
 
 long
@@ -3031,7 +3035,8 @@ relax_segment (struct frag *segment_frag_root, segT segment, int pass)
 #ifdef TC_GENERIC_RELAX_TABLE
  /* The default way to relax a frag is to look through
    TC_GENERIC_RELAX_TABLE.  */
- growth = relax_frag (segment, fragP, stretch);
+ growth = md_generic_table_relax_frag (segment, fragP,
+      stretch);
 #endif /* TC_GENERIC_RELAX_TABLE  */
 #endif
  break;
--
2.21.0

V3 [PATCH 2/4] i386: Align branches within a fixed boundary

H.J. Lu-30
Add 3 command-line options to align branches within a fixed boundary
with segment prefixes or NOPs:

1. -malign-branch-boundary=NUM aligns branches within a NUM-byte boundary.
2. -malign-branch=TYPE[+TYPE...] specifies the types of branches to align.
The supported branch types are:
  a. Conditional jump.
  b. Fused conditional jump.
  c. Unconditional jump.
  d. Call.
  e. Ret.
  f. Indirect jump and call.
3. -malign-branch-prefix-size=NUM aligns branches with at most NUM segment
prefixes per instruction.
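
As an illustration of the TYPE[+TYPE...] syntax, here is a minimal sketch of
turning such an argument into a bit mask.  The token names (jcc, fused, jmp,
call, ret, indirect) are an assumption modeled on the align_branch_* enum
names in the patch, and the bit values and function name are hypothetical.

```c
#include <string.h>

/* Hypothetical bit values; the patch uses its own encoding.  */
enum branch_bit
{
  BRANCH_JCC = 1 << 0,
  BRANCH_FUSED = 1 << 1,
  BRANCH_JMP = 1 << 2,
  BRANCH_CALL = 1 << 3,
  BRANCH_RET = 1 << 4,
  BRANCH_INDIRECT = 1 << 5
};

/* Parse "TYPE[+TYPE...]" into a bit mask.  Return -1 if any type is
   empty or unknown.  */
static int
parse_branch_types (const char *arg)
{
  static const struct
  {
    const char *name;
    int bit;
  } table[] = {
    { "jcc", BRANCH_JCC },
    { "fused", BRANCH_FUSED },
    { "jmp", BRANCH_JMP },
    { "call", BRANCH_CALL },
    { "ret", BRANCH_RET },
    { "indirect", BRANCH_INDIRECT }
  };
  int mask = 0;

  for (;;)
    {
      /* Length of the current token, up to the next '+' or the end.  */
      size_t len = strcspn (arg, "+");
      size_t k;
      int bit = 0;

      for (k = 0; k < sizeof table / sizeof table[0]; k++)
	if (strlen (table[k].name) == len
	    && strncmp (arg, table[k].name, len) == 0)
	  bit = table[k].bit;

      if (bit == 0)
	return -1;
      mask |= bit;

      if (arg[len] == '\0')
	return mask;
      arg += len + 1;
    }
}
```

Under these assumptions, "jcc+fused+jmp" yields the three low bits, which
corresponds to the default set of branch types in the patch.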

3 new rs_machine_dependent frag types are added:

1. BRANCH_PADDING.  A variable-size frag that inserts NOPs before a branch.
2. BRANCH_PREFIX.  A variable-size frag that inserts segment prefixes into
an instruction.  The choices of prefixes are:
   a. Use the existing segment prefix if there is one.
   b. Use CS segment prefix in 64-bit mode.
   c. In 32-bit mode, use SS segment prefix with ESP/EBP base register
   and use DS segment prefix without ESP/EBP base register.
3. FUSED_JCC_PADDING.  A variable-size frag that inserts NOPs before a fused
conditional jump.
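
The prefix choice for BRANCH_PREFIX can be sketched as a small pure
function.  This is a simplified illustration with hypothetical names.  The
prefix bytes are the standard x86 segment override encodings, and register
numbers 4 and 5 are ESP and EBP.

```c
#include <stdbool.h>

/* Standard x86 segment override prefix bytes.  */
#define CS_PREFIX 0x2e
#define SS_PREFIX 0x36
#define DS_PREFIX 0x3e

/* Hypothetical helper: pick the segment prefix byte used as padding.
   EXISTING_SEG_PREFIX is 0 if the instruction has no segment
   prefix.  */
static unsigned char
choose_padding_prefix (unsigned char existing_seg_prefix, bool is_64bit,
		       bool has_base_reg, unsigned int base_reg_num)
{
  /* a. Repeat the segment prefix that is already there.  */
  if (existing_seg_prefix)
    return existing_seg_prefix;

  /* b. Use CS in 64-bit mode.  */
  if (is_64bit)
    return CS_PREFIX;

  /* c. In 32-bit mode, ESP/EBP-based addressing defaults to SS, so an
     SS prefix matches the default segment.  Otherwise use DS.  */
  if (has_base_reg && (base_reg_num == 4 || base_reg_num == 5))
    return SS_PREFIX;
  return DS_PREFIX;
}
```

The intent in each case appears to be a prefix that matches the segment the
instruction would use anyway, so that adding it changes only the
instruction's length.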

The new rs_machine_dependent frags aren't inserted if the previous item
is a prefix or a constant directive, which may be used to hardcode an
instruction, since there is no clear instruction boundary.  Segment
prefixes and NOP padding are disabled before relaxable TLS relocations
and tls_get_addr calls to keep TLS instruction sequence unchanged.

md_estimate_size_before_relax() and i386_generic_table_relax_frag() are
used to handle BRANCH_PADDING, BRANCH_PREFIX and FUSED_JCC_PADDING frags.
i386_generic_table_relax_frag() grows or shrinks the sizes of segment
prefixes and NOPs to align the next branch frag:

1. First try to add segment prefixes to instructions before a branch.
2. If there is not enough room to add segment prefixes, NOPs are
inserted before the branch.
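
The two steps above can be sketched as a single pass over the instructions
before a branch.  This is a simplified model, not the relaxation code in the
patch: it assumes the prefix room of each instruction is already known and
that whatever the prefixes cannot absorb is left for NOP padding.  The
function name is hypothetical.

```c
/* Hypothetical helper: spread NEEDED padding bytes over the prefix
   room of the N instructions before a branch.  ROOM[k] is the number
   of prefix bytes instruction k can still take; PREFIXES[k] receives
   the number actually assigned.  Return the bytes left over, which
   would have to be covered by NOPs before the branch.  */
static unsigned int
distribute_padding (unsigned int needed, const unsigned int *room,
		    unsigned int *prefixes, unsigned int n)
{
  unsigned int k;

  for (k = 0; k < n; k++)
    {
      unsigned int take = needed < room[k] ? needed : room[k];

      prefixes[k] = take;
      needed -= take;
    }

  return needed;
}
```

For example, with room {2, 1, 3} and 4 bytes needed, the prefixes absorb
everything as {2, 1, 1} and no NOP is required.  With 8 bytes needed, 2
bytes remain for NOP padding.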

        * config/tc-i386.c (_i386_insn): Add has_gotpc_tls_reloc.
        (tls_get_addr): New.
        (last_insn): New.
        (align_branch_power): New.
        (align_branch_kind): New.
        (align_branch_bit): New.
        (align_branch): New.
        (MAX_FUSED_JCC_PADDING_SIZE): New.
        (align_branch_prefix_size): New.
        (BRANCH_PADDING): New.
        (BRANCH_PREFIX): New.
        (FUSED_JCC_PADDING): New.
        (i386_generate_nops): Support BRANCH_PADDING and FUSED_JCC_PADDING.
        (md_begin): Abort if align_branch_prefix_size >
        MAX_FUSED_JCC_PADDING_SIZE.
        (md_assemble): Set last_insn.
        (maybe_fused_with_jcc_p): New.
        (add_fused_jcc_padding_frag_p): New.
        (add_branch_prefix_frag_p): New.
        (add_branch_padding_frag_p): New.
        (output_insn): Generate a BRANCH_PADDING, FUSED_JCC_PADDING or
        BRANCH_PREFIX frag and terminate each frag to align branches.
        (output_disp): Set i.has_gotpc_tls_reloc to TRUE for GOTPC and
        relaxable TLS relocations.
        (output_imm): Likewise.
        (i386_next_non_empty_frag): New.
        (i386_next_jcc_frag): New.
        (i386_classify_machine_dependent_frag): New.
        (i386_branch_padding_size): New.
        (i386_generic_table_relax_frag): New.
        (md_estimate_size_before_relax): Handle BRANCH_PADDING,
        BRANCH_PREFIX and FUSED_JCC_PADDING frags.
        (md_convert_frag): Handle BRANCH_PADDING, BRANCH_PREFIX and
        FUSED_JCC_PADDING frags.
        (OPTION_MALIGN_BRANCH_BOUNDARY): New.
        (OPTION_MALIGN_BRANCH_PREFIX_SIZE): New.
        (OPTION_MALIGN_BRANCH): New.
        (md_longopts): Add -malign-branch-boundary=,
        -malign-branch-prefix-size= and -malign-branch=.
        (md_parse_option): Handle -malign-branch-boundary=,
        -malign-branch-prefix-size= and -malign-branch=.
        (md_show_usage): Display -malign-branch-boundary=,
        -malign-branch-prefix-size= and -malign-branch=.
        (i386_target_format): Set tls_get_addr.
        (i386_cons_align): New.
        * config/tc-i386.h (i386_cons_align): New.
        (md_cons_align): New.
        (i386_generic_table_relax_frag): New.
        (md_generic_table_relax_frag): New.
        (i386_tc_frag_data): Add u, padding_address, length,
        max_prefix_length, prefix_length, default_prefix, cmp_size,
        classified and branch_type.
        (TC_FRAG_INIT): Initialize u, padding_address, length,
        max_prefix_length, prefix_length, default_prefix, cmp_size,
        classified and branch_type.
        * doc/c-i386.texi: Document -malign-branch-boundary=,
        -malign-branch= and -malign-branch-prefix-size=.
---
 gas/config/tc-i386.c | 1046 +++++++++++++++++++++++++++++++++++++++++-
 gas/config/tc-i386.h |   31 ++
 gas/doc/c-i386.texi  |   26 ++
 3 files changed, 1100 insertions(+), 3 deletions(-)

diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c
index b62af34268..0ab6651f24 100644
--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -368,6 +368,9 @@ struct _i386_insn
     /* Has ZMM register operands.  */
     bfd_boolean has_regzmm;
 
+    /* Has GOTPC or TLS relocation.  */
+    bfd_boolean has_gotpc_tls_reloc;
+
     /* RM and SIB are the modrm byte and the sib byte where the
        addressing modes of this insn are encoded.  */
     modrm_byte rm;
@@ -562,6 +565,8 @@ static enum flag_code flag_code;
 static unsigned int object_64bit;
 static unsigned int disallow_64bit_reloc;
 static int use_rela_relocations = 0;
+/* __tls_get_addr/___tls_get_addr symbol for TLS.  */
+static const char *tls_get_addr;
 
 #if ((defined (OBJ_MAYBE_COFF) && defined (OBJ_MAYBE_AOUT)) \
      || defined (OBJ_ELF) || defined (OBJ_MAYBE_ELF) \
@@ -622,6 +627,21 @@ static int omit_lock_prefix = 0;
    "lock addl $0, (%{re}sp)".  */
 static int avoid_fence = 0;
 
+/* Type of the previous instruction.  */
+static struct
+  {
+    segT seg;
+    const char *file;
+    const char *name;
+    unsigned int line;
+    enum last_insn_kind
+      {
+ last_insn_other = 0,
+ last_insn_directive,
+ last_insn_prefix
+      } kind;
+  } last_insn;
+
 /* 1 if the assembler should generate relax relocations.  */
 
 static int generate_relax_relocations
@@ -635,6 +655,44 @@ static enum check_kind
   }
 sse_check, operand_check = check_warning;
 
+/* Non-zero if branches should be aligned within power of 2 boundary.  */
+static int align_branch_power = 0;
+
+/* Types of branches to align.  */
+enum align_branch_kind
+  {
+    align_branch_none = 0,
+    align_branch_jcc = 1,
+    align_branch_fused = 2,
+    align_branch_jmp = 3,
+    align_branch_call = 4,
+    align_branch_indirect = 5,
+    align_branch_ret = 6
+  };
+
+/* Type bits of branches to align.  */
+enum align_branch_bit
+  {
+    align_branch_jcc_bit = 1 << align_branch_jcc,
+    align_branch_fused_bit = 1 << align_branch_fused,
+    align_branch_jmp_bit = 1 << align_branch_jmp,
+    align_branch_call_bit = 1 << align_branch_call,
+    align_branch_indirect_bit = 1 << align_branch_indirect,
+    align_branch_ret_bit = 1 << align_branch_ret
+  };
+
+static unsigned int align_branch = (align_branch_jcc_bit
+    | align_branch_fused_bit
+    | align_branch_jmp_bit);
+
+/* The maximum padding size for fused jcc.  CMP like instruction can
+   be 9 bytes and jcc can be 6 bytes.  Leave room just in case for
+   prefixes.   */
+#define MAX_FUSED_JCC_PADDING_SIZE 20
+
+/* The maximum number of prefixes added for an instruction.  */
+static unsigned int align_branch_prefix_size = 5;
+
 /* Optimization:
    1. Clear the REX_W bit with register operand if possible.
    2. Above plus use 128bit vector instruction to clear the full vector
@@ -738,12 +796,19 @@ int x86_cie_data_alignment;
 /* Interface to relax_segment.
    There are 3 major relax states for 386 jump insns because the
    different types of jumps add different sizes to frags when we're
-   figuring out what sort of jump to choose to reach a given label.  */
+   figuring out what sort of jump to choose to reach a given label.
+
+   BRANCH_PADDING, BRANCH_PREFIX and FUSED_JCC_PADDING are used to align
+   branches which are handled by md_estimate_size_before_relax() and
+   i386_generic_table_relax_frag().  */
 
 /* Types.  */
 #define UNCOND_JUMP 0
 #define COND_JUMP 1
 #define COND_JUMP86 2
+#define BRANCH_PADDING 3
+#define BRANCH_PREFIX 4
+#define FUSED_JCC_PADDING 5
 
 /* Sizes.  */
 #define CODE16 1
@@ -1384,6 +1449,12 @@ i386_generate_nops (fragS *fragP, char *where, offsetT count, int limit)
     case rs_fill_nop:
     case rs_align_code:
       break;
+    case rs_machine_dependent:
+      /* Allow NOP padding for jumps and calls.  */
+      if (TYPE_FROM_RELAX_STATE (fragP->fr_subtype) == BRANCH_PADDING
+  || TYPE_FROM_RELAX_STATE (fragP->fr_subtype) == FUSED_JCC_PADDING)
+ break;
+      /* Fall through.  */
     default:
       return;
     }
@@ -1528,7 +1599,7 @@ i386_generate_nops (fragS *fragP, char *where, offsetT count, int limit)
   return;
  }
     }
-  else
+  else if (fragP->fr_type != rs_machine_dependent)
     fragP->fr_var = count;
 
   if ((count / max_single_nop_size) > max_number_of_nops)
@@ -3011,6 +3082,11 @@ md_begin (void)
       x86_dwarf2_return_column = 8;
       x86_cie_data_alignment = -4;
     }
+
+  /* NB: FUSED_JCC_PADDING frag must have sufficient room so that it
+     can be turned into BRANCH_PREFIX frag.  */
+  if (align_branch_prefix_size > MAX_FUSED_JCC_PADDING_SIZE)
+    abort ();
 }
 
 void
@@ -4536,6 +4612,17 @@ md_assemble (char *line)
 
   /* We are ready to output the insn.  */
   output_insn ();
+
+  last_insn.seg = now_seg;
+
+  if (i.tm.opcode_modifier.isprefix)
+    {
+      last_insn.kind = last_insn_prefix;
+      last_insn.name = i.tm.name;
+      last_insn.file = as_where (&last_insn.line);
+    }
+  else
+    last_insn.kind = last_insn_other;
 }
 
 static char *
@@ -8193,11 +8280,206 @@ encoding_length (const fragS *start_frag, offsetT start_off,
   return len - start_off + (frag_now_ptr - frag_now->fr_literal);
 }
 
+/* Return 1 for test, and, cmp, add, sub, inc and dec which may
+   be macro-fused with conditional jumps.  */
+
+static int
+maybe_fused_with_jcc_p (void)
+{
+  /* No RIP address.  */
+  if (i.base_reg && i.base_reg->reg_num == RegIP)
+    return 0;
+
+  /* No VEX/EVEX encoding.  */
+  if (is_any_vex_encoding (&i.tm))
+    return 0;
+
+  /* and, add, sub with destination register.  */
+  if ((i.tm.base_opcode >= 0x20 && i.tm.base_opcode <= 0x25)
+      || i.tm.base_opcode <= 5
+      || (i.tm.base_opcode >= 0x28 && i.tm.base_opcode <= 0x2d)
+      || ((i.tm.base_opcode | 3) == 0x83
+  && ((i.tm.extension_opcode | 1) == 0x5
+      || i.tm.extension_opcode == 0x0)))
+    return (i.types[1].bitfield.class == Reg
+    || i.types[1].bitfield.instance == Accum);
+
+  /* test, cmp with any register.  */
+  if ((i.tm.base_opcode | 1) == 0x85
+      || (i.tm.base_opcode | 1) == 0xa9
+      || ((i.tm.base_opcode | 1) == 0xf7
+  && i.tm.extension_opcode == 0)
+      || (i.tm.base_opcode >= 0x38 && i.tm.base_opcode <= 0x3d)
+      || ((i.tm.base_opcode | 3) == 0x83
+  && (i.tm.extension_opcode == 0x7)))
+    return (i.types[0].bitfield.class == Reg
+    || i.types[0].bitfield.instance == Accum
+    || i.types[1].bitfield.class == Reg
+    || i.types[1].bitfield.instance == Accum);
+
+  /* inc, dec with any register.   */
+  if ((i.tm.cpu_flags.bitfield.cpuno64
+       && (i.tm.base_opcode | 0xf) == 0x4f)
+      || ((i.tm.base_opcode | 1) == 0xff
+  && (i.tm.extension_opcode | 1) == 0x1))
+    return (i.types[0].bitfield.class == Reg
+    || i.types[0].bitfield.instance == Accum);
+
+  return 0;
+}
+
+/* Return 1 if a FUSED_JCC_PADDING frag should be generated.  */
+
+static int
+add_fused_jcc_padding_frag_p (void)
+{
+  /* NB: Don't work with COND_JUMP86 without i386.  */
+  if (!align_branch_power
+      || now_seg == absolute_section
+      || !cpu_arch_flags.bitfield.cpui386
+      || !(align_branch & align_branch_fused_bit))
+    return 0;
+
+  if (maybe_fused_with_jcc_p ())
+    {
+      if (last_insn.kind == last_insn_other
+  || last_insn.seg != now_seg)
+ return 1;
+      if (flag_debug)
+ as_warn_where (last_insn.file, last_insn.line,
+       _("`%s` skips -malign-branch-boundary on `%s`"),
+       last_insn.name, i.tm.name);
+    }
+
+  return 0;
+}
+
+/* Return 1 if a BRANCH_PREFIX frag should be generated.  */
+
+static int
+add_branch_prefix_frag_p (void)
+{
+  /* NB: Don't work with COND_JUMP86 without i386.  Don't add prefix
+     to PadLock instructions since they include prefixes in opcode.  */
+  if (!align_branch_power
+      || !align_branch_prefix_size
+      || now_seg == absolute_section
+      || i.tm.cpu_flags.bitfield.cpupadlock
+      || !cpu_arch_flags.bitfield.cpui386)
+    return 0;
+
+  /* Don't add prefix if it is a prefix or there is no operand in case
+     that segment prefix is special.  */
+  if (!i.operands || i.tm.opcode_modifier.isprefix)
+    return 0;
+
+  if (last_insn.kind == last_insn_other
+      || last_insn.seg != now_seg)
+    return 1;
+
+  if (flag_debug)
+    as_warn_where (last_insn.file, last_insn.line,
+   _("`%s` skips -malign-branch-boundary on `%s`"),
+   last_insn.name, i.tm.name);
+
+  return 0;
+}
+
+/* Return 1 if a BRANCH_PADDING frag should be generated.  */
+
+static int
+add_branch_padding_frag_p (enum align_branch_kind *branch_p)
+{
+  int add_padding;
+
+  /* NB: Don't work with COND_JUMP86 without i386.  */
+  if (!align_branch_power
+      || now_seg == absolute_section
+      || !cpu_arch_flags.bitfield.cpui386)
+    return 0;
+
+  add_padding = 0;
+
+  /* Check for jcc and direct jmp.  */
+  if (i.tm.opcode_modifier.jump == JUMP)
+    {
+      if (i.tm.base_opcode == JUMP_PC_RELATIVE)
+ {
+  *branch_p = align_branch_jmp;
+  add_padding = align_branch & align_branch_jmp_bit;
+ }
+      else
+ {
+  *branch_p = align_branch_jcc;
+  if ((align_branch & align_branch_jcc_bit))
+    add_padding = 1;
+ }
+    }
+  else if (is_any_vex_encoding (&i.tm))
+    return 0;
+  else if ((i.tm.base_opcode | 1) == 0xc3)
+    {
+      /* Near ret.  */
+      *branch_p = align_branch_ret;
+      if ((align_branch & align_branch_ret_bit))
+ add_padding = 1;
+    }
+  else
+    {
+      /* Check for indirect jmp, direct and indirect calls.  */
+      if (i.tm.base_opcode == 0xe8)
+ {
+  /* Direct call.  */
+  *branch_p = align_branch_call;
+  if ((align_branch & align_branch_call_bit))
+    add_padding = 1;
+ }
+      else if (i.tm.base_opcode == 0xff
+       && (i.tm.extension_opcode == 2
+   || i.tm.extension_opcode == 4))
+ {
+  /* Indirect call and jmp.  */
+  *branch_p = align_branch_indirect;
+  if ((align_branch & align_branch_indirect_bit))
+    add_padding = 1;
+ }
+
+      if (add_padding
+  && i.disp_operands
+  && tls_get_addr
+  && (i.op[0].disps->X_op == O_symbol
+      || (i.op[0].disps->X_op == O_subtract
+  && i.op[0].disps->X_op_symbol == GOT_symbol)))
+ {
+  symbolS *s = i.op[0].disps->X_add_symbol;
+  /* No padding to call to global or undefined tls_get_addr.  */
+  if ((S_IS_EXTERNAL (s) || !S_IS_DEFINED (s))
+      && strcmp (S_GET_NAME (s), tls_get_addr) == 0)
+    return 0;
+ }
+    }
+
+  if (add_padding
+      && last_insn.kind != last_insn_other
+      && last_insn.seg == now_seg)
+    {
+      if (flag_debug)
+ as_warn_where (last_insn.file, last_insn.line,
+       _("`%s` skips -malign-branch-boundary on `%s`"),
+       last_insn.name, i.tm.name);
+      return 0;
+    }
+
+  return add_padding;
+}
+
 static void
 output_insn (void)
 {
   fragS *insn_start_frag;
   offsetT insn_start_off;
+  fragS *fragP = NULL;
+  enum align_branch_kind branch = align_branch_none;
 
 #if defined (OBJ_ELF) || defined (OBJ_MAYBE_ELF)
   if (IS_ELF && x86_used_note)
@@ -8288,6 +8570,31 @@ output_insn (void)
   insn_start_frag = frag_now;
   insn_start_off = frag_now_fix ();
 
+  if (add_branch_padding_frag_p (&branch))
+    {
+      char *p;
+      /* Branch can be 8 bytes.  Leave some room for prefixes.  */
+      unsigned int max_branch_padding_size = 14;
+
+      /* Align section to boundary.  */
+      record_alignment (now_seg, align_branch_power);
+
+      /* Make room for padding.  */
+      frag_grow (max_branch_padding_size);
+
+      /* Start of the padding.  */
+      p = frag_more (0);
+
+      fragP = frag_now;
+
+      frag_var (rs_machine_dependent, max_branch_padding_size, 0,
+ ENCODE_RELAX_STATE (BRANCH_PADDING, 0),
+ NULL, 0, p);
+
+      fragP->tc_frag_data.branch_type = branch;
+      fragP->tc_frag_data.max_bytes = max_branch_padding_size;
+    }
+
   /* Output jumps.  */
   if (i.tm.opcode_modifier.jump == JUMP)
     output_branch ();
@@ -8326,6 +8633,41 @@ output_insn (void)
   i.prefix[LOCK_PREFIX] = 0;
  }
 
+      if (branch)
+ /* Skip if this is a branch.  */
+ ;
+      else if (add_fused_jcc_padding_frag_p ())
+ {
+  /* Make room for padding.  */
+  frag_grow (MAX_FUSED_JCC_PADDING_SIZE);
+  p = frag_more (0);
+
+  fragP = frag_now;
+
+  frag_var (rs_machine_dependent, MAX_FUSED_JCC_PADDING_SIZE, 0,
+    ENCODE_RELAX_STATE (FUSED_JCC_PADDING, 0),
+    NULL, 0, p);
+
+  fragP->tc_frag_data.branch_type = align_branch_fused;
+  fragP->tc_frag_data.max_bytes = MAX_FUSED_JCC_PADDING_SIZE;
+ }
+      else if (add_branch_prefix_frag_p ())
+ {
+  unsigned int max_prefix_size = align_branch_prefix_size;
+
+  /* Make room for padding.  */
+  frag_grow (max_prefix_size);
+  p = frag_more (0);
+
+  fragP = frag_now;
+
+  frag_var (rs_machine_dependent, max_prefix_size, 0,
+    ENCODE_RELAX_STATE (BRANCH_PREFIX, 0),
+    NULL, 0, p);
+
+  fragP->tc_frag_data.max_bytes = max_prefix_size;
+ }
+
       /* Since the VEX/EVEX prefix contains the implicit prefix, we
  don't need the explicit prefix.  */
       if (!i.tm.opcode_modifier.vex && !i.tm.opcode_modifier.evex)
@@ -8473,9 +8815,105 @@ output_insn (void)
   if (j > 15)
     as_warn (_("instruction length of %u bytes exceeds the limit of 15"),
      j);
+  else if (fragP)
+    {
+      /* NB: Don't add prefix with GOTPC relocation since
+ output_disp() above depends on the fixed encoding
+ length.  Can't add prefix with TLS relocation since
+ it breaks TLS linker optimization.  */
+      unsigned int max = i.has_gotpc_tls_reloc ? 0 : 15 - j;
+      /* Prefix count on the current instruction.  */
+      unsigned int count = i.vex.length;
+      unsigned int k;
+      for (k = 0; k < ARRAY_SIZE (i.prefix); k++)
+ /* REX byte is encoded in VEX/EVEX prefix.  */
+ if (i.prefix[k] && (k != REX_PREFIX || !i.vex.length))
+  count++;
+
+      /* Count SSE prefix.  */
+      if (!i.vex.length)
+ switch (i.tm.opcode_length)
+  {
+  case 3:
+    if (((i.tm.base_opcode >> 16) & 0xff) == 0xf)
+      {
+ count++;
+ switch ((i.tm.base_opcode >> 8) & 0xff)
+  {
+  case 0x38:
+  case 0x3a:
+    count++;
+    break;
+  default:
+    break;
+  }
+      }
+    break;
+  case 2:
+    if (((i.tm.base_opcode >> 8) & 0xff) == 0xf)
+      count++;
+    break;
+  case 1:
+    break;
+  default:
+    abort ();
+  }
+
+      if (TYPE_FROM_RELAX_STATE (fragP->fr_subtype)
+  == BRANCH_PREFIX)
+ {
+  /* Set the maximum prefix size in BRANCH_PREFIX
+     frag.  */
+  if (fragP->tc_frag_data.max_bytes > max)
+    fragP->tc_frag_data.max_bytes = max;
+  if (fragP->tc_frag_data.max_bytes > count)
+    fragP->tc_frag_data.max_bytes -= count;
+  else
+    fragP->tc_frag_data.max_bytes = 0;
+ }
+      else
+ {
+  /* Remember the maximum prefix size in FUSED_JCC_PADDING
+     frag.  */
+  unsigned int max_prefix_size;
+  if (align_branch_prefix_size > max)
+    max_prefix_size = max;
+  else
+    max_prefix_size = align_branch_prefix_size;
+  if (max_prefix_size > count)
+    fragP->tc_frag_data.max_prefix_length
+      = max_prefix_size - count;
+ }
+
+      /* Use existing segment prefix if possible.  Use CS
+ segment prefix in 64-bit mode.  In 32-bit mode, use SS
+ segment prefix with ESP/EBP base register and use DS
+ segment prefix without ESP/EBP base register.  */
+      if (i.prefix[SEG_PREFIX])
+ fragP->tc_frag_data.default_prefix = i.prefix[SEG_PREFIX];
+      else if (flag_code == CODE_64BIT)
+ fragP->tc_frag_data.default_prefix = CS_PREFIX_OPCODE;
+      else if (i.base_reg
+       && (i.base_reg->reg_num == 4
+   || i.base_reg->reg_num == 5))
+ fragP->tc_frag_data.default_prefix = SS_PREFIX_OPCODE;
+      else
+ fragP->tc_frag_data.default_prefix = DS_PREFIX_OPCODE;
+    }
  }
     }
 
+  /* NB: Don't work with COND_JUMP86 without i386.  */
+  if (align_branch_power
+      && now_seg != absolute_section
+      && cpu_arch_flags.bitfield.cpui386)
+    {
+      /* Terminate each frag so that we can add prefix and check for
+         fused jcc.  */
+      frag_wane (frag_now);
+      frag_new (0);
+    }
+
 #ifdef DEBUG386
   if (flag_debug)
     {
@@ -8585,6 +9023,7 @@ output_disp (fragS *insn_start_frag, offsetT insn_start_off)
   if (!object_64bit)
     {
       reloc_type = BFD_RELOC_386_GOTPC;
+      i.has_gotpc_tls_reloc = TRUE;
       i.op[n].imms->X_add_number +=
  encoding_length (insn_start_frag, insn_start_off, p);
     }
@@ -8596,6 +9035,27 @@ output_disp (fragS *insn_start_frag, offsetT insn_start_off)
        insn, and that is taken care of in other code.  */
     reloc_type = BFD_RELOC_X86_64_GOTPC32;
  }
+      else if (align_branch_power)
+ {
+  switch (reloc_type)
+    {
+    case BFD_RELOC_386_TLS_GD:
+    case BFD_RELOC_386_TLS_LDM:
+    case BFD_RELOC_386_TLS_IE:
+    case BFD_RELOC_386_TLS_IE_32:
+    case BFD_RELOC_386_TLS_GOTIE:
+    case BFD_RELOC_386_TLS_GOTDESC:
+    case BFD_RELOC_386_TLS_DESC_CALL:
+    case BFD_RELOC_X86_64_TLSGD:
+    case BFD_RELOC_X86_64_TLSLD:
+    case BFD_RELOC_X86_64_GOTTPOFF:
+    case BFD_RELOC_X86_64_GOTPC32_TLSDESC:
+    case BFD_RELOC_X86_64_TLSDESC_CALL:
+      i.has_gotpc_tls_reloc = TRUE;
+    default:
+      break;
+    }
+ }
       fixP = fix_new_exp (frag_now, p - frag_now->fr_literal,
   size, i.op[n].disps, pcrel,
   reloc_type);
@@ -8737,6 +9197,7 @@ output_imm (fragS *insn_start_frag, offsetT insn_start_off)
     reloc_type = BFD_RELOC_X86_64_GOTPC32;
   else if (size == 8)
     reloc_type = BFD_RELOC_X86_64_GOTPC64;
+  i.has_gotpc_tls_reloc = TRUE;
   i.op[n].imms->X_add_number +=
     encoding_length (insn_start_frag, insn_start_off, p);
  }
@@ -10362,6 +10823,362 @@ elf_symbol_resolved_in_segment_p (symbolS *fr_symbol, offsetT fr_var)
 }
 #endif
 
+/* Return the next non-empty frag.  */
+
+static fragS *
+i386_next_non_empty_frag (fragS *fragP)
+{
+  /* There may be a frag with a ".fill 0" when there is no room in
+     the current frag for frag_grow in output_insn.  */
+  for (fragP = fragP->fr_next;
+       (fragP != NULL
+ && fragP->fr_type == rs_fill
+ && fragP->fr_fix == 0);
+       fragP = fragP->fr_next)
+    ;
+  return fragP;
+}
+
+/* Return the next jcc frag after BRANCH_PADDING.  */
+
+static fragS *
+i386_next_jcc_frag (fragS *fragP)
+{
+  if (!fragP)
+    return NULL;
+
+  if (fragP->fr_type == rs_machine_dependent
+      && (TYPE_FROM_RELAX_STATE (fragP->fr_subtype)
+  == BRANCH_PADDING))
+    {
+      fragP = i386_next_non_empty_frag (fragP);
+      if (fragP->fr_type != rs_machine_dependent)
+ return NULL;
+      if (TYPE_FROM_RELAX_STATE (fragP->fr_subtype) == COND_JUMP)
+ return fragP;
+    }
+
+  return NULL;
+}
+
+/* Classify BRANCH_PADDING, BRANCH_PREFIX and FUSED_JCC_PADDING frags.  */
+
+static void
+i386_classify_machine_dependent_frag (fragS *fragP)
+{
+  fragS *cmp_fragP;
+  fragS *pad_fragP;
+  fragS *branch_fragP;
+  fragS *next_fragP;
+  unsigned int max_prefix_length;
+
+  if (fragP->tc_frag_data.classified)
+    return;
+
+  /* First scan for BRANCH_PADDING and FUSED_JCC_PADDING.  Convert
+     FUSED_JCC_PADDING and merge BRANCH_PADDING.  */
+  for (next_fragP = fragP;
+       next_fragP != NULL;
+       next_fragP = next_fragP->fr_next)
+    {
+      next_fragP->tc_frag_data.classified = 1;
+      if (next_fragP->fr_type == rs_machine_dependent)
+ switch (TYPE_FROM_RELAX_STATE (next_fragP->fr_subtype))
+  {
+  case BRANCH_PADDING:
+    /* The BRANCH_PADDING frag must be followed by a branch
+       frag.  */
+    branch_fragP = i386_next_non_empty_frag (next_fragP);
+    next_fragP->tc_frag_data.u.branch_fragP = branch_fragP;
+    break;
+  case FUSED_JCC_PADDING:
+    /* Check if this is a fused jcc:
+       FUSED_JCC_PADDING
+       CMP like instruction
+       BRANCH_PADDING
+       COND_JUMP
+       */
+    cmp_fragP = i386_next_non_empty_frag (next_fragP);
+    pad_fragP = i386_next_non_empty_frag (cmp_fragP);
+    branch_fragP = i386_next_jcc_frag (pad_fragP);
+    if (branch_fragP)
+      {
+ /* The BRANCH_PADDING frag is merged with the
+   FUSED_JCC_PADDING frag.  */
+ next_fragP->tc_frag_data.u.branch_fragP = branch_fragP;
+ /* CMP like instruction size.  */
+ next_fragP->tc_frag_data.cmp_size = cmp_fragP->fr_fix;
+ frag_wane (pad_fragP);
+ /* Skip to branch_fragP.  */
+ next_fragP = branch_fragP;
+      }
+    else if (next_fragP->tc_frag_data.max_prefix_length)
+      {
+ /* Turn FUSED_JCC_PADDING into BRANCH_PREFIX if it isn't
+   a fused jcc.  */
+ next_fragP->fr_subtype
+  = ENCODE_RELAX_STATE (BRANCH_PREFIX, 0);
+ next_fragP->tc_frag_data.max_bytes
+  = next_fragP->tc_frag_data.max_prefix_length;
+ /* This will be updated in the BRANCH_PREFIX scan.  */
+ next_fragP->tc_frag_data.max_prefix_length = 0;
+      }
+    else
+      frag_wane (next_fragP);
+    break;
+  }
+    }
+
+  /* Stop if there is no BRANCH_PREFIX.  */
+  if (!align_branch_prefix_size)
+    return;
+
+  /* Scan for BRANCH_PREFIX.  */
+  for (; fragP != NULL; fragP = fragP->fr_next)
+    {
+      if (fragP->fr_type != rs_machine_dependent
+  || (TYPE_FROM_RELAX_STATE (fragP->fr_subtype)
+      != BRANCH_PREFIX))
+ continue;
+
+      /* Count all BRANCH_PREFIX frags before BRANCH_PADDING and
+ FUSED_JCC_PADDING.  */
+      max_prefix_length = 0;
+      for (next_fragP = fragP;
+   next_fragP != NULL;
+   next_fragP = next_fragP->fr_next)
+ {
+  if (next_fragP->fr_type == rs_fill)
+    /* Skip rs_fill frags.  */
+    continue;
+  else if (next_fragP->fr_type != rs_machine_dependent)
+    /* Stop for all other frags.  */
+    break;
+
+  /* rs_machine_dependent frags.  */
+  if (TYPE_FROM_RELAX_STATE (next_fragP->fr_subtype)
+      == BRANCH_PREFIX)
+    {
+      /* Count BRANCH_PREFIX frags.  */
+      if (max_prefix_length >= MAX_FUSED_JCC_PADDING_SIZE)
+ {
+  max_prefix_length = MAX_FUSED_JCC_PADDING_SIZE;
+  frag_wane (next_fragP);
+ }
+      else
+ max_prefix_length
+  += next_fragP->tc_frag_data.max_bytes;
+    }
+  else if ((TYPE_FROM_RELAX_STATE (next_fragP->fr_subtype)
+    == BRANCH_PADDING)
+   || (TYPE_FROM_RELAX_STATE (next_fragP->fr_subtype)
+       == FUSED_JCC_PADDING))
+    {
+      /* Stop at BRANCH_PADDING and FUSED_JCC_PADDING.  */
+      fragP->tc_frag_data.u.padding_fragP = next_fragP;
+      break;
+    }
+  else
+    /* Stop for other rs_machine_dependent frags.  */
+    break;
+ }
+
+      fragP->tc_frag_data.max_prefix_length = max_prefix_length;
+
+      /* Skip to the next frag.  */
+      fragP = next_fragP;
+    }
+}
+
+/* Compute padding size for
+
+ FUSED_JCC_PADDING
+ CMP like instruction
+ BRANCH_PADDING
+ COND_JUMP/UNCOND_JUMP
+
+   or
+
+ BRANCH_PADDING
+ COND_JUMP/UNCOND_JUMP
+ */
+
+static int
+i386_branch_padding_size (fragS *fragP, offsetT address)
+{
+  unsigned int offset, size, padding_size;
+  fragS *branch_fragP = fragP->tc_frag_data.u.branch_fragP;
+
+  /* The start address of the BRANCH_PADDING or FUSED_JCC_PADDING frag.  */
+  if (!address)
+    address = fragP->fr_address;
+  address += fragP->fr_fix;
+
+  /* CMP like instruction size.  */
+  size = fragP->tc_frag_data.cmp_size;
+
+  /* The base size of the branch frag.  */
+  size += branch_fragP->fr_fix;
+
+  /* Add opcode and displacement bytes for the rs_machine_dependent
+     branch frag.  */
+  if (branch_fragP->fr_type == rs_machine_dependent)
+    size += md_relax_table[branch_fragP->fr_subtype].rlx_length;
+
+  /* Check if branch is within boundary and doesn't end at the last
+     byte.  */
+  offset = address & ((1U << align_branch_power) - 1);
+  if ((offset + size) >= (1U << align_branch_power))
+    /* Padding needed to avoid crossing boundary.  */
+    padding_size = (1U << align_branch_power) - offset;
+  else
+    /* No padding needed.  */
+    padding_size = 0;
+
+  /* The return value may be saved in tc_frag_data.length which is
+     an unsigned byte.  */
+  if (!fits_in_unsigned_byte (padding_size))
+    abort ();
+
+  return padding_size;
+}
+
+/* i386_generic_table_relax_frag()
+
+   Handle BRANCH_PADDING, BRANCH_PREFIX and FUSED_JCC_PADDING frags to
+   grow/shrink padding to align branch frags.  Hand others to
+   relax_frag().  */
+
+long
+i386_generic_table_relax_frag (segT segment, fragS *fragP, long stretch)
+{
+  if (TYPE_FROM_RELAX_STATE (fragP->fr_subtype) == BRANCH_PADDING
+      || TYPE_FROM_RELAX_STATE (fragP->fr_subtype) == FUSED_JCC_PADDING)
+    {
+      long padding_size = i386_branch_padding_size (fragP, 0);
+      long grow = padding_size - fragP->tc_frag_data.length;
+
+      /* When the BRANCH_PREFIX frag is used, the computed address
+         must match the actual address and there should be no padding.  */
+      if (fragP->tc_frag_data.padding_address
+  && (fragP->tc_frag_data.padding_address != fragP->fr_address
+      || padding_size))
+ abort ();
+
+      /* Update the padding size.  */
+      if (grow)
+ fragP->tc_frag_data.length = padding_size;
+
+      return grow;
+    }
+  else if (TYPE_FROM_RELAX_STATE (fragP->fr_subtype) == BRANCH_PREFIX)
+    {
+      fragS *padding_fragP, *next_fragP;
+      long padding_size, left_size, last_size;
+
+      padding_fragP = fragP->tc_frag_data.u.padding_fragP;
+      if (!padding_fragP)
+ /* Use the padding set by the leading BRANCH_PREFIX frag.  */
+ return (fragP->tc_frag_data.length
+ - fragP->tc_frag_data.last_length);
+
+      /* Compute the relative address of the padding frag the very
+        first time, when the BRANCH_PREFIX frag sizes are zero.  */
+      if (!fragP->tc_frag_data.padding_address)
+ fragP->tc_frag_data.padding_address
+  = padding_fragP->fr_address - (fragP->fr_address - stretch);
+
+      /* First update the last length from the previous iteration.  */
+      left_size = fragP->tc_frag_data.prefix_length;
+      for (next_fragP = fragP;
+   next_fragP != padding_fragP;
+   next_fragP = next_fragP->fr_next)
+ if (next_fragP->fr_type == rs_machine_dependent
+    && (TYPE_FROM_RELAX_STATE (next_fragP->fr_subtype)
+ == BRANCH_PREFIX))
+  {
+    if (left_size)
+      {
+ int max = next_fragP->tc_frag_data.max_bytes;
+ if (max)
+  {
+    int size;
+    if (max > left_size)
+      size = left_size;
+    else
+      size = max;
+    left_size -= size;
+    next_fragP->tc_frag_data.last_length = size;
+  }
+      }
+    else
+      next_fragP->tc_frag_data.last_length = 0;
+  }
+
+      /* Check the padding size for the padding frag.  */
+      padding_size = i386_branch_padding_size
+ (padding_fragP, (fragP->fr_address
+ + fragP->tc_frag_data.padding_address));
+
+      last_size = fragP->tc_frag_data.prefix_length;
+      /* Check if there is a change from the last iteration.  */
+      if (padding_size == last_size)
+ {
+  /* Update the expected address of the padding frag.  */
+  padding_fragP->tc_frag_data.padding_address
+    = (fragP->fr_address + padding_size
+       + fragP->tc_frag_data.padding_address);
+  return 0;
+ }
+
+      if (padding_size > fragP->tc_frag_data.max_prefix_length)
+ {
+  /* No padding if there isn't sufficient room.  Clear the
+     expected address of the padding frag.  */
+  padding_fragP->tc_frag_data.padding_address = 0;
+  padding_size = 0;
+ }
+      else
+ /* Store the expected address of the padding frag.  */
+ padding_fragP->tc_frag_data.padding_address
+  = (fragP->fr_address + padding_size
+     + fragP->tc_frag_data.padding_address);
+
+      fragP->tc_frag_data.prefix_length = padding_size;
+
+      /* Update the length for the current iteration.  */
+      left_size = padding_size;
+      for (next_fragP = fragP;
+   next_fragP != padding_fragP;
+   next_fragP = next_fragP->fr_next)
+ if (next_fragP->fr_type == rs_machine_dependent
+    && (TYPE_FROM_RELAX_STATE (next_fragP->fr_subtype)
+ == BRANCH_PREFIX))
+  {
+    if (left_size)
+      {
+ int max = next_fragP->tc_frag_data.max_bytes;
+ if (max)
+  {
+    int size;
+    if (max > left_size)
+      size = left_size;
+    else
+      size = max;
+    left_size -= size;
+    next_fragP->tc_frag_data.length = size;
+  }
+      }
+    else
+      next_fragP->tc_frag_data.length = 0;
+  }
+
+      return (fragP->tc_frag_data.length
+      - fragP->tc_frag_data.last_length);
+    }
+  return relax_frag (segment, fragP, stretch);
+}
+
 /* md_estimate_size_before_relax()
 
    Called just before relax() for rs_machine_dependent frags.  The x86
@@ -10378,6 +11195,14 @@ elf_symbol_resolved_in_segment_p (symbolS *fr_symbol, offsetT fr_var)
 int
 md_estimate_size_before_relax (fragS *fragP, segT segment)
 {
+  if (TYPE_FROM_RELAX_STATE (fragP->fr_subtype) == BRANCH_PADDING
+      || TYPE_FROM_RELAX_STATE (fragP->fr_subtype) == BRANCH_PREFIX
+      || TYPE_FROM_RELAX_STATE (fragP->fr_subtype) == FUSED_JCC_PADDING)
+    {
+      i386_classify_machine_dependent_frag (fragP);
+      return fragP->tc_frag_data.length;
+    }
+
   /* We've already got fragP->fr_subtype right;  all we have to do is
      check for un-relaxable symbols.  On an ELF system, we can't relax
      an externally visible symbol, because it may be overridden by a
@@ -10511,6 +11336,106 @@ md_convert_frag (bfd *abfd ATTRIBUTE_UNUSED, segT sec ATTRIBUTE_UNUSED,
   unsigned int extension = 0;
   offsetT displacement_from_opcode_start;
 
+  if (TYPE_FROM_RELAX_STATE (fragP->fr_subtype) == BRANCH_PADDING
+      || TYPE_FROM_RELAX_STATE (fragP->fr_subtype) == FUSED_JCC_PADDING
+      || TYPE_FROM_RELAX_STATE (fragP->fr_subtype) == BRANCH_PREFIX)
+    {
+      /* Generate nop padding.  */
+      unsigned int size = fragP->tc_frag_data.length;
+      if (size)
+ {
+  if (size > fragP->tc_frag_data.max_bytes)
+    abort ();
+
+  if (flag_debug)
+    {
+      const char *msg;
+      const char *branch = "branch";
+      const char *prefix = "";
+      fragS *padding_fragP;
+      if (TYPE_FROM_RELAX_STATE (fragP->fr_subtype)
+  == BRANCH_PREFIX)
+ {
+  padding_fragP = fragP->tc_frag_data.u.padding_fragP;
+  switch (fragP->tc_frag_data.default_prefix)
+    {
+    default:
+      abort ();
+      break;
+    case CS_PREFIX_OPCODE:
+      prefix = " cs";
+      break;
+    case DS_PREFIX_OPCODE:
+      prefix = " ds";
+      break;
+    case ES_PREFIX_OPCODE:
+      prefix = " es";
+      break;
+    case FS_PREFIX_OPCODE:
+      prefix = " fs";
+      break;
+    case GS_PREFIX_OPCODE:
+      prefix = " gs";
+      break;
+    case SS_PREFIX_OPCODE:
+      prefix = " ss";
+      break;
+    }
+  if (padding_fragP)
+    msg = _("%s:%u: add %d%s at 0x%llx to align "
+    "%s within %d-byte boundary\n");
+  else
+    msg = _("%s:%u: add additional %d%s at 0x%llx to "
+    "align %s within %d-byte boundary\n");
+ }
+      else
+ {
+  padding_fragP = fragP;
+  msg = _("%s:%u: add %d%s-byte nop at 0x%llx to align "
+  "%s within %d-byte boundary\n");
+ }
+
+      if (padding_fragP)
+ switch (padding_fragP->tc_frag_data.branch_type)
+  {
+  case align_branch_jcc:
+    branch = "jcc";
+    break;
+  case align_branch_fused:
+    branch = "fused jcc";
+    break;
+  case align_branch_jmp:
+    branch = "jmp";
+    break;
+  case align_branch_call:
+    branch = "call";
+    break;
+  case align_branch_indirect:
+    branch = "indirect branch";
+    break;
+  case align_branch_ret:
+    branch = "ret";
+    break;
+  default:
+    break;
+  }
+
+      fprintf (stdout, msg,
+       fragP->fr_file, fragP->fr_line, size, prefix,
+       (long long) fragP->fr_address, branch,
+       1 << align_branch_power);
+    }
+  if (TYPE_FROM_RELAX_STATE (fragP->fr_subtype) == BRANCH_PREFIX)
+    memset (fragP->fr_opcode,
+    fragP->tc_frag_data.default_prefix, size);
+  else
+    i386_generate_nops (fragP, (char *) fragP->fr_opcode,
+ size, 0);
+  fragP->fr_fix += size;
+ }
+      return;
+    }
+
   opcode = (unsigned char *) fragP->fr_opcode;
 
   /* Address we want to reach in file space.  */
@@ -11069,6 +11994,9 @@ const char *md_shortopts = "qnO::";
 #define OPTION_MFENCE_AS_LOCK_ADD (OPTION_MD_BASE + 24)
 #define OPTION_X86_USED_NOTE (OPTION_MD_BASE + 25)
 #define OPTION_MVEXWIG (OPTION_MD_BASE + 26)
+#define OPTION_MALIGN_BRANCH_BOUNDARY (OPTION_MD_BASE + 27)
+#define OPTION_MALIGN_BRANCH_PREFIX_SIZE (OPTION_MD_BASE + 28)
+#define OPTION_MALIGN_BRANCH (OPTION_MD_BASE + 29)
 
 struct option md_longopts[] =
 {
@@ -11104,6 +12032,9 @@ struct option md_longopts[] =
   {"mfence-as-lock-add", required_argument, NULL, OPTION_MFENCE_AS_LOCK_ADD},
   {"mrelax-relocations", required_argument, NULL, OPTION_MRELAX_RELOCATIONS},
   {"mevexrcig", required_argument, NULL, OPTION_MEVEXRCIG},
+  {"malign-branch-boundary", required_argument, NULL, OPTION_MALIGN_BRANCH_BOUNDARY},
+  {"malign-branch-prefix-size", required_argument, NULL, OPTION_MALIGN_BRANCH_PREFIX_SIZE},
+  {"malign-branch", required_argument, NULL, OPTION_MALIGN_BRANCH},
   {"mamd64", no_argument, NULL, OPTION_MAMD64},
   {"mintel64", no_argument, NULL, OPTION_MINTEL64},
   {NULL, no_argument, NULL, 0}
@@ -11114,7 +12045,7 @@ int
 md_parse_option (int c, const char *arg)
 {
   unsigned int j;
-  char *arch, *next, *saved;
+  char *arch, *next, *saved, *type;
 
   switch (c)
     {
@@ -11492,6 +12423,80 @@ md_parse_option (int c, const char *arg)
         as_fatal (_("invalid -mrelax-relocations= option: `%s'"), arg);
       break;
 
+    case OPTION_MALIGN_BRANCH_BOUNDARY:
+      {
+ char *end;
+ long int align = strtoul (arg, &end, 0);
+ if (*end == '\0')
+  {
+    if (align == 0)
+      {
+ align_branch_power = 0;
+ break;
+      }
+    else if (align >= 16)
+      {
+ int align_power;
+ for (align_power = 0;
+     (align & 1) == 0;
+     align >>= 1, align_power++)
+  continue;
+ /* Limit alignment power to 31.  */
+ if (align == 1 && align_power < 32)
+  {
+    align_branch_power = align_power;
+    break;
+  }
+      }
+  }
+ as_fatal (_("invalid -malign-branch-boundary= value: %s"), arg);
+      }
+      break;
+
+    case OPTION_MALIGN_BRANCH_PREFIX_SIZE:
+      {
+ char *end;
+ int align = strtoul (arg, &end, 0);
+ /* Some processors only support 5 prefixes.  */
+ if (*end == '\0' && align >= 0 && align < 6)
+  {
+    align_branch_prefix_size = align;
+    break;
+  }
+ as_fatal (_("invalid -malign-branch-prefix-size= value: %s"),
+  arg);
+      }
+      break;
+
+    case OPTION_MALIGN_BRANCH:
+      align_branch = 0;
+      saved = xstrdup (arg);
+      type = saved;
+      do
+ {
+  next = strchr (type, '+');
+  if (next)
+    *next++ = '\0';
+  if (strcasecmp (type, "jcc") == 0)
+    align_branch |= align_branch_jcc_bit;
+  else if (strcasecmp (type, "fused") == 0)
+    align_branch |= align_branch_fused_bit;
+  else if (strcasecmp (type, "jmp") == 0)
+    align_branch |= align_branch_jmp_bit;
+  else if (strcasecmp (type, "call") == 0)
+    align_branch |= align_branch_call_bit;
+  else if (strcasecmp (type, "ret") == 0)
+    align_branch |= align_branch_ret_bit;
+  else if (strcasecmp (type, "indirect") == 0)
+    align_branch |= align_branch_indirect_bit;
+  else
+    as_fatal (_("invalid -malign-branch= option: `%s'"), arg);
+  type = next;
+ }
+      while (next != NULL);
+      free (saved);
+      break;
+
     case OPTION_MAMD64:
       intel64 = 0;
       break;
@@ -11744,6 +12749,17 @@ md_show_usage (FILE *stream)
   fprintf (stream, _("\
                           generate relax relocations\n"));
   fprintf (stream, _("\
+  -malign-branch-boundary=NUM (default: 0)\n\
+                          align branches within NUM byte boundary\n"));
+  fprintf (stream, _("\
+  -malign-branch=TYPE[+TYPE...] (default: jcc+fused+jmp)\n\
+                          TYPE is combination of jcc, fused, jmp, call, ret,\n\
+                           indirect\n\
+                          specify types of branches to align\n"));
+  fprintf (stream, _("\
+  -malign-branch-prefix-size=NUM (default: 5)\n\
+                          align branches with NUM prefixes per instruction\n"));
+  fprintf (stream, _("\
   -mamd64                 accept only AMD64 ISA [default]\n"));
   fprintf (stream, _("\
   -mintel64               accept only Intel64 ISA\n"));
@@ -11827,15 +12843,24 @@ i386_target_format (void)
   {
   default:
     format = ELF_TARGET_FORMAT;
+#ifndef TE_SOLARIS
+    tls_get_addr = "___tls_get_addr";
+#endif
     break;
   case X86_64_ABI:
     use_rela_relocations = 1;
     object_64bit = 1;
+#ifndef TE_SOLARIS
+    tls_get_addr = "__tls_get_addr";
+#endif
     format = ELF_TARGET_FORMAT64;
     break;
   case X86_64_X32_ABI:
     use_rela_relocations = 1;
     object_64bit = 1;
+#ifndef TE_SOLARIS
+    tls_get_addr = "__tls_get_addr";
+#endif
     disallow_64bit_reloc = 1;
     format = ELF_TARGET_FORMAT32;
     break;
@@ -11952,6 +12977,21 @@ s_bss (int ignore ATTRIBUTE_UNUSED)
 
 #endif
 
+/* Remember constant directive.  */
+
+void
+i386_cons_align (int ignore ATTRIBUTE_UNUSED)
+{
+  if (last_insn.kind != last_insn_directive
+      && (bfd_section_flags (now_seg) & SEC_CODE))
+    {
+      last_insn.seg = now_seg;
+      last_insn.kind = last_insn_directive;
+      last_insn.name = "constant directive";
+      last_insn.file = as_where (&last_insn.line);
+    }
+}
+
 void
 i386_validate_fix (fixS *fixp)
 {
diff --git a/gas/config/tc-i386.h b/gas/config/tc-i386.h
index b02a25671f..28f03cf2db 100644
--- a/gas/config/tc-i386.h
+++ b/gas/config/tc-i386.h
@@ -210,12 +210,19 @@ if ((n) \
 
 #define MAX_MEM_FOR_RS_ALIGN_CODE  (alignment ? ((1 << alignment) - 1) : 1)
 
+extern void i386_cons_align (int);
+#define md_cons_align(nbytes) i386_cons_align (nbytes)
+
 void i386_print_statistics (FILE *);
 #define tc_print_statistics i386_print_statistics
 
 extern unsigned int i386_frag_max_var (fragS *);
 #define md_frag_max_var i386_frag_max_var
 
+extern long i386_generic_table_relax_frag (segT, fragS *, long);
+#define md_generic_table_relax_frag(segment, fragP, stretch) \
+  i386_generic_table_relax_frag (segment, fragP, stretch)
+
 #define md_number_to_chars number_to_chars_littleendian
 
 enum processor_type
@@ -250,10 +257,24 @@ extern i386_cpu_flags cpu_arch_isa_flags;
 
 struct i386_tc_frag_data
 {
+  union
+    {
+      fragS *padding_fragP;
+      fragS *branch_fragP;
+    } u;
+  addressT padding_address;
   enum processor_type isa;
   i386_cpu_flags isa_flags;
   enum processor_type tune;
   unsigned int max_bytes;
+  unsigned char length;
+  unsigned char last_length;
+  unsigned char max_prefix_length;
+  unsigned char prefix_length;
+  unsigned char default_prefix;
+  unsigned char cmp_size;
+  unsigned int classified : 1;
+  unsigned int branch_type : 3;
 };
 
 /* We need to emit the right NOP pattern in .align frags.  This is
@@ -264,10 +285,20 @@ struct i386_tc_frag_data
 #define TC_FRAG_INIT(FRAGP, MAX_BYTES) \
  do \
    { \
+     (FRAGP)->tc_frag_data.u.padding_fragP = NULL; \
+     (FRAGP)->tc_frag_data.padding_address = 0; \
      (FRAGP)->tc_frag_data.isa = cpu_arch_isa; \
      (FRAGP)->tc_frag_data.isa_flags = cpu_arch_isa_flags; \
      (FRAGP)->tc_frag_data.tune = cpu_arch_tune; \
      (FRAGP)->tc_frag_data.max_bytes = (MAX_BYTES); \
+     (FRAGP)->tc_frag_data.length = 0; \
+     (FRAGP)->tc_frag_data.last_length = 0; \
+     (FRAGP)->tc_frag_data.max_prefix_length = 0; \
+     (FRAGP)->tc_frag_data.prefix_length = 0; \
+     (FRAGP)->tc_frag_data.default_prefix = 0; \
+     (FRAGP)->tc_frag_data.cmp_size = 0; \
+     (FRAGP)->tc_frag_data.classified = 0; \
+     (FRAGP)->tc_frag_data.branch_type = 0; \
    } \
  while (0)
 
diff --git a/gas/doc/c-i386.texi b/gas/doc/c-i386.texi
index 589b4260f0..74296e61f6 100644
--- a/gas/doc/c-i386.texi
+++ b/gas/doc/c-i386.texi
@@ -423,6 +423,32 @@ R_X86_64_REX_GOTPCRELX, in 64-bit mode.
 relocations.  The default can be controlled by a configure option
 @option{--enable-x86-relax-relocations}.
 
+@cindex @samp{-malign-branch-boundary=} option, i386
+@cindex @samp{-malign-branch-boundary=} option, x86-64
+@item -malign-branch-boundary=@var{NUM}
+This option controls how the assembler should align branches with segment
+prefixes or NOP.  @var{NUM} must be a power of 2.  It should be 0 or
+no less than 16.  Branches will be aligned within @var{NUM} byte
+boundary.  @option{-malign-branch-boundary=0}, which is the default,
+doesn't align branches.
+
+@cindex @samp{-malign-branch=} option, i386
+@cindex @samp{-malign-branch=} option, x86-64
+@item -malign-branch=@var{TYPE}[+@var{TYPE}...]
+This option specifies types of branches to align.  @var{TYPE} is a
+combination of @samp{jcc}, which aligns conditional jumps,
+@samp{fused}, which aligns fused conditional jumps, @samp{jmp},
+which aligns unconditional jumps, @samp{call}, which aligns calls,
+@samp{ret}, which aligns rets, and @samp{indirect}, which aligns indirect
+jumps and calls.  The default is @option{-malign-branch=jcc+fused+jmp}.
+
+@cindex @samp{-malign-branch-prefix-size=} option, i386
+@cindex @samp{-malign-branch-prefix-size=} option, x86-64
+@item -malign-branch-prefix-size=@var{NUM}
+This option specifies the maximum number of prefixes on an instruction
+to align branches.  @var{NUM} should be between 0 and 5.  The default
+@var{NUM} is 5.
+
 @cindex @samp{-mx86-used-note=} option, i386
 @cindex @samp{-mx86-used-note=} option, x86-64
 @item -mx86-used-note=@var{no}
--
2.21.0
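
For readers following the boundary check in i386_branch_padding_size
above, the arithmetic can be tried in isolation.  This is only a
minimal sketch, not code from the patch: the helper name
branch_padding_size and its parameters are made up for illustration,
and it assumes the caller has already added up the size of the CMP
like instruction (if any) and the branch.

```c
#include <assert.h>

/* Hypothetical standalone helper mirroring the arithmetic in
   i386_branch_padding_size.  ADDRESS is where the (possibly fused)
   branch would start, SIZE is its total length in bytes and POWER
   is log2 of the boundary (5 for a 32-byte boundary).  */
static unsigned int
branch_padding_size (unsigned long address, unsigned int size,
                     unsigned int power)
{
  unsigned int boundary, offset;

  /* The option parser in the patch only accepts boundaries of at
     least 16 bytes and powers below 32.  */
  assert (power >= 4 && power < 32);

  boundary = 1U << power;
  offset = (unsigned int) (address & (boundary - 1));

  /* Pad up to the next boundary if the branch would cross the
     boundary or end on the last byte before it.  */
  if (offset + size >= boundary)
    return boundary - offset;

  /* The branch already fits; no padding needed.  */
  return 0;
}
```

With a 32-byte boundary, a 2-byte jump starting at offset 30 would end
on the last byte of the window, so the sketch asks for 2 bytes of
padding to push it to the next boundary, while a 6-byte branch at
offset 16 needs none.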


V3 [PATCH 3/4] i386: Add -mbranches-within-32B-boundaries

H.J. Lu-30
In reply to this post by H.J. Lu-30
Add -mbranches-within-32B-boundaries to enable

-malign-branch-boundary=32
-malign-branch=jcc+fused+jmp
-malign-branch-prefix-size=5
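
To see how these three settings map onto the values the new option
stores (align_branch_power = 5 and the jcc, fused and jmp bits), here
is a rough standalone sketch of the two conversions.  It is not the
patch code: the names boundary_to_power, parse_branch_types and the
BRANCH_* bits are invented for illustration, and unlike the patch,
which uses strcasecmp, the TYPE match here is case-sensitive.

```c
#include <assert.h>
#include <string.h>

/* Illustrative bit names; the patch uses align_branch_*_bit.  */
enum
{
  BRANCH_JCC = 1 << 0,
  BRANCH_FUSED = 1 << 1,
  BRANCH_JMP = 1 << 2,
  BRANCH_CALL = 1 << 3,
  BRANCH_RET = 1 << 4,
  BRANCH_INDIRECT = 1 << 5
};

/* Return log2 of BOUNDARY if it is a power of 2 no less than 16
   with a power below 32, 0 if BOUNDARY is 0 (alignment disabled),
   or -1 if the value is invalid.  */
static int
boundary_to_power (unsigned long boundary)
{
  int power = 0;

  if (boundary == 0)
    return 0;
  if (boundary < 16)
    return -1;
  while ((boundary & 1) == 0)
    {
      boundary >>= 1;
      power++;
    }
  return (boundary == 1 && power < 32) ? power : -1;
}

/* Parse a "+"-separated TYPE list into a bit mask.  Return -1 on
   an unknown or empty TYPE.  */
static int
parse_branch_types (const char *arg)
{
  static const struct { const char *name; int bit; } types[] =
    {
      { "jcc", BRANCH_JCC }, { "fused", BRANCH_FUSED },
      { "jmp", BRANCH_JMP }, { "call", BRANCH_CALL },
      { "ret", BRANCH_RET }, { "indirect", BRANCH_INDIRECT }
    };
  int mask = 0;

  assert (arg != NULL);
  for (;;)
    {
      /* Length of the current TYPE, up to the next '+' or the end.  */
      size_t len = strcspn (arg, "+");
      size_t i;
      int bit = 0;

      for (i = 0; i < sizeof types / sizeof types[0]; i++)
        if (strlen (types[i].name) == len
            && strncmp (arg, types[i].name, len) == 0)
          bit = types[i].bit;
      if (bit == 0)
        return -1;
      mask |= bit;
      if (arg[len] == '\0')
        return mask;
      arg += len + 1;
    }
}
```

In this sketch boundary_to_power (32) gives 5 and parse_branch_types
("jcc+fused+jmp") gives the OR of the first three bits, which is the
combination the shorthand option selects.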

        * config/tc-i386.c (OPTION_MBRANCHES_WITH_32B_BOUNDARIES): New.
        (md_longopts): Add -mbranches-within-32B-boundaries.
        (md_parse_option): Handle -mbranches-within-32B-boundaries.
        (md_show_usage): Add -mbranches-within-32B-boundaries.
---
 gas/config/tc-i386.c | 13 +++++++++++++
 gas/doc/c-i386.texi  | 11 +++++++++++
 2 files changed, 24 insertions(+)

diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c
index 0ab6651f24..cbd718cca7 100644
--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -11997,6 +11997,7 @@ const char *md_shortopts = "qnO::";
 #define OPTION_MALIGN_BRANCH_BOUNDARY (OPTION_MD_BASE + 27)
 #define OPTION_MALIGN_BRANCH_PREFIX_SIZE (OPTION_MD_BASE + 28)
 #define OPTION_MALIGN_BRANCH (OPTION_MD_BASE + 29)
+#define OPTION_MBRANCHES_WITH_32B_BOUNDARIES (OPTION_MD_BASE + 30)
 
 struct option md_longopts[] =
 {
@@ -12035,6 +12036,7 @@ struct option md_longopts[] =
   {"malign-branch-boundary", required_argument, NULL, OPTION_MALIGN_BRANCH_BOUNDARY},
   {"malign-branch-prefix-size", required_argument, NULL, OPTION_MALIGN_BRANCH_PREFIX_SIZE},
   {"malign-branch", required_argument, NULL, OPTION_MALIGN_BRANCH},
+  {"mbranches-within-32B-boundaries", no_argument, NULL, OPTION_MBRANCHES_WITH_32B_BOUNDARIES},
   {"mamd64", no_argument, NULL, OPTION_MAMD64},
   {"mintel64", no_argument, NULL, OPTION_MINTEL64},
   {NULL, no_argument, NULL, 0}
@@ -12497,6 +12499,14 @@ md_parse_option (int c, const char *arg)
       free (saved);
       break;
 
+    case OPTION_MBRANCHES_WITH_32B_BOUNDARIES:
+      align_branch_power = 5;
+      align_branch_prefix_size = 5;
+      align_branch = (align_branch_jcc_bit
+      | align_branch_fused_bit
+      | align_branch_jmp_bit);
+      break;
+
     case OPTION_MAMD64:
       intel64 = 0;
       break;
@@ -12760,6 +12770,9 @@ md_show_usage (FILE *stream)
   -malign-branch-prefix-size=NUM (default: 5)\n\
                           align branches with NUM prefixes per instruction\n"));
   fprintf (stream, _("\
+  -mbranches-within-32B-boundaries\n\
+                          align branches within 32 byte boundary\n"));
+  fprintf (stream, _("\
   -mamd64                 accept only AMD64 ISA [default]\n"));
   fprintf (stream, _("\
   -mintel64               accept only Intel64 ISA\n"));
diff --git a/gas/doc/c-i386.texi b/gas/doc/c-i386.texi
index 74296e61f6..08f139cc15 100644
--- a/gas/doc/c-i386.texi
+++ b/gas/doc/c-i386.texi
@@ -449,6 +449,17 @@ This option specifies the maximum number of prefixes on an instruction
 to align branches.  @var{NUM} should be between 0 and 5.  The default
 @var{NUM} is 5.
 
+@cindex @samp{-mbranches-within-32B-boundaries} option, i386
+@cindex @samp{-mbranches-within-32B-boundaries} option, x86-64
+@item -mbranches-within-32B-boundaries
+This option aligns conditional jumps, fused conditional jumps and
+unconditional jumps within 32 byte boundary with up to 5 segment prefixes
+on an instruction.  It is equivalent to
+@option{-malign-branch-boundary=32}
+@option{-malign-branch=jcc+fused+jmp}
+@option{-malign-branch-prefix-size=5}.
+The default doesn't align branches.
+
 @cindex @samp{-mx86-used-note=} option, i386
 @cindex @samp{-mx86-used-note=} option, x86-64
 @item -mx86-used-note=@var{no}
--
2.21.0
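
The "up to 5 segment prefixes on an instruction" wording may be easier
to picture with a small example of how padding bytes get spread over
the instructions in front of a branch.  The sketch below is loosely
modelled on the BRANCH_PREFIX loops in i386_generic_table_relax_frag
from patch 2/4; distribute_prefixes and its parameters are
hypothetical names, and the per-instruction limits are simply passed
in as an array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: spread PADDING bytes over N instructions in
   order, giving instruction I at most MAX_BYTES[I] prefix bytes,
   and store the per-instruction counts in LENGTH.  Returns the
   number of bytes that could not be placed.  */
static unsigned int
distribute_prefixes (unsigned int padding,
                     const unsigned char *max_bytes,
                     unsigned char *length, size_t n)
{
  size_t i;

  assert (max_bytes != NULL && length != NULL);
  for (i = 0; i < n; i++)
    {
      /* Take as much as this instruction allows, but no more than
         what is still left to place.  */
      unsigned int size = max_bytes[i] < padding ? max_bytes[i] : padding;

      length[i] = (unsigned char) size;
      padding -= size;
    }
  return padding;
}
```

With three instructions that can each take 5 prefixes, 7 bytes of
padding end up as 5, 2 and 0.  Note one difference from the patch: it
compares the padding against the total room up front and uses no
prefixes at all when the room is insufficient, whereas this sketch
just reports the leftover.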


V3 [PATCH 4/4] i386: Add tests for -malign-branch-boundary and -malign-branch

H.J. Lu-30
In reply to this post by H.J. Lu-30
Add tests for -malign-branch-boundary, -malign-branch and
-mbranches-within-32B-boundaries.

gas/

        * testsuite/gas/i386/align-branch-1.s: New file.
        * testsuite/gas/i386/align-branch-1a.d: Likewise.
        * testsuite/gas/i386/align-branch-1b.d: Likewise.
        * testsuite/gas/i386/align-branch-1c.d: Likewise.
        * testsuite/gas/i386/align-branch-1d.d: Likewise.
        * testsuite/gas/i386/align-branch-1e.d: Likewise.
        * testsuite/gas/i386/align-branch-1f.d: Likewise.
        * testsuite/gas/i386/align-branch-1g.d: Likewise.
        * testsuite/gas/i386/align-branch-1h.d: Likewise.
        * testsuite/gas/i386/align-branch-2.s: Likewise.
        * testsuite/gas/i386/align-branch-2a.d: Likewise.
        * testsuite/gas/i386/align-branch-2b.d: Likewise.
        * testsuite/gas/i386/align-branch-2c.d: Likewise.
        * testsuite/gas/i386/align-branch-3.d: Likewise.
        * testsuite/gas/i386/align-branch-3.s: Likewise.
        * testsuite/gas/i386/align-branch-4.s: Likewise.
        * testsuite/gas/i386/align-branch-4a.d: Likewise.
        * testsuite/gas/i386/align-branch-4b.d: Likewise.
        * testsuite/gas/i386/align-branch-5.d: Likewise.
        * testsuite/gas/i386/align-branch-5.s: Likewise.
        * testsuite/gas/i386/align-branch-6.d: Likewise.
        * testsuite/gas/i386/align-branch-6.s: Likewise.
        * testsuite/gas/i386/align-branch-7.d: Likewise.
        * testsuite/gas/i386/align-branch-7.s: Likewise.
        * testsuite/gas/i386/align-branch-8.d: Likewise.
        * testsuite/gas/i386/align-branch-8.s: Likewise.
        * testsuite/gas/i386/x86-64-align-branch-1.s: Likewise.
        * testsuite/gas/i386/x86-64-align-branch-1a.d: Likewise.
        * testsuite/gas/i386/x86-64-align-branch-1b.d: Likewise.
        * testsuite/gas/i386/x86-64-align-branch-1c.d: Likewise.
        * testsuite/gas/i386/x86-64-align-branch-1d.d: Likewise.
        * testsuite/gas/i386/x86-64-align-branch-1e.d: Likewise.
        * testsuite/gas/i386/x86-64-align-branch-1f.d: Likewise.
        * testsuite/gas/i386/x86-64-align-branch-1g.d: Likewise.
        * testsuite/gas/i386/x86-64-align-branch-1h.d: Likewise.
        * testsuite/gas/i386/x86-64-align-branch-2.s: Likewise.
        * testsuite/gas/i386/x86-64-align-branch-2a.d: Likewise.
        * testsuite/gas/i386/x86-64-align-branch-2b.d: Likewise.
        * testsuite/gas/i386/x86-64-align-branch-2c.d: Likewise.
        * testsuite/gas/i386/x86-64-align-branch-3.d: Likewise.
        * testsuite/gas/i386/x86-64-align-branch-3.s: Likewise.
        * testsuite/gas/i386/x86-64-align-branch-4.s: Likewise.
        * testsuite/gas/i386/x86-64-align-branch-4a.d: Likewise.
        * testsuite/gas/i386/x86-64-align-branch-4b.d: Likewise.
        * testsuite/gas/i386/x86-64-align-branch-5.d: Likewise.
        * testsuite/gas/i386/x86-64-align-branch-6.d: Likewise.
        * testsuite/gas/i386/x86-64-align-branch-7.d: Likewise.
        * testsuite/gas/i386/x86-64-align-branch-7.s: Likewise.
        * testsuite/gas/i386/x86-64-align-branch-8.d: Likewise.
        * testsuite/gas/i386/x86-64-align-branch-8.s: Likewise.
        * testsuite/gas/i386/i386.exp: Run new tests.

ld/

        * testsuite/ld-i386/align-branch-1.d: New file.
        * testsuite/ld-i386/align-branch-1.s: Likewise.
        * testsuite/ld-x86-64/align-branch-1.d: Likewise.
        * testsuite/ld-x86-64/align-branch-1.s: Likewise.
        * testsuite/ld-i386/i386.exp: Run the new test.
        * testsuite/ld-x86-64/x86-64.exp: Likewise.
---
 gas/testsuite/gas/i386/align-branch-1.s       | 72 +++++++++++++++++
 gas/testsuite/gas/i386/align-branch-1a.d      | 77 ++++++++++++++++++
 gas/testsuite/gas/i386/align-branch-1b.d      | 77 ++++++++++++++++++
 gas/testsuite/gas/i386/align-branch-1c.d      | 77 ++++++++++++++++++
 gas/testsuite/gas/i386/align-branch-1d.d      | 76 ++++++++++++++++++
 gas/testsuite/gas/i386/align-branch-1e.d      | 77 ++++++++++++++++++
 gas/testsuite/gas/i386/align-branch-1f.d      | 77 ++++++++++++++++++
 gas/testsuite/gas/i386/align-branch-1g.d      | 77 ++++++++++++++++++
 gas/testsuite/gas/i386/align-branch-1h.d      | 76 ++++++++++++++++++
 gas/testsuite/gas/i386/align-branch-1i.d      | 80 +++++++++++++++++++
 gas/testsuite/gas/i386/align-branch-2.s       | 49 ++++++++++++
 gas/testsuite/gas/i386/align-branch-2a.d      | 55 +++++++++++++
 gas/testsuite/gas/i386/align-branch-2b.d      | 55 +++++++++++++
 gas/testsuite/gas/i386/align-branch-2c.d      | 55 +++++++++++++
 gas/testsuite/gas/i386/align-branch-3.d       | 33 ++++++++
 gas/testsuite/gas/i386/align-branch-3.s       | 28 +++++++
 gas/testsuite/gas/i386/align-branch-4.s       | 30 +++++++
 gas/testsuite/gas/i386/align-branch-4a.d      | 36 +++++++++
 gas/testsuite/gas/i386/align-branch-4b.d      | 36 +++++++++
 gas/testsuite/gas/i386/align-branch-5.d       | 36 +++++++++
 gas/testsuite/gas/i386/align-branch-5.s       | 32 ++++++++
 gas/testsuite/gas/i386/align-branch-6.d       | 22 +++++
 gas/testsuite/gas/i386/align-branch-6.e       |  2 +
 gas/testsuite/gas/i386/align-branch-6.s       |  7 ++
 gas/testsuite/gas/i386/align-branch-7.d       | 18 +++++
 gas/testsuite/gas/i386/align-branch-7.s       | 14 ++++
 gas/testsuite/gas/i386/align-branch-8.d       | 18 +++++
 gas/testsuite/gas/i386/align-branch-8.s       | 14 ++++
 gas/testsuite/gas/i386/i386.exp               | 45 +++++++++++
 .../gas/i386/x86-64-align-branch-1.s          | 70 ++++++++++++++++
 .../gas/i386/x86-64-align-branch-1a.d         | 75 +++++++++++++++++
 .../gas/i386/x86-64-align-branch-1b.d         | 75 +++++++++++++++++
 .../gas/i386/x86-64-align-branch-1c.d         | 75 +++++++++++++++++
 .../gas/i386/x86-64-align-branch-1d.d         | 74 +++++++++++++++++
 .../gas/i386/x86-64-align-branch-1e.d         | 74 +++++++++++++++++
 .../gas/i386/x86-64-align-branch-1f.d         | 75 +++++++++++++++++
 .../gas/i386/x86-64-align-branch-1g.d         | 75 +++++++++++++++++
 .../gas/i386/x86-64-align-branch-1h.d         | 74 +++++++++++++++++
 .../gas/i386/x86-64-align-branch-1i.d         | 78 ++++++++++++++++++
 .../gas/i386/x86-64-align-branch-2.s          | 44 ++++++++++
 .../gas/i386/x86-64-align-branch-2a.d         | 50 ++++++++++++
 .../gas/i386/x86-64-align-branch-2b.d         | 50 ++++++++++++
 .../gas/i386/x86-64-align-branch-2c.d         | 50 ++++++++++++
 .../gas/i386/x86-64-align-branch-3.d          | 32 ++++++++
 .../gas/i386/x86-64-align-branch-3.s          | 27 +++++++
 .../gas/i386/x86-64-align-branch-4.s          | 27 +++++++
 .../gas/i386/x86-64-align-branch-4a.d         | 33 ++++++++
 .../gas/i386/x86-64-align-branch-4b.d         | 33 ++++++++
 .../gas/i386/x86-64-align-branch-5.d          | 37 +++++++++
 .../gas/i386/x86-64-align-branch-6.d          | 19 +++++
 .../gas/i386/x86-64-align-branch-7.d          | 18 +++++
 .../gas/i386/x86-64-align-branch-7.s          | 14 ++++
 .../gas/i386/x86-64-align-branch-8.d          | 18 +++++
 .../gas/i386/x86-64-align-branch-8.s          | 14 ++++
 ld/testsuite/ld-i386/align-branch-1.d         | 25 ++++++
 ld/testsuite/ld-i386/align-branch-1.s         | 19 +++++
 ld/testsuite/ld-i386/i386.exp                 |  1 +
 ld/testsuite/ld-x86-64/align-branch-1.d       | 21 +++++
 ld/testsuite/ld-x86-64/align-branch-1.s       | 17 ++++
 ld/testsuite/ld-x86-64/x86-64.exp             |  1 +
 60 files changed, 2646 insertions(+)
 create mode 100644 gas/testsuite/gas/i386/align-branch-1.s
 create mode 100644 gas/testsuite/gas/i386/align-branch-1a.d
 create mode 100644 gas/testsuite/gas/i386/align-branch-1b.d
 create mode 100644 gas/testsuite/gas/i386/align-branch-1c.d
 create mode 100644 gas/testsuite/gas/i386/align-branch-1d.d
 create mode 100644 gas/testsuite/gas/i386/align-branch-1e.d
 create mode 100644 gas/testsuite/gas/i386/align-branch-1f.d
 create mode 100644 gas/testsuite/gas/i386/align-branch-1g.d
 create mode 100644 gas/testsuite/gas/i386/align-branch-1h.d
 create mode 100644 gas/testsuite/gas/i386/align-branch-1i.d
 create mode 100644 gas/testsuite/gas/i386/align-branch-2.s
 create mode 100644 gas/testsuite/gas/i386/align-branch-2a.d
 create mode 100644 gas/testsuite/gas/i386/align-branch-2b.d
 create mode 100644 gas/testsuite/gas/i386/align-branch-2c.d
 create mode 100644 gas/testsuite/gas/i386/align-branch-3.d
 create mode 100644 gas/testsuite/gas/i386/align-branch-3.s
 create mode 100644 gas/testsuite/gas/i386/align-branch-4.s
 create mode 100644 gas/testsuite/gas/i386/align-branch-4a.d
 create mode 100644 gas/testsuite/gas/i386/align-branch-4b.d
 create mode 100644 gas/testsuite/gas/i386/align-branch-5.d
 create mode 100644 gas/testsuite/gas/i386/align-branch-5.s
 create mode 100644 gas/testsuite/gas/i386/align-branch-6.d
 create mode 100644 gas/testsuite/gas/i386/align-branch-6.e
 create mode 100644 gas/testsuite/gas/i386/align-branch-6.s
 create mode 100644 gas/testsuite/gas/i386/align-branch-7.d
 create mode 100644 gas/testsuite/gas/i386/align-branch-7.s
 create mode 100644 gas/testsuite/gas/i386/align-branch-8.d
 create mode 100644 gas/testsuite/gas/i386/align-branch-8.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-align-branch-1.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-align-branch-1a.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-align-branch-1b.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-align-branch-1c.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-align-branch-1d.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-align-branch-1e.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-align-branch-1f.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-align-branch-1g.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-align-branch-1h.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-align-branch-1i.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-align-branch-2.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-align-branch-2a.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-align-branch-2b.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-align-branch-2c.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-align-branch-3.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-align-branch-3.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-align-branch-4.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-align-branch-4a.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-align-branch-4b.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-align-branch-5.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-align-branch-6.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-align-branch-7.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-align-branch-7.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-align-branch-8.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-align-branch-8.s
 create mode 100644 ld/testsuite/ld-i386/align-branch-1.d
 create mode 100644 ld/testsuite/ld-i386/align-branch-1.s
 create mode 100644 ld/testsuite/ld-x86-64/align-branch-1.d
 create mode 100644 ld/testsuite/ld-x86-64/align-branch-1.s

diff --git a/gas/testsuite/gas/i386/align-branch-1.s b/gas/testsuite/gas/i386/align-branch-1.s
new file mode 100644
index 0000000000..06bf98a98d
--- /dev/null
+++ b/gas/testsuite/gas/i386/align-branch-1.s
@@ -0,0 +1,72 @@
+  .text
+  .globl  foo
+  .p2align  4
+foo:
+  movl %eax, %gs:0x1
+  pushl  %ebp
+  pushl  %ebp
+  pushl  %ebp
+  pushl  %ebp
+  movl  %esp, %ebp
+  movl  %edi, -8(%ebp)
+  movl  %esi, -12(%ebp)
+  movl  %esi, -12(%ebp)
+  movl  %esi, -12(%ebp)
+  movl  %esi, -12(%ebp)
+  movl  %esi, -12(%ebp)
+  cmp  %eax, %ebp
+  je  .L_2
+  movl  %esi, -12(%ebx)
+  movl  %esi, -12(%ebp)
+  movl  %edi, -8(%ebp)
+  movl  %esi, -12(%ebp)
+  movl  %esi, -12(%ebp)
+  movl  %esi, -12(%ebp)
+  movl  %esi, -12(%ebp)
+  movl  %esi, -12(%ebp)
+  popl  %ebp
+  popl  %ebp
+  popl  %ebp
+  je  .L_2
+  popl  %ebp
+  je  .L_2
+  movl  %eax, -4(%esp)
+  movl  %esi, -12(%ebp)
+  movl  %edi, -8(%ebp)
+  movl  %esi, -12(%ebp)
+  movl  %esi, -12(%ebp)
+  movl  %esi, -12(%ebp)
+  movl  %esi, -12(%ebp)
+  movl  %esi, -12(%ebp)
+  popl  %ebp
+  jmp  .L_3
+  jmp  .L_3
+  jmp  .L_3
+  movl  %eax, -4(%ebp)
+  movl  %esi, -12(%ebp)
+  movl  %edi, -8(%ebp)
+  movl  %esi, -12(%ebp)
+  movl  %esi, -12(%ebp)
+  movl  %esi, -12(%ebp)
+  popl  %ebp
+  popl  %ebp
+  cmp  %eax, %ebp
+  je  .L_2
+  jmp  .L_3
+.L_2:
+  movl  -12(%ebp), %eax
+  movl  %eax, -4(%ebp)
+.L_3:
+  movl  %esi, -1200(%ebp)
+  movl  %esi, -1200(%ebp)
+  movl  %esi, -1200(%ebp)
+  movl  %esi, -1200(%ebp)
+  movl  %esi, 12(%ebp)
+  jmp  bar
+  movl  %esi, -1200(%ebp)
+  movl  %esi, -1200(%ebp)
+  movl  %esi, -1200(%ebp)
+  movl  %esi, -1200(%ebp)
+  movl  %esi, (%ebp)
+  je .L_3
+  je .L_3
diff --git a/gas/testsuite/gas/i386/align-branch-1a.d b/gas/testsuite/gas/i386/align-branch-1a.d
new file mode 100644
index 0000000000..46b79216ec
--- /dev/null
+++ b/gas/testsuite/gas/i386/align-branch-1a.d
@@ -0,0 +1,77 @@
+#source: align-branch-1.s
+#as: -malign-branch-boundary=32
+#objdump: -dw
+
+.*: +file format .*
+
+Disassembly of section .text:
+
+0+ <foo>:
+   0: 65 65 65 a3 01 00 00 00 gs gs mov %eax,%gs:0x1
+   8: 55                   push   %ebp
+   9: 55                   push   %ebp
+   a: 55                   push   %ebp
+   b: 55                   push   %ebp
+   c: 89 e5                 mov    %esp,%ebp
+   e: 89 7d f8             mov    %edi,-0x8\(%ebp\)
+  11: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  14: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  17: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  1a: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  1d: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  20: 39 c5                 cmp    %eax,%ebp
+  22: 74 5e                 je     82 <foo\+0x82>
+  24: 3e 89 73 f4           mov    %esi,%ds:-0xc\(%ebx\)
+  28: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  2b: 89 7d f8             mov    %edi,-0x8\(%ebp\)
+  2e: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  31: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  34: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  37: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  3a: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  3d: 5d                   pop    %ebp
+  3e: 5d                   pop    %ebp
+  3f: 5d                   pop    %ebp
+  40: 74 40                 je     82 <foo\+0x82>
+  42: 5d                   pop    %ebp
+  43: 74 3d                 je     82 <foo\+0x82>
+  45: 36 89 44 24 fc       mov    %eax,%ss:-0x4\(%esp\)
+  4a: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  4d: 89 7d f8             mov    %edi,-0x8\(%ebp\)
+  50: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  53: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  56: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  59: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  5c: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  5f: 5d                   pop    %ebp
+  60: eb 26                 jmp    88 <foo\+0x88>
+  62: eb 24                 jmp    88 <foo\+0x88>
+  64: eb 22                 jmp    88 <foo\+0x88>
+  66: 89 45 fc             mov    %eax,-0x4\(%ebp\)
+  69: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  6c: 89 7d f8             mov    %edi,-0x8\(%ebp\)
+  6f: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  72: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  75: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  78: 5d                   pop    %ebp
+  79: 5d                   pop    %ebp
+  7a: 39 c5                 cmp    %eax,%ebp
+  7c: 74 04                 je     82 <foo\+0x82>
+  7e: 66 90                 xchg   %ax,%ax
+  80: eb 06                 jmp    88 <foo\+0x88>
+  82: 8b 45 f4             mov    -0xc\(%ebp\),%eax
+  85: 89 45 fc             mov    %eax,-0x4\(%ebp\)
+  88: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  8e: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  94: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  9a: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  a0: 89 75 0c             mov    %esi,0xc\(%ebp\)
+  a3: e9 [0-9a-f ]+       jmp    .*
+  a8: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  ae: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  b4: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  ba: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  c0: 89 75 00             mov    %esi,0x0\(%ebp\)
+  c3: 74 c3                 je     88 <foo\+0x88>
+  c5: 74 c1                 je     88 <foo\+0x88>
+#pass
diff --git a/gas/testsuite/gas/i386/align-branch-1b.d b/gas/testsuite/gas/i386/align-branch-1b.d
new file mode 100644
index 0000000000..b3f0e727bc
--- /dev/null
+++ b/gas/testsuite/gas/i386/align-branch-1b.d
@@ -0,0 +1,77 @@
+#source: align-branch-1.s
+#as: -malign-branch-boundary=32 -malign-branch=fused+jcc+jmp
+#objdump: -dw
+
+.*: +file format .*
+
+Disassembly of section .text:
+
+0+ <foo>:
+   0: 65 65 65 a3 01 00 00 00 gs gs mov %eax,%gs:0x1
+   8: 55                   push   %ebp
+   9: 55                   push   %ebp
+   a: 55                   push   %ebp
+   b: 55                   push   %ebp
+   c: 89 e5                 mov    %esp,%ebp
+   e: 89 7d f8             mov    %edi,-0x8\(%ebp\)
+  11: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  14: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  17: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  1a: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  1d: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  20: 39 c5                 cmp    %eax,%ebp
+  22: 74 5e                 je     82 <foo\+0x82>
+  24: 3e 89 73 f4           mov    %esi,%ds:-0xc\(%ebx\)
+  28: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  2b: 89 7d f8             mov    %edi,-0x8\(%ebp\)
+  2e: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  31: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  34: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  37: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  3a: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  3d: 5d                   pop    %ebp
+  3e: 5d                   pop    %ebp
+  3f: 5d                   pop    %ebp
+  40: 74 40                 je     82 <foo\+0x82>
+  42: 5d                   pop    %ebp
+  43: 74 3d                 je     82 <foo\+0x82>
+  45: 36 89 44 24 fc       mov    %eax,%ss:-0x4\(%esp\)
+  4a: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  4d: 89 7d f8             mov    %edi,-0x8\(%ebp\)
+  50: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  53: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  56: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  59: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  5c: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  5f: 5d                   pop    %ebp
+  60: eb 26                 jmp    88 <foo\+0x88>
+  62: eb 24                 jmp    88 <foo\+0x88>
+  64: eb 22                 jmp    88 <foo\+0x88>
+  66: 89 45 fc             mov    %eax,-0x4\(%ebp\)
+  69: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  6c: 89 7d f8             mov    %edi,-0x8\(%ebp\)
+  6f: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  72: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  75: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  78: 5d                   pop    %ebp
+  79: 5d                   pop    %ebp
+  7a: 39 c5                 cmp    %eax,%ebp
+  7c: 74 04                 je     82 <foo\+0x82>
+  7e: 66 90                 xchg   %ax,%ax
+  80: eb 06                 jmp    88 <foo\+0x88>
+  82: 8b 45 f4             mov    -0xc\(%ebp\),%eax
+  85: 89 45 fc             mov    %eax,-0x4\(%ebp\)
+  88: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  8e: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  94: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  9a: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  a0: 89 75 0c             mov    %esi,0xc\(%ebp\)
+  a3: e9 [0-9a-f ]+       jmp    .*
+  a8: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  ae: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  b4: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  ba: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  c0: 89 75 00             mov    %esi,0x0\(%ebp\)
+  c3: 74 c3                 je     88 <foo\+0x88>
+  c5: 74 c1                 je     88 <foo\+0x88>
+#pass
diff --git a/gas/testsuite/gas/i386/align-branch-1c.d b/gas/testsuite/gas/i386/align-branch-1c.d
new file mode 100644
index 0000000000..947dcc8785
--- /dev/null
+++ b/gas/testsuite/gas/i386/align-branch-1c.d
@@ -0,0 +1,77 @@
+#source: align-branch-1.s
+#as: -malign-branch-boundary=32 -malign-branch-prefix-size=1
+#objdump: -dw
+
+.*: +file format .*
+
+Disassembly of section .text:
+
+0+ <foo>:
+   0: 65 a3 01 00 00 00     mov    %eax,%gs:0x1
+   6: 3e 55                 ds push %ebp
+   8: 3e 55                 ds push %ebp
+   a: 55                   push   %ebp
+   b: 55                   push   %ebp
+   c: 89 e5                 mov    %esp,%ebp
+   e: 89 7d f8             mov    %edi,-0x8\(%ebp\)
+  11: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  14: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  17: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  1a: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  1d: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  20: 39 c5                 cmp    %eax,%ebp
+  22: 74 5e                 je     82 <foo\+0x82>
+  24: 3e 89 73 f4           mov    %esi,%ds:-0xc\(%ebx\)
+  28: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  2b: 89 7d f8             mov    %edi,-0x8\(%ebp\)
+  2e: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  31: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  34: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  37: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  3a: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  3d: 5d                   pop    %ebp
+  3e: 5d                   pop    %ebp
+  3f: 5d                   pop    %ebp
+  40: 74 40                 je     82 <foo\+0x82>
+  42: 5d                   pop    %ebp
+  43: 74 3d                 je     82 <foo\+0x82>
+  45: 36 89 44 24 fc       mov    %eax,%ss:-0x4\(%esp\)
+  4a: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  4d: 89 7d f8             mov    %edi,-0x8\(%ebp\)
+  50: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  53: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  56: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  59: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  5c: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  5f: 5d                   pop    %ebp
+  60: eb 26                 jmp    88 <foo\+0x88>
+  62: eb 24                 jmp    88 <foo\+0x88>
+  64: eb 22                 jmp    88 <foo\+0x88>
+  66: 89 45 fc             mov    %eax,-0x4\(%ebp\)
+  69: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  6c: 89 7d f8             mov    %edi,-0x8\(%ebp\)
+  6f: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  72: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  75: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  78: 5d                   pop    %ebp
+  79: 5d                   pop    %ebp
+  7a: 39 c5                 cmp    %eax,%ebp
+  7c: 74 04                 je     82 <foo\+0x82>
+  7e: 66 90                 xchg   %ax,%ax
+  80: eb 06                 jmp    88 <foo\+0x88>
+  82: 8b 45 f4             mov    -0xc\(%ebp\),%eax
+  85: 89 45 fc             mov    %eax,-0x4\(%ebp\)
+  88: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  8e: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  94: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  9a: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  a0: 89 75 0c             mov    %esi,0xc\(%ebp\)
+  a3: e9 [0-9a-f ]+       jmp    .*
+  a8: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  ae: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  b4: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  ba: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  c0: 89 75 00             mov    %esi,0x0\(%ebp\)
+  c3: 74 c3                 je     88 <foo\+0x88>
+  c5: 74 c1                 je     88 <foo\+0x88>
+#pass
diff --git a/gas/testsuite/gas/i386/align-branch-1d.d b/gas/testsuite/gas/i386/align-branch-1d.d
new file mode 100644
index 0000000000..db62f0819d
--- /dev/null
+++ b/gas/testsuite/gas/i386/align-branch-1d.d
@@ -0,0 +1,76 @@
+#source: align-branch-1.s
+#as: -malign-branch-boundary=32 -malign-branch=fused+jcc
+#objdump: -dw
+
+.*: +file format .*
+
+Disassembly of section .text:
+
+0+ <foo>:
+   0: 65 65 65 a3 01 00 00 00 gs gs mov %eax,%gs:0x1
+   8: 55                   push   %ebp
+   9: 55                   push   %ebp
+   a: 55                   push   %ebp
+   b: 55                   push   %ebp
+   c: 89 e5                 mov    %esp,%ebp
+   e: 89 7d f8             mov    %edi,-0x8\(%ebp\)
+  11: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  14: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  17: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  1a: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  1d: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  20: 39 c5                 cmp    %eax,%ebp
+  22: 74 5b                 je     7f <foo\+0x7f>
+  24: 3e 89 73 f4           mov    %esi,%ds:-0xc\(%ebx\)
+  28: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  2b: 89 7d f8             mov    %edi,-0x8\(%ebp\)
+  2e: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  31: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  34: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  37: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  3a: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  3d: 5d                   pop    %ebp
+  3e: 5d                   pop    %ebp
+  3f: 5d                   pop    %ebp
+  40: 74 3d                 je     7f <foo\+0x7f>
+  42: 5d                   pop    %ebp
+  43: 74 3a                 je     7f <foo\+0x7f>
+  45: 89 44 24 fc           mov    %eax,-0x4\(%esp\)
+  49: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  4c: 89 7d f8             mov    %edi,-0x8\(%ebp\)
+  4f: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  52: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  55: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  58: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  5b: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  5e: 5d                   pop    %ebp
+  5f: eb 24                 jmp    85 <foo\+0x85>
+  61: eb 22                 jmp    85 <foo\+0x85>
+  63: eb 20                 jmp    85 <foo\+0x85>
+  65: 89 45 fc             mov    %eax,-0x4\(%ebp\)
+  68: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  6b: 89 7d f8             mov    %edi,-0x8\(%ebp\)
+  6e: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  71: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  74: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  77: 5d                   pop    %ebp
+  78: 5d                   pop    %ebp
+  79: 39 c5                 cmp    %eax,%ebp
+  7b: 74 02                 je     7f <foo\+0x7f>
+  7d: eb 06                 jmp    85 <foo\+0x85>
+  7f: 8b 45 f4             mov    -0xc\(%ebp\),%eax
+  82: 89 45 fc             mov    %eax,-0x4\(%ebp\)
+  85: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  8b: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  91: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  97: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  9d: 89 75 0c             mov    %esi,0xc\(%ebp\)
+  a0: e9 [0-9a-f ]+       jmp    .*
+  a5: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  ab: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  b1: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  b7: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  bd: 89 75 00             mov    %esi,0x0\(%ebp\)
+  c0: 74 c3                 je     85 <foo\+0x85>
+  c2: 74 c1                 je     85 <foo\+0x85>
+#pass
diff --git a/gas/testsuite/gas/i386/align-branch-1e.d b/gas/testsuite/gas/i386/align-branch-1e.d
new file mode 100644
index 0000000000..dafbee13f1
--- /dev/null
+++ b/gas/testsuite/gas/i386/align-branch-1e.d
@@ -0,0 +1,77 @@
+#source: align-branch-1.s
+#as: -malign-branch-boundary=32 -malign-branch=jcc
+#objdump: -dw
+
+.*: +file format .*
+
+Disassembly of section .text:
+
+0+ <foo>:
+   0: 65 a3 01 00 00 00     mov    %eax,%gs:0x1
+   6: 55                   push   %ebp
+   7: 55                   push   %ebp
+   8: 55                   push   %ebp
+   9: 55                   push   %ebp
+   a: 89 e5                 mov    %esp,%ebp
+   c: 89 7d f8             mov    %edi,-0x8\(%ebp\)
+   f: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  12: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  15: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  18: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  1b: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  1e: 39 c5                 cmp    %eax,%ebp
+  20: 74 5a                 je     7c <foo\+0x7c>
+  22: 89 73 f4             mov    %esi,-0xc\(%ebx\)
+  25: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  28: 89 7d f8             mov    %edi,-0x8\(%ebp\)
+  2b: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  2e: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  31: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  34: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  37: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  3a: 5d                   pop    %ebp
+  3b: 5d                   pop    %ebp
+  3c: 5d                   pop    %ebp
+  3d: 74 3d                 je     7c <foo\+0x7c>
+  3f: 5d                   pop    %ebp
+  40: 74 3a                 je     7c <foo\+0x7c>
+  42: 89 44 24 fc           mov    %eax,-0x4\(%esp\)
+  46: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  49: 89 7d f8             mov    %edi,-0x8\(%ebp\)
+  4c: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  4f: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  52: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  55: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  58: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  5b: 5d                   pop    %ebp
+  5c: eb 24                 jmp    82 <foo\+0x82>
+  5e: eb 22                 jmp    82 <foo\+0x82>
+  60: eb 20                 jmp    82 <foo\+0x82>
+  62: 89 45 fc             mov    %eax,-0x4\(%ebp\)
+  65: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  68: 89 7d f8             mov    %edi,-0x8\(%ebp\)
+  6b: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  6e: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  71: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  74: 5d                   pop    %ebp
+  75: 5d                   pop    %ebp
+  76: 39 c5                 cmp    %eax,%ebp
+  78: 74 02                 je     7c <foo\+0x7c>
+  7a: eb 06                 jmp    82 <foo\+0x82>
+  7c: 8b 45 f4             mov    -0xc\(%ebp\),%eax
+  7f: 89 45 fc             mov    %eax,-0x4\(%ebp\)
+  82: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  88: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  8e: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  94: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  9a: 89 75 0c             mov    %esi,0xc\(%ebp\)
+  9d: e9 [0-9a-f ]+       jmp    .*
+  a2: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  a8: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  ae: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  b4: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  ba: 89 75 00             mov    %esi,0x0\(%ebp\)
+  bd: 74 c3                 je     82 <foo\+0x82>
+  bf: 90                   nop
+  c0: 74 c0                 je     82 <foo\+0x82>
+#pass
diff --git a/gas/testsuite/gas/i386/align-branch-1f.d b/gas/testsuite/gas/i386/align-branch-1f.d
new file mode 100644
index 0000000000..bf197c979b
--- /dev/null
+++ b/gas/testsuite/gas/i386/align-branch-1f.d
@@ -0,0 +1,77 @@
+#source: align-branch-1.s
+#as: -malign-branch-boundary=32 -malign-branch=jcc+jmp
+#objdump: -dw
+
+.*: +file format .*
+
+Disassembly of section .text:
+
+0+ <foo>:
+   0: 65 a3 01 00 00 00     mov    %eax,%gs:0x1
+   6: 55                   push   %ebp
+   7: 55                   push   %ebp
+   8: 55                   push   %ebp
+   9: 55                   push   %ebp
+   a: 89 e5                 mov    %esp,%ebp
+   c: 89 7d f8             mov    %edi,-0x8\(%ebp\)
+   f: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  12: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  15: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  18: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  1b: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  1e: 39 c5                 cmp    %eax,%ebp
+  20: 74 5c                 je     7e <foo\+0x7e>
+  22: 89 73 f4             mov    %esi,-0xc\(%ebx\)
+  25: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  28: 89 7d f8             mov    %edi,-0x8\(%ebp\)
+  2b: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  2e: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  31: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  34: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  37: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  3a: 5d                   pop    %ebp
+  3b: 5d                   pop    %ebp
+  3c: 5d                   pop    %ebp
+  3d: 74 3f                 je     7e <foo\+0x7e>
+  3f: 5d                   pop    %ebp
+  40: 74 3c                 je     7e <foo\+0x7e>
+  42: 89 44 24 fc           mov    %eax,-0x4\(%esp\)
+  46: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  49: 89 7d f8             mov    %edi,-0x8\(%ebp\)
+  4c: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  4f: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  52: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  55: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  58: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  5b: 5d                   pop    %ebp
+  5c: eb 27                 jmp    85 <foo\+0x85>
+  5e: 66 90                 xchg   %ax,%ax
+  60: eb 23                 jmp    85 <foo\+0x85>
+  62: eb 21                 jmp    85 <foo\+0x85>
+  64: 89 45 fc             mov    %eax,-0x4\(%ebp\)
+  67: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  6a: 89 7d f8             mov    %edi,-0x8\(%ebp\)
+  6d: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  70: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  73: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  76: 5d                   pop    %ebp
+  77: 5d                   pop    %ebp
+  78: 39 c5                 cmp    %eax,%ebp
+  7a: 74 02                 je     7e <foo\+0x7e>
+  7c: eb 07                 jmp    85 <foo\+0x85>
+  7e: 36 8b 45 f4           mov    %ss:-0xc\(%ebp\),%eax
+  82: 89 45 fc             mov    %eax,-0x4\(%ebp\)
+  85: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  8b: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  91: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  97: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  9d: 89 75 0c             mov    %esi,0xc\(%ebp\)
+  a0: e9 [0-9a-f ]+       jmp    .*
+  a5: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  ab: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  b1: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  b7: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  bd: 89 75 00             mov    %esi,0x0\(%ebp\)
+  c0: 74 c3                 je     85 <foo\+0x85>
+  c2: 74 c1                 je     85 <foo\+0x85>
+#pass
diff --git a/gas/testsuite/gas/i386/align-branch-1g.d b/gas/testsuite/gas/i386/align-branch-1g.d
new file mode 100644
index 0000000000..6cae2cd5f4
--- /dev/null
+++ b/gas/testsuite/gas/i386/align-branch-1g.d
@@ -0,0 +1,77 @@
+#source: align-branch-1.s
+#as: -mbranches-within-32B-boundaries
+#objdump: -dw
+
+.*: +file format .*
+
+Disassembly of section .text:
+
+0+ <foo>:
+   0: 65 65 65 a3 01 00 00 00 gs gs mov %eax,%gs:0x1
+   8: 55                   push   %ebp
+   9: 55                   push   %ebp
+   a: 55                   push   %ebp
+   b: 55                   push   %ebp
+   c: 89 e5                 mov    %esp,%ebp
+   e: 89 7d f8             mov    %edi,-0x8\(%ebp\)
+  11: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  14: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  17: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  1a: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  1d: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  20: 39 c5                 cmp    %eax,%ebp
+  22: 74 5e                 je     82 <foo\+0x82>
+  24: 3e 89 73 f4           mov    %esi,%ds:-0xc\(%ebx\)
+  28: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  2b: 89 7d f8             mov    %edi,-0x8\(%ebp\)
+  2e: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  31: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  34: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  37: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  3a: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  3d: 5d                   pop    %ebp
+  3e: 5d                   pop    %ebp
+  3f: 5d                   pop    %ebp
+  40: 74 40                 je     82 <foo\+0x82>
+  42: 5d                   pop    %ebp
+  43: 74 3d                 je     82 <foo\+0x82>
+  45: 36 89 44 24 fc       mov    %eax,%ss:-0x4\(%esp\)
+  4a: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  4d: 89 7d f8             mov    %edi,-0x8\(%ebp\)
+  50: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  53: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  56: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  59: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  5c: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  5f: 5d                   pop    %ebp
+  60: eb 26                 jmp    88 <foo\+0x88>
+  62: eb 24                 jmp    88 <foo\+0x88>
+  64: eb 22                 jmp    88 <foo\+0x88>
+  66: 89 45 fc             mov    %eax,-0x4\(%ebp\)
+  69: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  6c: 89 7d f8             mov    %edi,-0x8\(%ebp\)
+  6f: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  72: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  75: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  78: 5d                   pop    %ebp
+  79: 5d                   pop    %ebp
+  7a: 39 c5                 cmp    %eax,%ebp
+  7c: 74 04                 je     82 <foo\+0x82>
+  7e: 66 90                 xchg   %ax,%ax
+  80: eb 06                 jmp    88 <foo\+0x88>
+  82: 8b 45 f4             mov    -0xc\(%ebp\),%eax
+  85: 89 45 fc             mov    %eax,-0x4\(%ebp\)
+  88: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  8e: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  94: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  9a: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  a0: 89 75 0c             mov    %esi,0xc\(%ebp\)
+  a3: e9 [0-9a-f ]+       jmp    .*
+  a8: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  ae: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  b4: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  ba: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  c0: 89 75 00             mov    %esi,0x0\(%ebp\)
+  c3: 74 c3                 je     88 <foo\+0x88>
+  c5: 74 c1                 je     88 <foo\+0x88>
+#pass
diff --git a/gas/testsuite/gas/i386/align-branch-1h.d b/gas/testsuite/gas/i386/align-branch-1h.d
new file mode 100644
index 0000000000..01871ee98f
--- /dev/null
+++ b/gas/testsuite/gas/i386/align-branch-1h.d
@@ -0,0 +1,76 @@
+#source: align-branch-1.s
+#as: -mbranches-within-32B-boundaries -malign-branch-boundary=0
+#objdump: -dw
+
+.*: +file format .*
+
+Disassembly of section .text:
+
+0+ <foo>:
+   0: 65 a3 01 00 00 00     mov    %eax,%gs:0x1
+   6: 55                   push   %ebp
+   7: 55                   push   %ebp
+   8: 55                   push   %ebp
+   9: 55                   push   %ebp
+   a: 89 e5                 mov    %esp,%ebp
+   c: 89 7d f8             mov    %edi,-0x8\(%ebp\)
+   f: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  12: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  15: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  18: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  1b: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  1e: 39 c5                 cmp    %eax,%ebp
+  20: 74 5a                 je     7c <foo\+0x7c>
+  22: 89 73 f4             mov    %esi,-0xc\(%ebx\)
+  25: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  28: 89 7d f8             mov    %edi,-0x8\(%ebp\)
+  2b: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  2e: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  31: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  34: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  37: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  3a: 5d                   pop    %ebp
+  3b: 5d                   pop    %ebp
+  3c: 5d                   pop    %ebp
+  3d: 74 3d                 je     7c <foo\+0x7c>
+  3f: 5d                   pop    %ebp
+  40: 74 3a                 je     7c <foo\+0x7c>
+  42: 89 44 24 fc           mov    %eax,-0x4\(%esp\)
+  46: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  49: 89 7d f8             mov    %edi,-0x8\(%ebp\)
+  4c: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  4f: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  52: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  55: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  58: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  5b: 5d                   pop    %ebp
+  5c: eb 24                 jmp    82 <foo\+0x82>
+  5e: eb 22                 jmp    82 <foo\+0x82>
+  60: eb 20                 jmp    82 <foo\+0x82>
+  62: 89 45 fc             mov    %eax,-0x4\(%ebp\)
+  65: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  68: 89 7d f8             mov    %edi,-0x8\(%ebp\)
+  6b: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  6e: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  71: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  74: 5d                   pop    %ebp
+  75: 5d                   pop    %ebp
+  76: 39 c5                 cmp    %eax,%ebp
+  78: 74 02                 je     7c <foo\+0x7c>
+  7a: eb 06                 jmp    82 <foo\+0x82>
+  7c: 8b 45 f4             mov    -0xc\(%ebp\),%eax
+  7f: 89 45 fc             mov    %eax,-0x4\(%ebp\)
+  82: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  88: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  8e: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  94: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  9a: 89 75 0c             mov    %esi,0xc\(%ebp\)
+  9d: e9 [0-9a-f ]+       jmp    .*
+  a2: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  a8: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  ae: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  b4: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  ba: 89 75 00             mov    %esi,0x0\(%ebp\)
+  bd: 74 c3                 je     82 <foo\+0x82>
+  bf: 74 c1                 je     82 <foo\+0x82>
+#pass
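
A note for readers comparing align-branch-1g.d (alignment on) with align-branch-1h.d (where -malign-branch-boundary=0 turns it off again): a branch is affected when its bytes cross a 32-byte boundary or end on one. The sketch below spot-checks a few offsets copied from the two dumps. The helper name `touches_boundary` is hypothetical and is not taken from tc-i386.c, and treating cmp+je as a single 4-byte unit is an assumption about how the fused pair is measured.

```python
def touches_boundary(addr: int, size: int, boundary: int = 32) -> bool:
    """Return True if the bytes [addr, addr + size) cross a boundary
    or if the last byte sits immediately before one (ends on it)."""
    return addr // boundary != (addr + size) // boundary


# (offset, encoded length) pairs copied from the expected dumps.
# align-branch-1h.d: no alignment is performed.
unaligned = {"cmp+je": (0x1E, 4), "second jmp": (0x5E, 2)}
# align-branch-1g.d: -mbranches-within-32B-boundaries.
aligned = {"cmp+je": (0x20, 4), "second jmp": (0x62, 2)}

for name, (addr, size) in unaligned.items():
    print(f"1h {name:10s} at {addr:#x}: touches={touches_boundary(addr, size)}")
for name, (addr, size) in aligned.items():
    print(f"1g {name:10s} at {addr:#x}: touches={touches_boundary(addr, size)}")
```

In the 1h offsets both branches touch the 0x20 or 0x60 boundary. In the 1g offsets neither does, which is what the added segment prefixes and the `xchg %ax,%ax` padding are for.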
diff --git a/gas/testsuite/gas/i386/align-branch-1i.d b/gas/testsuite/gas/i386/align-branch-1i.d
new file mode 100644
index 0000000000..e2cbc28cde
--- /dev/null
+++ b/gas/testsuite/gas/i386/align-branch-1i.d
@@ -0,0 +1,80 @@
+#source: align-branch-1.s
+#as: -malign-branch-boundary=32 -malign-branch-prefix-size=0
+#objdump: -dw
+
+.*: +file format .*
+
+Disassembly of section .text:
+
+0+ <foo>:
+   0: 65 a3 01 00 00 00     mov    %eax,%gs:0x1
+   6: 55                   push   %ebp
+   7: 55                   push   %ebp
+   8: 55                   push   %ebp
+   9: 55                   push   %ebp
+   a: 89 e5                 mov    %esp,%ebp
+   c: 89 7d f8             mov    %edi,-0x8\(%ebp\)
+   f: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  12: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  15: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  18: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  1b: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  1e: 66 90                 xchg   %ax,%ax
+  20: 39 c5                 cmp    %eax,%ebp
+  22: 74 5e                 je     82 <foo\+0x82>
+  24: 89 73 f4             mov    %esi,-0xc\(%ebx\)
+  27: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  2a: 89 7d f8             mov    %edi,-0x8\(%ebp\)
+  2d: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  30: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  33: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  36: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  39: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  3c: 5d                   pop    %ebp
+  3d: 5d                   pop    %ebp
+  3e: 5d                   pop    %ebp
+  3f: 90                   nop
+  40: 74 40                 je     82 <foo\+0x82>
+  42: 5d                   pop    %ebp
+  43: 74 3d                 je     82 <foo\+0x82>
+  45: 89 44 24 fc           mov    %eax,-0x4\(%esp\)
+  49: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  4c: 89 7d f8             mov    %edi,-0x8\(%ebp\)
+  4f: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  52: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  55: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  58: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  5b: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  5e: 5d                   pop    %ebp
+  5f: 90                   nop
+  60: eb 26                 jmp    88 <foo\+0x88>
+  62: eb 24                 jmp    88 <foo\+0x88>
+  64: eb 22                 jmp    88 <foo\+0x88>
+  66: 89 45 fc             mov    %eax,-0x4\(%ebp\)
+  69: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  6c: 89 7d f8             mov    %edi,-0x8\(%ebp\)
+  6f: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  72: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  75: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  78: 5d                   pop    %ebp
+  79: 5d                   pop    %ebp
+  7a: 39 c5                 cmp    %eax,%ebp
+  7c: 74 04                 je     82 <foo\+0x82>
+  7e: 66 90                 xchg   %ax,%ax
+  80: eb 06                 jmp    88 <foo\+0x88>
+  82: 8b 45 f4             mov    -0xc\(%ebp\),%eax
+  85: 89 45 fc             mov    %eax,-0x4\(%ebp\)
+  88: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  8e: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  94: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  9a: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  a0: 89 75 0c             mov    %esi,0xc\(%ebp\)
+  a3: e9 [0-9a-f ]+       jmp    .*
+  a8: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  ae: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  b4: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  ba: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%ebp\)
+  c0: 89 75 00             mov    %esi,0x0\(%ebp\)
+  c3: 74 c3                 je     88 <foo\+0x88>
+  c5: 74 c1                 je     88 <foo\+0x88>
+#pass
diff --git a/gas/testsuite/gas/i386/align-branch-2.s b/gas/testsuite/gas/i386/align-branch-2.s
new file mode 100644
index 0000000000..4a79bbb082
--- /dev/null
+++ b/gas/testsuite/gas/i386/align-branch-2.s
@@ -0,0 +1,49 @@
+  .text
+  .globl  foo
+  .p2align  4
+foo:
+  movl  %eax, %fs:0x1
+  pushl  %ebp
+  pushl  %ebp
+  pushl  %ebp
+  pushl  %ebp
+  movl  %esp, %ebp
+  movl  %esi, -12(%ebp)
+  movl  %esi, -12(%ebp)
+  movl  %esi, -12(%ebp)
+  movl  %esi, -12(%ebp)
+  movl  %esi, -12(%ebp)
+  movl  %esi, -12(%ebp)
+  jmp  *%eax
+  pushl  %ebp
+  pushl  %ebp
+  movl  %eax, %fs:0x1
+  movl  %esp, %ebp
+  movl  %esi, -12(%ebp)
+  movl  %esi, -12(%ebp)
+  movl  %esi, -12(%ebp)
+  movl  %esi, -12(%ebp)
+  movl  %esi, -12(%ebp)
+  movl  %esi, -12(%ebp)
+  call *%eax
+  movl  %esi, -12(%ebp)
+  pushl  %ebp
+  pushl  %ebp
+  movl  %eax, %fs:0x1
+  movl  %esp, %ebp
+  movl  %esi, -12(%ebp)
+  movl  %esi, -12(%ebp)
+  movl  %esi, -12(%ebp)
+  movl  %esi, -12(%ebp)
+  call  foo
+  movl  %esi, -12(%ebp)
+  pushl  %ebp
+  pushl  %ebp
+  pushl  %ebp
+  pushl  %ebp
+  movl  %eax, %fs:0x1
+  movl  %esp, %ebp
+  movl  %esi, -12(%ebp)
+  movl  %esi, -12(%ebp)
+  call  *foo
+  pushl  %ebp
diff --git a/gas/testsuite/gas/i386/align-branch-2a.d b/gas/testsuite/gas/i386/align-branch-2a.d
new file mode 100644
index 0000000000..cba0560d9c
--- /dev/null
+++ b/gas/testsuite/gas/i386/align-branch-2a.d
@@ -0,0 +1,55 @@
+#source: align-branch-2.s
+#as: -malign-branch-boundary=32 -malign-branch=fused+jcc+jmp
+#objdump: -dw
+
+.*: +file format .*
+
+Disassembly of section .text:
+
+0+ <foo>:
+   0: 64 a3 01 00 00 00     mov    %eax,%fs:0x1
+   6: 55                   push   %ebp
+   7: 55                   push   %ebp
+   8: 55                   push   %ebp
+   9: 55                   push   %ebp
+   a: 89 e5                 mov    %esp,%ebp
+   c: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+   f: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  12: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  15: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  18: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  1b: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  1e: ff e0                 jmp    \*%eax
+  20: 55                   push   %ebp
+  21: 55                   push   %ebp
+  22: 64 a3 01 00 00 00     mov    %eax,%fs:0x1
+  28: 89 e5                 mov    %esp,%ebp
+  2a: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  2d: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  30: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  33: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  36: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  39: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  3c: ff d0                 call   \*%eax
+  3e: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  41: 55                   push   %ebp
+  42: 55                   push   %ebp
+  43: 64 a3 01 00 00 00     mov    %eax,%fs:0x1
+  49: 89 e5                 mov    %esp,%ebp
+  4b: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  4e: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  51: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  54: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  57: e8 [0-9a-f ]+       call   .*
+  5c: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  5f: 55                   push   %ebp
+  60: 55                   push   %ebp
+  61: 55                   push   %ebp
+  62: 55                   push   %ebp
+  63: 64 a3 01 00 00 00     mov    %eax,%fs:0x1
+  69: 89 e5                 mov    %esp,%ebp
+  6b: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  6e: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  71: ff 15 00 00 00 00     call   \*0x0
+  77: 55                   push   %ebp
+#pass
diff --git a/gas/testsuite/gas/i386/align-branch-2b.d b/gas/testsuite/gas/i386/align-branch-2b.d
new file mode 100644
index 0000000000..7d879b6ba5
--- /dev/null
+++ b/gas/testsuite/gas/i386/align-branch-2b.d
@@ -0,0 +1,55 @@
+#source: align-branch-2.s
+#as: -malign-branch-boundary=32 -malign-branch=indirect
+#objdump: -dw
+
+.*: +file format .*
+
+Disassembly of section .text:
+
+0+ <foo>:
+   0: 64 64 64 a3 01 00 00 00 fs fs mov %eax,%fs:0x1
+   8: 55                   push   %ebp
+   9: 55                   push   %ebp
+   a: 55                   push   %ebp
+   b: 55                   push   %ebp
+   c: 89 e5                 mov    %esp,%ebp
+   e: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  11: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  14: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  17: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  1a: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  1d: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  20: ff e0                 jmp    \*%eax
+  22: 3e 3e 55             ds ds push %ebp
+  25: 55                   push   %ebp
+  26: 64 a3 01 00 00 00     mov    %eax,%fs:0x1
+  2c: 89 e5                 mov    %esp,%ebp
+  2e: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  31: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  34: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  37: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  3a: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  3d: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  40: ff d0                 call   \*%eax
+  42: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  45: 55                   push   %ebp
+  46: 55                   push   %ebp
+  47: 64 a3 01 00 00 00     mov    %eax,%fs:0x1
+  4d: 89 e5                 mov    %esp,%ebp
+  4f: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  52: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  55: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  58: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  5b: e8 [0-9a-f ]+       call   .*
+  60: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  63: 55                   push   %ebp
+  64: 55                   push   %ebp
+  65: 55                   push   %ebp
+  66: 55                   push   %ebp
+  67: 64 a3 01 00 00 00     mov    %eax,%fs:0x1
+  6d: 89 e5                 mov    %esp,%ebp
+  6f: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  72: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  75: ff 15 00 00 00 00     call   \*0x0
+  7b: 55                   push   %ebp
+#pass
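
The branch offsets expected in align-branch-2b.d can be reproduced with a small layout model. This is only a sketch under stated assumptions: it inserts the padding directly in front of the branch, whereas the dump shows the assembler spreading it as segment prefixes over earlier instructions, so only the branch offsets and the pad totals are comparable. The function name `layout` and the `(size, align)` tuples are illustrative, not part of gas.

```python
def layout(insns, boundary=32):
    """Place instructions sequentially from offset 0.

    insns is a list of (size, align) pairs. When align is true and the
    instruction would cross or end on a boundary, it is pushed to the
    next boundary. Returns a list of (offset, pad) pairs.
    """
    addr = 0
    placed = []
    for size, align in insns:
        pad = 0
        if align and addr // boundary != (addr + size) // boundary:
            pad = (-addr) % boundary  # bytes needed to reach the boundary
        addr += pad
        placed.append((addr, pad))
        addr += size
    return placed


# First two blocks of align-branch-2.s with -malign-branch=indirect:
# only the indirect jmp and the indirect call are alignment candidates.
insns = (
    [(6, False)] + [(1, False)] * 4 + [(2, False)] + [(3, False)] * 6
    + [(2, True)]                      # jmp *%eax
    + [(1, False)] * 2 + [(6, False)] + [(2, False)] + [(3, False)] * 6
    + [(2, True)]                      # call *%eax
)

placed = layout(insns)
jmp_offset, jmp_pad = placed[12]
call_offset, call_pad = placed[23]
print(hex(jmp_offset), jmp_pad, hex(call_offset), call_pad)
```

The model puts `jmp *%eax` at 0x20 and `call *%eax` at 0x40 with two bytes of padding each. The dump agrees: two extra `fs` prefixes on the first `mov` and `ds ds` on the `push` that follows the jump.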
diff --git a/gas/testsuite/gas/i386/align-branch-2c.d b/gas/testsuite/gas/i386/align-branch-2c.d
new file mode 100644
index 0000000000..2fc6339975
--- /dev/null
+++ b/gas/testsuite/gas/i386/align-branch-2c.d
@@ -0,0 +1,55 @@
+#source: align-branch-2.s
+#as: -malign-branch-boundary=32 -malign-branch=indirect+call
+#objdump: -dw
+
+.*: +file format .*
+
+Disassembly of section .text:
+
+0+ <foo>:
+   0: 64 64 64 a3 01 00 00 00 fs fs mov %eax,%fs:0x1
+   8: 55                   push   %ebp
+   9: 55                   push   %ebp
+   a: 55                   push   %ebp
+   b: 55                   push   %ebp
+   c: 89 e5                 mov    %esp,%ebp
+   e: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  11: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  14: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  17: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  1a: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  1d: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  20: ff e0                 jmp    \*%eax
+  22: 3e 3e 55             ds ds push %ebp
+  25: 55                   push   %ebp
+  26: 64 a3 01 00 00 00     mov    %eax,%fs:0x1
+  2c: 89 e5                 mov    %esp,%ebp
+  2e: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  31: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  34: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  37: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  3a: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  3d: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  40: ff d0                 call   \*%eax
+  42: 36 36 36 36 36 89 75 f4 ss ss ss ss mov %esi,%ss:-0xc\(%ebp\)
+  4a: 55                   push   %ebp
+  4b: 55                   push   %ebp
+  4c: 64 a3 01 00 00 00     mov    %eax,%fs:0x1
+  52: 89 e5                 mov    %esp,%ebp
+  54: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  57: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  5a: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  5d: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  60: e8 [0-9a-f ]+       call   .*
+  65: 36 36 36 36 36 89 75 f4 ss ss ss ss mov %esi,%ss:-0xc\(%ebp\)
+  6d: 3e 55                 ds push %ebp
+  6f: 55                   push   %ebp
+  70: 55                   push   %ebp
+  71: 55                   push   %ebp
+  72: 64 a3 01 00 00 00     mov    %eax,%fs:0x1
+  78: 89 e5                 mov    %esp,%ebp
+  7a: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  7d: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  80: ff 15 00 00 00 00     call   \*0x0
+  86: 55                   push   %ebp
+#pass
diff --git a/gas/testsuite/gas/i386/align-branch-3.d b/gas/testsuite/gas/i386/align-branch-3.d
new file mode 100644
index 0000000000..da31b6f503
--- /dev/null
+++ b/gas/testsuite/gas/i386/align-branch-3.d
@@ -0,0 +1,33 @@
+#as: -malign-branch-boundary=32 -malign-branch=indirect+call
+#objdump: -dw
+
+.*: +file format .*
+
+Disassembly of section .text:
+
+0+ <foo>:
+   0: 64 a3 01 00 00 00     mov    %eax,%fs:0x1
+   6: 55                   push   %ebp
+   7: 55                   push   %ebp
+   8: 55                   push   %ebp
+   9: 55                   push   %ebp
+   a: 89 e5                 mov    %esp,%ebp
+   c: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+   f: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  12: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  15: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  18: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  1b: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  1e: e8 fc ff ff ff       call   1f <foo\+0x1f>
+  23: 55                   push   %ebp
+  24: 55                   push   %ebp
+  25: 64 a3 01 00 00 00     mov    %eax,%fs:0x1
+  2b: 89 e5                 mov    %esp,%ebp
+  2d: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  30: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  33: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  36: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  39: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  3c: ff 91 00 00 00 00     call   \*0x0\(%ecx\)
+  42: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+#pass
diff --git a/gas/testsuite/gas/i386/align-branch-3.s b/gas/testsuite/gas/i386/align-branch-3.s
new file mode 100644
index 0000000000..e3e6c447c4
--- /dev/null
+++ b/gas/testsuite/gas/i386/align-branch-3.s
@@ -0,0 +1,28 @@
+  .text
+  .globl  foo
+  .p2align  4
+foo:
+  movl  %eax, %fs:0x1
+  pushl  %ebp
+  pushl  %ebp
+  pushl  %ebp
+  pushl  %ebp
+  movl  %esp, %ebp
+  movl  %esi, -12(%ebp)
+  movl  %esi, -12(%ebp)
+  movl  %esi, -12(%ebp)
+  movl  %esi, -12(%ebp)
+  movl  %esi, -12(%ebp)
+  movl  %esi, -12(%ebp)
+  call ___tls_get_addr
+  pushl  %ebp
+  pushl  %ebp
+  movl  %eax, %fs:0x1
+  movl  %esp, %ebp
+  movl  %esi, -12(%ebp)
+  movl  %esi, -12(%ebp)
+  movl  %esi, -12(%ebp)
+  movl  %esi, -12(%ebp)
+  movl  %esi, -12(%ebp)
+  call *___tls_get_addr@GOT(%ecx)
+  movl  %esi, -12(%ebp)
diff --git a/gas/testsuite/gas/i386/align-branch-4.s b/gas/testsuite/gas/i386/align-branch-4.s
new file mode 100644
index 0000000000..34ff361a7e
--- /dev/null
+++ b/gas/testsuite/gas/i386/align-branch-4.s
@@ -0,0 +1,30 @@
+  .text
+  .globl  foo
+  .p2align  4
+foo:
+  movl  %eax, %fs:0x1
+  pushl  %ebp
+  pushl  %ebp
+  pushl  %ebp
+  pushl  %ebp
+  pushl  %ebp
+  movl  %esp, %ebp
+  movl  %esi, -12(%ebp)
+  movl  %esi, -12(%ebp)
+  movl  %esi, -12(%ebp)
+  movl  %esi, -12(%ebp)
+  movl  %esi, -12(%ebp)
+  movl  %esi, -12(%ebp)
+  ret
+  pushl  %ebp
+  pushl  %ebp
+  movl  %eax, %fs:0x1
+  movl  %esp, %ebp
+  movl  %esi, -12(%ebp)
+  movl  %esi, -12(%ebp)
+  movl  %esi, -12(%ebp)
+  movl  %esi, -12(%ebp)
+  movl  %esi, -12(%ebp)
+  movl  %esi, -12(%ebp)
+  ret $30
+  movl  %esi, -12(%ebp)
diff --git a/gas/testsuite/gas/i386/align-branch-4a.d b/gas/testsuite/gas/i386/align-branch-4a.d
new file mode 100644
index 0000000000..2b1e0b1f45
--- /dev/null
+++ b/gas/testsuite/gas/i386/align-branch-4a.d
@@ -0,0 +1,36 @@
+#source: align-branch-4.s
+#as: -malign-branch-boundary=32 -malign-branch=fused+jcc+jmp
+#objdump: -dw
+
+.*: +file format .*
+
+Disassembly of section .text:
+
+0+ <foo>:
+   0: 64 a3 01 00 00 00     mov    %eax,%fs:0x1
+   6: 55                   push   %ebp
+   7: 55                   push   %ebp
+   8: 55                   push   %ebp
+   9: 55                   push   %ebp
+   a: 55                   push   %ebp
+   b: 89 e5                 mov    %esp,%ebp
+   d: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  10: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  13: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  16: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  19: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  1c: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  1f: c3                   ret    
+  20: 55                   push   %ebp
+  21: 55                   push   %ebp
+  22: 64 a3 01 00 00 00     mov    %eax,%fs:0x1
+  28: 89 e5                 mov    %esp,%ebp
+  2a: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  2d: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  30: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  33: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  36: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  39: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  3c: c2 1e 00             ret    \$0x1e
+  3f: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+#pass
diff --git a/gas/testsuite/gas/i386/align-branch-4b.d b/gas/testsuite/gas/i386/align-branch-4b.d
new file mode 100644
index 0000000000..c7690d36aa
--- /dev/null
+++ b/gas/testsuite/gas/i386/align-branch-4b.d
@@ -0,0 +1,36 @@
+#source: align-branch-4.s
+#as: -malign-branch-boundary=32 -malign-branch=ret
+#objdump: -dw
+
+.*: +file format .*
+
+Disassembly of section .text:
+
+0+ <foo>:
+   0: 64 64 a3 01 00 00 00 fs mov %eax,%fs:0x1
+   7: 55                   push   %ebp
+   8: 55                   push   %ebp
+   9: 55                   push   %ebp
+   a: 55                   push   %ebp
+   b: 55                   push   %ebp
+   c: 89 e5                 mov    %esp,%ebp
+   e: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  11: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  14: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  17: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  1a: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  1d: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  20: c3                   ret    
+  21: 3e 3e 3e 55           ds ds ds push %ebp
+  25: 55                   push   %ebp
+  26: 64 a3 01 00 00 00     mov    %eax,%fs:0x1
+  2c: 89 e5                 mov    %esp,%ebp
+  2e: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  31: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  34: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  37: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  3a: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  3d: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+  40: c2 1e 00             ret    \$0x1e
+  43: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+#pass
diff --git a/gas/testsuite/gas/i386/align-branch-5.d b/gas/testsuite/gas/i386/align-branch-5.d
new file mode 100644
index 0000000000..1f114272ec
--- /dev/null
+++ b/gas/testsuite/gas/i386/align-branch-5.d
@@ -0,0 +1,36 @@
+#as: -malign-branch-boundary=32 -malign-branch=jcc+fused+jmp
+#objdump: -dw
+
+.*: +file format .*
+
+Disassembly of section .text:
+
+0+ <foo>:
+   0: c1 e9 02             shr    \$0x2,%ecx
+   3: c1 e9 02             shr    \$0x2,%ecx
+   6: c1 e9 02             shr    \$0x2,%ecx
+   9: 89 d1                 mov    %edx,%ecx
+   b: 31 c0                 xor    %eax,%eax
+   d: c1 e9 02             shr    \$0x2,%ecx
+  10: c1 e9 02             shr    \$0x2,%ecx
+  13: c1 e9 02             shr    \$0x2,%ecx
+  16: c1 e9 02             shr    \$0x2,%ecx
+  19: c1 e9 02             shr    \$0x2,%ecx
+  1c: c1 e9 02             shr    \$0x2,%ecx
+  1f: f6 c2 02             test   \$0x2,%dl
+  22: f3 ab                 rep stos %eax,%es:\(%edi\)
+  24: 75 dd                 jne    3 <foo\+0x3>
+  26: 31 c0                 xor    %eax,%eax
+  28: c1 e9 02             shr    \$0x2,%ecx
+  2b: c1 e9 02             shr    \$0x2,%ecx
+  2e: c1 e9 02             shr    \$0x2,%ecx
+  31: 89 d1                 mov    %edx,%ecx
+  33: 31 c0                 xor    %eax,%eax
+  35: c1 e9 02             shr    \$0x2,%ecx
+  38: c1 e9 02             shr    \$0x2,%ecx
+  3b: c1 e9 02             shr    \$0x2,%ecx
+  3e: f6 c2 02             test   \$0x2,%dl
+  41: e8 [0-9a-f ]+       call   .*
+  46: 75 e3                 jne    2b <foo\+0x2b>
+  48: 31 c0                 xor    %eax,%eax
+#pass
diff --git a/gas/testsuite/gas/i386/align-branch-5.s b/gas/testsuite/gas/i386/align-branch-5.s
new file mode 100644
index 0000000000..58e3b91691
--- /dev/null
+++ b/gas/testsuite/gas/i386/align-branch-5.s
@@ -0,0 +1,32 @@
+ .text
+ .p2align 4,,15
+foo:
+ shrl $2, %ecx
+.L1:
+ shrl $2, %ecx
+ shrl $2, %ecx
+ movl %edx, %ecx
+ xorl %eax, %eax
+ shrl $2, %ecx
+ shrl $2, %ecx
+ shrl $2, %ecx
+ shrl $2, %ecx
+ shrl $2, %ecx
+ shrl $2, %ecx
+ testb $2, %dl
+ rep stosl
+ jne .L1
+ xorl %eax, %eax
+ shrl $2, %ecx
+.L2:
+ shrl $2, %ecx
+ shrl $2, %ecx
+ movl %edx, %ecx
+ xorl %eax, %eax
+ shrl $2, %ecx
+ shrl $2, %ecx
+ shrl $2, %ecx
+ testb $2, %dl
+ call bar
+ jne .L2
+ xorl %eax, %eax
diff --git a/gas/testsuite/gas/i386/align-branch-6.d b/gas/testsuite/gas/i386/align-branch-6.d
new file mode 100644
index 0000000000..29e27878f4
--- /dev/null
+++ b/gas/testsuite/gas/i386/align-branch-6.d
@@ -0,0 +1,22 @@
+#as: -malign-branch-boundary=32 -D
+#objdump: -dw
+#warning_output: align-branch-6.e
+
+.*: +file format .*
+
+Disassembly of section .text:
+
+0+ <_start>:
+ +[a-f0-9]+: eb 3c                 jmp    3e <_start\+0x3e>
+ +[a-f0-9]+: 8d b4 26 00 00 00 00 lea    0x0\(%esi,%eiz,1\),%esi
+ +[a-f0-9]+: 8d b4 26 00 00 00 00 lea    0x0\(%esi,%eiz,1\),%esi
+ +[a-f0-9]+: 8d b4 26 00 00 00 00 lea    0x0\(%esi,%eiz,1\),%esi
+ +[a-f0-9]+: 8d b4 26 00 00 00 00 lea    0x0\(%esi,%eiz,1\),%esi
+ +[a-f0-9]+: 8d b4 26 00 00 00 00 lea    0x0\(%esi,%eiz,1\),%esi
+ +[a-f0-9]+: 8d b4 26 00 00 00 00 lea    0x0\(%esi,%eiz,1\),%esi
+ +[a-f0-9]+: 8d b4 26 00 00 00 00 lea    0x0\(%esi,%eiz,1\),%esi
+ +[a-f0-9]+: 8d b4 26 00 00 00 00 lea    0x0\(%esi,%eiz,1\),%esi
+ +[a-f0-9]+: 8d 74 26 00           lea    0x0\(%esi,%eiz,1\),%esi
+ +[a-f0-9]+: f2 73 bf             bnd jae 0 <_start>
+ +[a-f0-9]+: c3                   ret    
+#pass
diff --git a/gas/testsuite/gas/i386/align-branch-6.e b/gas/testsuite/gas/i386/align-branch-6.e
new file mode 100644
index 0000000000..c3378353ef
--- /dev/null
+++ b/gas/testsuite/gas/i386/align-branch-6.e
@@ -0,0 +1,2 @@
+.*: Assembler messages:
+.*:4: Warning: `constant directive` skips -malign-branch-boundary on `jnc`
diff --git a/gas/testsuite/gas/i386/align-branch-6.s b/gas/testsuite/gas/i386/align-branch-6.s
new file mode 100644
index 0000000000..41a92771a2
--- /dev/null
+++ b/gas/testsuite/gas/i386/align-branch-6.s
@@ -0,0 +1,7 @@
+ .text
+_start:
+.L0:
+ .nops 62
+ .byte 0xf2
+ jnc .L0
+ ret
diff --git a/gas/testsuite/gas/i386/align-branch-7.d b/gas/testsuite/gas/i386/align-branch-7.d
new file mode 100644
index 0000000000..7f8c338f16
--- /dev/null
+++ b/gas/testsuite/gas/i386/align-branch-7.d
@@ -0,0 +1,18 @@
+#as: -malign-branch-boundary=32 -malign-branch-prefix-size=4
+#objdump: -dw
+
+.*: +file format .*
+
+Disassembly of section .text:
+
+0+ <foo>:
+ +[a-f0-9]+: 3e 66 0f 3a 60 00 03 pcmpestrm \$0x3,%ds:\(%eax\),%xmm0
+ +[a-f0-9]+: 3e 3e 89 e5           ds ds mov %esp,%ebp
+ +[a-f0-9]+: 89 bd 1c ff ff ff     mov    %edi,-0xe4\(%ebp\)
+ +[a-f0-9]+: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+ +[a-f0-9]+: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+ +[a-f0-9]+: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+ +[a-f0-9]+: 65 a3 01 00 00 00     mov    %eax,%gs:0x1
+ +[a-f0-9]+: a8 04                 test   \$0x4,%al
+ +[a-f0-9]+: 70 dc                 jo     0 <foo>
+#pass
diff --git a/gas/testsuite/gas/i386/align-branch-7.s b/gas/testsuite/gas/i386/align-branch-7.s
new file mode 100644
index 0000000000..370eedb376
--- /dev/null
+++ b/gas/testsuite/gas/i386/align-branch-7.s
@@ -0,0 +1,14 @@
+  .text
+  .globl  foo
+  .p2align  4
+foo:
+.L1:
+  pcmpestrm $3, (%eax), %xmm0
+  movl  %esp, %ebp
+  movl  %edi, -228(%ebp)
+  movl  %esi, -12(%ebp)
+  movl  %esi, -12(%ebp)
+  movl  %esi, -12(%ebp)
+  movl %eax, %gs:0x1
+  testb $0x4,%al
+  jo  .L1
diff --git a/gas/testsuite/gas/i386/align-branch-8.d b/gas/testsuite/gas/i386/align-branch-8.d
new file mode 100644
index 0000000000..ee7ae717a3
--- /dev/null
+++ b/gas/testsuite/gas/i386/align-branch-8.d
@@ -0,0 +1,18 @@
+#as: -malign-branch-boundary=32 -malign-branch-prefix-size=4
+#objdump: -dw
+
+.*: +file format .*
+
+Disassembly of section .text:
+
+0+ <foo>:
+ +[a-f0-9]+: 3e c4 e3 79 60 00 03 vpcmpestrm \$0x3,%ds:\(%eax\),%xmm0
+ +[a-f0-9]+: 3e 3e 89 e5           ds ds mov %esp,%ebp
+ +[a-f0-9]+: 89 bd 1c ff ff ff     mov    %edi,-0xe4\(%ebp\)
+ +[a-f0-9]+: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+ +[a-f0-9]+: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+ +[a-f0-9]+: 89 75 f4             mov    %esi,-0xc\(%ebp\)
+ +[a-f0-9]+: 65 a3 01 00 00 00     mov    %eax,%gs:0x1
+ +[a-f0-9]+: a8 04                 test   \$0x4,%al
+ +[a-f0-9]+: 70 dc                 jo     0 <foo>
+#pass
diff --git a/gas/testsuite/gas/i386/align-branch-8.s b/gas/testsuite/gas/i386/align-branch-8.s
new file mode 100644
index 0000000000..85a7fb6e4b
--- /dev/null
+++ b/gas/testsuite/gas/i386/align-branch-8.s
@@ -0,0 +1,14 @@
+  .text
+  .globl  foo
+  .p2align  4
+foo:
+.L1:
+  vpcmpestrm $3, (%eax), %xmm0
+  movl  %esp, %ebp
+  movl  %edi, -228(%ebp)
+  movl  %esi, -12(%ebp)
+  movl  %esi, -12(%ebp)
+  movl  %esi, -12(%ebp)
+  movl %eax, %gs:0x1
+  testb $0x4,%al
+  jo  .L1
diff --git a/gas/testsuite/gas/i386/i386.exp b/gas/testsuite/gas/i386/i386.exp
index f4c7ce75e9..c31ffab268 100644
--- a/gas/testsuite/gas/i386/i386.exp
+++ b/gas/testsuite/gas/i386/i386.exp
@@ -498,6 +498,24 @@ if [expr ([istarget "i*86-*-*"] ||  [istarget "x86_64-*-*"]) && [gas_32_check]]
     run_list_test "optimize-6a" "-I${srcdir}/$subdir -march=+noavx -al"
     run_dump_test "optimize-6b"
     run_list_test "optimize-7" "-I${srcdir}/$subdir -march=+noavx2 -al"
+    run_dump_test "align-branch-1a"
+    run_dump_test "align-branch-1b"
+    run_dump_test "align-branch-1c"
+    run_dump_test "align-branch-1d"
+    run_dump_test "align-branch-1e"
+    run_dump_test "align-branch-1f"
+    run_dump_test "align-branch-1g"
+    run_dump_test "align-branch-1h"
+    run_dump_test "align-branch-1i"
+    run_dump_test "align-branch-2a"
+    run_dump_test "align-branch-2b"
+    run_dump_test "align-branch-2c"
+    run_dump_test "align-branch-4a"
+    run_dump_test "align-branch-4b"
+    run_dump_test "align-branch-5"
+    run_dump_test "align-branch-6"
+    run_dump_test "align-branch-7"
+    run_dump_test "align-branch-8"
 
     # These tests require support for 8 and 16 bit relocs,
     # so we only run them for ELF and COFF targets.
@@ -573,6 +591,10 @@ if [expr ([istarget "i*86-*-*"] ||  [istarget "x86_64-*-*"]) && [gas_32_check]]
  run_dump_test "property-1"
  run_dump_test "property-2"
 
+ if {[istarget "*-*-linux*"]} then {
+    run_dump_test "align-branch-3"
+ }
+
  if { [gas_64_check] } then {
     run_dump_test "att-regs"
     run_dump_test "intel-regs"
@@ -1032,6 +1054,24 @@ if [expr ([istarget "i*86-*-*"] || [istarget "x86_64-*-*"]) && [gas_64_check]] t
     run_list_test "x86-64-optimize-7a" "-I${srcdir}/$subdir -march=+noavx -al"
     run_dump_test "x86-64-optimize-7b"
     run_list_test "x86-64-optimize-8" "-I${srcdir}/$subdir -march=+noavx2 -al"
+    run_dump_test "x86-64-align-branch-1a"
+    run_dump_test "x86-64-align-branch-1b"
+    run_dump_test "x86-64-align-branch-1c"
+    run_dump_test "x86-64-align-branch-1d"
+    run_dump_test "x86-64-align-branch-1e"
+    run_dump_test "x86-64-align-branch-1f"
+    run_dump_test "x86-64-align-branch-1g"
+    run_dump_test "x86-64-align-branch-1h"
+    run_dump_test "x86-64-align-branch-1i"
+    run_dump_test "x86-64-align-branch-2a"
+    run_dump_test "x86-64-align-branch-2b"
+    run_dump_test "x86-64-align-branch-2c"
+    run_dump_test "x86-64-align-branch-4a"
+    run_dump_test "x86-64-align-branch-4b"
+    run_dump_test "x86-64-align-branch-5"
+    run_dump_test "x86-64-align-branch-6"
+    run_dump_test "x86-64-align-branch-7"
+    run_dump_test "x86-64-align-branch-8"
 
     if { ![istarget "*-*-aix*"]
       && ![istarget "*-*-beos*"]
@@ -1096,6 +1136,11 @@ if [expr ([istarget "i*86-*-*"] || [istarget "x86_64-*-*"]) && [gas_64_check]] t
  run_dump_test "evex-no-scale-64"
  run_dump_test "x86-64-property-1"
  run_dump_test "x86-64-property-2"
+
+ if {[istarget "*-*-linux*"]} then {
+    run_dump_test "x86-64-align-branch-3"
+ }
+
     }
 
     set ASFLAGS "$old_ASFLAGS"
diff --git a/gas/testsuite/gas/i386/x86-64-align-branch-1.s b/gas/testsuite/gas/i386/x86-64-align-branch-1.s
new file mode 100644
index 0000000000..74b3e7a41a
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-align-branch-1.s
@@ -0,0 +1,70 @@
+  .text
+  .globl  foo
+  .p2align  4
+foo:
+  movl  %eax, %fs:0x1
+  pushq  %rbp
+  pushq  %rbp
+  pushq  %rbp
+  movq  %rsp, %rbp
+  movl  %esi, -12(%rbp)
+  movl  %esi, -12(%rbp)
+  movl  %esi, -12(%rbp)
+  movl  %esi, -12(%rbp)
+  movl  %esi, -12(%rbp)
+  cmp  %rax, %rbp
+  je  .L_2
+  movl  %esi, -12(%rbp)
+  movl  %esi, -12(%rbp)
+  movl  %edi, -8(%rbp)
+  movl  %esi, -12(%rbp)
+  movl  %esi, -12(%rbp)
+  movl  %esi, -12(%rbp)
+  movl  %esi, -12(%rbp)
+  movl  %esi, -12(%rbp)
+  popq  %rbp
+  popq  %rbp
+  je  .L_2
+  popq  %rbp
+  je  .L_2
+  movl  %eax, -4(%rbp)
+  movl  %esi, -12(%rbp)
+  movl  %edi, -8(%rbp)
+  movl  %esi, -12(%rbp)
+  movl  %esi, -12(%rbp)
+  movl  %esi, -12(%rbp)
+  movl  %esi, -12(%rbp)
+  movl  %esi, -12(%rbp)
+  popq  %rbp
+  popq  %rbp
+  jmp  .L_3
+  jmp  .L_3
+  jmp  .L_3
+  movl  %eax, -4(%rbp)
+  movl  %esi, -12(%rbp)
+  movl  %edi, -8(%rbp)
+  movl  %esi, -12(%rbp)
+  movl  %esi, -12(%rbp)
+  movl  %esi, -12(%rbp)
+  popq  %rbp
+  popq  %rbp
+  cmp  %rax, %rbp
+  je  .L_2
+  jmp  .L_3
+.L_2:
+  movl  -12(%rbp), %eax
+  movl  %eax, -4(%rbp)
+.L_3:
+  movl  %esi, -1200(%rbp)
+  movl  %esi, -1200(%rbp)
+  movl  %esi, -1200(%rbp)
+  movl  %esi, -1200(%rbp)
+  movl  %esi, -1200(%rbp)
+  movl  %esi, -1200(%rbp)
+  movl  %esi, -1200(%rbp)
+  movl  %esi, -1200(%rbp)
+  movl  %esi, -1200(%rbp)
+  movl  %esi, -1200(%rbp)
+  jmp  .L_3
+  popq  %rbp
+  retq
diff --git a/gas/testsuite/gas/i386/x86-64-align-branch-1a.d b/gas/testsuite/gas/i386/x86-64-align-branch-1a.d
new file mode 100644
index 0000000000..f96808ac21
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-align-branch-1a.d
@@ -0,0 +1,75 @@
+#source: x86-64-align-branch-1.s
+#as: -malign-branch-boundary=32
+#objdump: -dw
+
+.*: +file format .*
+
+Disassembly of section .text:
+
+0+ <foo>:
+   0: 64 64 64 64 89 04 25 01 00 00 00 fs fs fs mov %eax,%fs:0x1
+   b: 55                   push   %rbp
+   c: 55                   push   %rbp
+   d: 55                   push   %rbp
+   e: 48 89 e5             mov    %rsp,%rbp
+  11: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  14: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  17: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  1a: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  1d: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  20: 48 39 c5             cmp    %rax,%rbp
+  23: 74 5d                 je     82 <foo\+0x82>
+  25: 2e 89 75 f4           mov    %esi,%cs:-0xc\(%rbp\)
+  29: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  2c: 89 7d f8             mov    %edi,-0x8\(%rbp\)
+  2f: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  32: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  35: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  38: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  3b: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  3e: 5d                   pop    %rbp
+  3f: 5d                   pop    %rbp
+  40: 74 40                 je     82 <foo\+0x82>
+  42: 5d                   pop    %rbp
+  43: 74 3d                 je     82 <foo\+0x82>
+  45: 2e 89 45 fc           mov    %eax,%cs:-0x4\(%rbp\)
+  49: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  4c: 89 7d f8             mov    %edi,-0x8\(%rbp\)
+  4f: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  52: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  55: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  58: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  5b: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  5e: 5d                   pop    %rbp
+  5f: 5d                   pop    %rbp
+  60: eb 26                 jmp    88 <foo\+0x88>
+  62: eb 24                 jmp    88 <foo\+0x88>
+  64: eb 22                 jmp    88 <foo\+0x88>
+  66: 89 45 fc             mov    %eax,-0x4\(%rbp\)
+  69: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  6c: 89 7d f8             mov    %edi,-0x8\(%rbp\)
+  6f: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  72: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  75: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  78: 5d                   pop    %rbp
+  79: 5d                   pop    %rbp
+  7a: 48 39 c5             cmp    %rax,%rbp
+  7d: 74 03                 je     82 <foo\+0x82>
+  7f: 90                   nop
+  80: eb 06                 jmp    88 <foo\+0x88>
+  82: 8b 45 f4             mov    -0xc\(%rbp\),%eax
+  85: 89 45 fc             mov    %eax,-0x4\(%rbp\)
+  88: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  8e: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  94: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  9a: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  a0: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  a6: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  ac: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  b2: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  b8: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  be: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  c4: eb c2                 jmp    88 <foo\+0x88>
+  c6: 5d                   pop    %rbp
+  c7: c3                   retq  
+#pass
diff --git a/gas/testsuite/gas/i386/x86-64-align-branch-1b.d b/gas/testsuite/gas/i386/x86-64-align-branch-1b.d
new file mode 100644
index 0000000000..10b3476796
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-align-branch-1b.d
@@ -0,0 +1,75 @@
+#source: x86-64-align-branch-1.s
+#as: -malign-branch-boundary=32 -malign-branch=fused+jcc+jmp
+#objdump: -dw
+
+.*: +file format .*
+
+Disassembly of section .text:
+
+0+ <foo>:
+   0: 64 64 64 64 89 04 25 01 00 00 00 fs fs fs mov %eax,%fs:0x1
+   b: 55                   push   %rbp
+   c: 55                   push   %rbp
+   d: 55                   push   %rbp
+   e: 48 89 e5             mov    %rsp,%rbp
+  11: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  14: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  17: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  1a: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  1d: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  20: 48 39 c5             cmp    %rax,%rbp
+  23: 74 5d                 je     82 <foo\+0x82>
+  25: 2e 89 75 f4           mov    %esi,%cs:-0xc\(%rbp\)
+  29: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  2c: 89 7d f8             mov    %edi,-0x8\(%rbp\)
+  2f: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  32: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  35: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  38: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  3b: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  3e: 5d                   pop    %rbp
+  3f: 5d                   pop    %rbp
+  40: 74 40                 je     82 <foo\+0x82>
+  42: 5d                   pop    %rbp
+  43: 74 3d                 je     82 <foo\+0x82>
+  45: 2e 89 45 fc           mov    %eax,%cs:-0x4\(%rbp\)
+  49: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  4c: 89 7d f8             mov    %edi,-0x8\(%rbp\)
+  4f: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  52: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  55: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  58: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  5b: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  5e: 5d                   pop    %rbp
+  5f: 5d                   pop    %rbp
+  60: eb 26                 jmp    88 <foo\+0x88>
+  62: eb 24                 jmp    88 <foo\+0x88>
+  64: eb 22                 jmp    88 <foo\+0x88>
+  66: 89 45 fc             mov    %eax,-0x4\(%rbp\)
+  69: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  6c: 89 7d f8             mov    %edi,-0x8\(%rbp\)
+  6f: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  72: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  75: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  78: 5d                   pop    %rbp
+  79: 5d                   pop    %rbp
+  7a: 48 39 c5             cmp    %rax,%rbp
+  7d: 74 03                 je     82 <foo\+0x82>
+  7f: 90                   nop
+  80: eb 06                 jmp    88 <foo\+0x88>
+  82: 8b 45 f4             mov    -0xc\(%rbp\),%eax
+  85: 89 45 fc             mov    %eax,-0x4\(%rbp\)
+  88: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  8e: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  94: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  9a: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  a0: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  a6: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  ac: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  b2: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  b8: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  be: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  c4: eb c2                 jmp    88 <foo\+0x88>
+  c6: 5d                   pop    %rbp
+  c7: c3                   retq  
+#pass
diff --git a/gas/testsuite/gas/i386/x86-64-align-branch-1c.d b/gas/testsuite/gas/i386/x86-64-align-branch-1c.d
new file mode 100644
index 0000000000..53c848aed4
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-align-branch-1c.d
@@ -0,0 +1,75 @@
+#source: x86-64-align-branch-1.s
+#as: -malign-branch-boundary=32 -malign-branch-prefix-size=1
+#objdump: -dw
+
+.*: +file format .*
+
+Disassembly of section .text:
+
+0+ <foo>:
+   0: 64 89 04 25 01 00 00 00 mov    %eax,%fs:0x1
+   8: 2e 55                 cs push %rbp
+   a: 2e 55                 cs push %rbp
+   c: 2e 55                 cs push %rbp
+   e: 48 89 e5             mov    %rsp,%rbp
+  11: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  14: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  17: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  1a: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  1d: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  20: 48 39 c5             cmp    %rax,%rbp
+  23: 74 5d                 je     82 <foo\+0x82>
+  25: 2e 89 75 f4           mov    %esi,%cs:-0xc\(%rbp\)
+  29: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  2c: 89 7d f8             mov    %edi,-0x8\(%rbp\)
+  2f: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  32: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  35: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  38: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  3b: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  3e: 5d                   pop    %rbp
+  3f: 5d                   pop    %rbp
+  40: 74 40                 je     82 <foo\+0x82>
+  42: 5d                   pop    %rbp
+  43: 74 3d                 je     82 <foo\+0x82>
+  45: 2e 89 45 fc           mov    %eax,%cs:-0x4\(%rbp\)
+  49: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  4c: 89 7d f8             mov    %edi,-0x8\(%rbp\)
+  4f: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  52: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  55: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  58: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  5b: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  5e: 5d                   pop    %rbp
+  5f: 5d                   pop    %rbp
+  60: eb 26                 jmp    88 <foo\+0x88>
+  62: eb 24                 jmp    88 <foo\+0x88>
+  64: eb 22                 jmp    88 <foo\+0x88>
+  66: 89 45 fc             mov    %eax,-0x4\(%rbp\)
+  69: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  6c: 89 7d f8             mov    %edi,-0x8\(%rbp\)
+  6f: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  72: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  75: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  78: 5d                   pop    %rbp
+  79: 5d                   pop    %rbp
+  7a: 48 39 c5             cmp    %rax,%rbp
+  7d: 74 03                 je     82 <foo\+0x82>
+  7f: 90                   nop
+  80: eb 06                 jmp    88 <foo\+0x88>
+  82: 8b 45 f4             mov    -0xc\(%rbp\),%eax
+  85: 89 45 fc             mov    %eax,-0x4\(%rbp\)
+  88: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  8e: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  94: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  9a: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  a0: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  a6: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  ac: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  b2: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  b8: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  be: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  c4: eb c2                 jmp    88 <foo\+0x88>
+  c6: 5d                   pop    %rbp
+  c7: c3                   retq  
+#pass
diff --git a/gas/testsuite/gas/i386/x86-64-align-branch-1d.d b/gas/testsuite/gas/i386/x86-64-align-branch-1d.d
new file mode 100644
index 0000000000..ae6445b29e
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-align-branch-1d.d
@@ -0,0 +1,74 @@
+#source: x86-64-align-branch-1.s
+#as: -malign-branch-boundary=32 -malign-branch=fused+jcc
+#objdump: -dw
+
+.*: +file format .*
+
+Disassembly of section .text:
+
+0+ <foo>:
+   0: 64 64 64 64 89 04 25 01 00 00 00 fs fs fs mov %eax,%fs:0x1
+   b: 55                   push   %rbp
+   c: 55                   push   %rbp
+   d: 55                   push   %rbp
+   e: 48 89 e5             mov    %rsp,%rbp
+  11: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  14: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  17: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  1a: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  1d: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  20: 48 39 c5             cmp    %rax,%rbp
+  23: 74 5b                 je     80 <foo\+0x80>
+  25: 2e 89 75 f4           mov    %esi,%cs:-0xc\(%rbp\)
+  29: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  2c: 89 7d f8             mov    %edi,-0x8\(%rbp\)
+  2f: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  32: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  35: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  38: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  3b: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  3e: 5d                   pop    %rbp
+  3f: 5d                   pop    %rbp
+  40: 74 3e                 je     80 <foo\+0x80>
+  42: 5d                   pop    %rbp
+  43: 74 3b                 je     80 <foo\+0x80>
+  45: 89 45 fc             mov    %eax,-0x4\(%rbp\)
+  48: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  4b: 89 7d f8             mov    %edi,-0x8\(%rbp\)
+  4e: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  51: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  54: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  57: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  5a: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  5d: 5d                   pop    %rbp
+  5e: 5d                   pop    %rbp
+  5f: eb 25                 jmp    86 <foo\+0x86>
+  61: eb 23                 jmp    86 <foo\+0x86>
+  63: eb 21                 jmp    86 <foo\+0x86>
+  65: 89 45 fc             mov    %eax,-0x4\(%rbp\)
+  68: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  6b: 89 7d f8             mov    %edi,-0x8\(%rbp\)
+  6e: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  71: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  74: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  77: 5d                   pop    %rbp
+  78: 5d                   pop    %rbp
+  79: 48 39 c5             cmp    %rax,%rbp
+  7c: 74 02                 je     80 <foo\+0x80>
+  7e: eb 06                 jmp    86 <foo\+0x86>
+  80: 8b 45 f4             mov    -0xc\(%rbp\),%eax
+  83: 89 45 fc             mov    %eax,-0x4\(%rbp\)
+  86: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  8c: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  92: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  98: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  9e: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  a4: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  aa: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  b0: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  b6: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  bc: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  c2: eb c2                 jmp    86 <foo\+0x86>
+  c4: 5d                   pop    %rbp
+  c5: c3                   retq  
+#pass
diff --git a/gas/testsuite/gas/i386/x86-64-align-branch-1e.d b/gas/testsuite/gas/i386/x86-64-align-branch-1e.d
new file mode 100644
index 0000000000..beb7744f65
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-align-branch-1e.d
@@ -0,0 +1,74 @@
+#source: x86-64-align-branch-1.s
+#as: -malign-branch-boundary=32 -malign-branch=jcc
+#objdump: -dw
+
+.*: +file format .*
+
+Disassembly of section .text:
+
+0+ <foo>:
+   0: 64 89 04 25 01 00 00 00 mov    %eax,%fs:0x1
+   8: 55                   push   %rbp
+   9: 55                   push   %rbp
+   a: 55                   push   %rbp
+   b: 48 89 e5             mov    %rsp,%rbp
+   e: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  11: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  14: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  17: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  1a: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  1d: 48 39 c5             cmp    %rax,%rbp
+  20: 74 5b                 je     7d <foo\+0x7d>
+  22: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  25: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  28: 89 7d f8             mov    %edi,-0x8\(%rbp\)
+  2b: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  2e: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  31: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  34: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  37: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  3a: 5d                   pop    %rbp
+  3b: 5d                   pop    %rbp
+  3c: 74 3f                 je     7d <foo\+0x7d>
+  3e: 2e 5d                 cs pop %rbp
+  40: 74 3b                 je     7d <foo\+0x7d>
+  42: 89 45 fc             mov    %eax,-0x4\(%rbp\)
+  45: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  48: 89 7d f8             mov    %edi,-0x8\(%rbp\)
+  4b: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  4e: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  51: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  54: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  57: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  5a: 5d                   pop    %rbp
+  5b: 5d                   pop    %rbp
+  5c: eb 25                 jmp    83 <foo\+0x83>
+  5e: eb 23                 jmp    83 <foo\+0x83>
+  60: eb 21                 jmp    83 <foo\+0x83>
+  62: 89 45 fc             mov    %eax,-0x4\(%rbp\)
+  65: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  68: 89 7d f8             mov    %edi,-0x8\(%rbp\)
+  6b: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  6e: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  71: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  74: 5d                   pop    %rbp
+  75: 5d                   pop    %rbp
+  76: 48 39 c5             cmp    %rax,%rbp
+  79: 74 02                 je     7d <foo\+0x7d>
+  7b: eb 06                 jmp    83 <foo\+0x83>
+  7d: 8b 45 f4             mov    -0xc\(%rbp\),%eax
+  80: 89 45 fc             mov    %eax,-0x4\(%rbp\)
+  83: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  89: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  8f: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  95: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  9b: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  a1: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  a7: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  ad: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  b3: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  b9: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  bf: eb c2                 jmp    83 <foo\+0x83>
+  c1: 5d                   pop    %rbp
+  c2: c3                   retq  
+#pass
diff --git a/gas/testsuite/gas/i386/x86-64-align-branch-1f.d b/gas/testsuite/gas/i386/x86-64-align-branch-1f.d
new file mode 100644
index 0000000000..24fbf45eec
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-align-branch-1f.d
@@ -0,0 +1,75 @@
+#source: x86-64-align-branch-1.s
+#as: -malign-branch-boundary=32 -malign-branch=jcc+jmp
+#objdump: -dw
+
+.*: +file format .*
+
+Disassembly of section .text:
+
+0+ <foo>:
+   0: 64 89 04 25 01 00 00 00 mov    %eax,%fs:0x1
+   8: 55                   push   %rbp
+   9: 55                   push   %rbp
+   a: 55                   push   %rbp
+   b: 48 89 e5             mov    %rsp,%rbp
+   e: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  11: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  14: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  17: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  1a: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  1d: 48 39 c5             cmp    %rax,%rbp
+  20: 74 5d                 je     7f <foo\+0x7f>
+  22: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  25: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  28: 89 7d f8             mov    %edi,-0x8\(%rbp\)
+  2b: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  2e: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  31: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  34: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  37: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  3a: 5d                   pop    %rbp
+  3b: 5d                   pop    %rbp
+  3c: 74 41                 je     7f <foo\+0x7f>
+  3e: 2e 5d                 cs pop %rbp
+  40: 74 3d                 je     7f <foo\+0x7f>
+  42: 89 45 fc             mov    %eax,-0x4\(%rbp\)
+  45: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  48: 89 7d f8             mov    %edi,-0x8\(%rbp\)
+  4b: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  4e: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  51: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  54: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  57: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  5a: 5d                   pop    %rbp
+  5b: 5d                   pop    %rbp
+  5c: eb 27                 jmp    85 <foo\+0x85>
+  5e: 66 90                 xchg   %ax,%ax
+  60: eb 23                 jmp    85 <foo\+0x85>
+  62: eb 21                 jmp    85 <foo\+0x85>
+  64: 89 45 fc             mov    %eax,-0x4\(%rbp\)
+  67: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  6a: 89 7d f8             mov    %edi,-0x8\(%rbp\)
+  6d: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  70: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  73: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  76: 5d                   pop    %rbp
+  77: 5d                   pop    %rbp
+  78: 48 39 c5             cmp    %rax,%rbp
+  7b: 74 02                 je     7f <foo\+0x7f>
+  7d: eb 06                 jmp    85 <foo\+0x85>
+  7f: 8b 45 f4             mov    -0xc\(%rbp\),%eax
+  82: 89 45 fc             mov    %eax,-0x4\(%rbp\)
+  85: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  8b: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  91: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  97: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  9d: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  a3: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  a9: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  af: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  b5: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  bb: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  c1: eb c2                 jmp    85 <foo\+0x85>
+  c3: 5d                   pop    %rbp
+  c4: c3                   retq  
+#pass
diff --git a/gas/testsuite/gas/i386/x86-64-align-branch-1g.d b/gas/testsuite/gas/i386/x86-64-align-branch-1g.d
new file mode 100644
index 0000000000..624494064b
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-align-branch-1g.d
@@ -0,0 +1,75 @@
+#source: x86-64-align-branch-1.s
+#as: -mbranches-within-32B-boundaries
+#objdump: -dw
+
+.*: +file format .*
+
+Disassembly of section .text:
+
+0+ <foo>:
+   0: 64 64 64 64 89 04 25 01 00 00 00 fs fs fs mov %eax,%fs:0x1
+   b: 55                   push   %rbp
+   c: 55                   push   %rbp
+   d: 55                   push   %rbp
+   e: 48 89 e5             mov    %rsp,%rbp
+  11: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  14: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  17: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  1a: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  1d: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  20: 48 39 c5             cmp    %rax,%rbp
+  23: 74 5d                 je     82 <foo\+0x82>
+  25: 2e 89 75 f4           mov    %esi,%cs:-0xc\(%rbp\)
+  29: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  2c: 89 7d f8             mov    %edi,-0x8\(%rbp\)
+  2f: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  32: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  35: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  38: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  3b: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  3e: 5d                   pop    %rbp
+  3f: 5d                   pop    %rbp
+  40: 74 40                 je     82 <foo\+0x82>
+  42: 5d                   pop    %rbp
+  43: 74 3d                 je     82 <foo\+0x82>
+  45: 2e 89 45 fc           mov    %eax,%cs:-0x4\(%rbp\)
+  49: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  4c: 89 7d f8             mov    %edi,-0x8\(%rbp\)
+  4f: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  52: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  55: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  58: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  5b: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  5e: 5d                   pop    %rbp
+  5f: 5d                   pop    %rbp
+  60: eb 26                 jmp    88 <foo\+0x88>
+  62: eb 24                 jmp    88 <foo\+0x88>
+  64: eb 22                 jmp    88 <foo\+0x88>
+  66: 89 45 fc             mov    %eax,-0x4\(%rbp\)
+  69: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  6c: 89 7d f8             mov    %edi,-0x8\(%rbp\)
+  6f: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  72: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  75: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  78: 5d                   pop    %rbp
+  79: 5d                   pop    %rbp
+  7a: 48 39 c5             cmp    %rax,%rbp
+  7d: 74 03                 je     82 <foo\+0x82>
+  7f: 90                   nop
+  80: eb 06                 jmp    88 <foo\+0x88>
+  82: 8b 45 f4             mov    -0xc\(%rbp\),%eax
+  85: 89 45 fc             mov    %eax,-0x4\(%rbp\)
+  88: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  8e: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  94: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  9a: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  a0: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  a6: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  ac: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  b2: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  b8: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  be: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  c4: eb c2                 jmp    88 <foo\+0x88>
+  c6: 5d                   pop    %rbp
+  c7: c3                   retq  
+#pass
diff --git a/gas/testsuite/gas/i386/x86-64-align-branch-1h.d b/gas/testsuite/gas/i386/x86-64-align-branch-1h.d
new file mode 100644
index 0000000000..a6022be821
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-align-branch-1h.d
@@ -0,0 +1,74 @@
+#source: x86-64-align-branch-1.s
+#as: -mbranches-within-32B-boundaries -malign-branch-boundary=0
+#objdump: -dw
+
+.*: +file format .*
+
+Disassembly of section .text:
+
+0+ <foo>:
+   0: 64 89 04 25 01 00 00 00 mov    %eax,%fs:0x1
+   8: 55                   push   %rbp
+   9: 55                   push   %rbp
+   a: 55                   push   %rbp
+   b: 48 89 e5             mov    %rsp,%rbp
+   e: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  11: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  14: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  17: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  1a: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  1d: 48 39 c5             cmp    %rax,%rbp
+  20: 74 5a                 je     7c <foo\+0x7c>
+  22: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  25: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  28: 89 7d f8             mov    %edi,-0x8\(%rbp\)
+  2b: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  2e: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  31: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  34: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  37: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  3a: 5d                   pop    %rbp
+  3b: 5d                   pop    %rbp
+  3c: 74 3e                 je     7c <foo\+0x7c>
+  3e: 5d                   pop    %rbp
+  3f: 74 3b                 je     7c <foo\+0x7c>
+  41: 89 45 fc             mov    %eax,-0x4\(%rbp\)
+  44: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  47: 89 7d f8             mov    %edi,-0x8\(%rbp\)
+  4a: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  4d: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  50: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  53: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  56: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  59: 5d                   pop    %rbp
+  5a: 5d                   pop    %rbp
+  5b: eb 25                 jmp    82 <foo\+0x82>
+  5d: eb 23                 jmp    82 <foo\+0x82>
+  5f: eb 21                 jmp    82 <foo\+0x82>
+  61: 89 45 fc             mov    %eax,-0x4\(%rbp\)
+  64: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  67: 89 7d f8             mov    %edi,-0x8\(%rbp\)
+  6a: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  6d: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  70: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  73: 5d                   pop    %rbp
+  74: 5d                   pop    %rbp
+  75: 48 39 c5             cmp    %rax,%rbp
+  78: 74 02                 je     7c <foo\+0x7c>
+  7a: eb 06                 jmp    82 <foo\+0x82>
+  7c: 8b 45 f4             mov    -0xc\(%rbp\),%eax
+  7f: 89 45 fc             mov    %eax,-0x4\(%rbp\)
+  82: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  88: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  8e: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  94: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  9a: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  a0: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  a6: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  ac: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  b2: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  b8: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  be: eb c2                 jmp    82 <foo\+0x82>
+  c0: 5d                   pop    %rbp
+  c1: c3                   retq  
+#pass
diff --git a/gas/testsuite/gas/i386/x86-64-align-branch-1i.d b/gas/testsuite/gas/i386/x86-64-align-branch-1i.d
new file mode 100644
index 0000000000..2493626fde
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-align-branch-1i.d
@@ -0,0 +1,78 @@
+#source: x86-64-align-branch-1.s
+#as: -malign-branch-boundary=32 -malign-branch-prefix-size=0
+#objdump: -dw
+
+.*: +file format .*
+
+Disassembly of section .text:
+
+0+ <foo>:
+   0: 64 89 04 25 01 00 00 00 mov    %eax,%fs:0x1
+   8: 55                   push   %rbp
+   9: 55                   push   %rbp
+   a: 55                   push   %rbp
+   b: 48 89 e5             mov    %rsp,%rbp
+   e: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  11: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  14: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  17: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  1a: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  1d: 0f 1f 00             nopl   \(%rax\)
+  20: 48 39 c5             cmp    %rax,%rbp
+  23: 74 5d                 je     82 <foo\+0x82>
+  25: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  28: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  2b: 89 7d f8             mov    %edi,-0x8\(%rbp\)
+  2e: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  31: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  34: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  37: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  3a: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  3d: 5d                   pop    %rbp
+  3e: 5d                   pop    %rbp
+  3f: 90                   nop
+  40: 74 40                 je     82 <foo\+0x82>
+  42: 5d                   pop    %rbp
+  43: 74 3d                 je     82 <foo\+0x82>
+  45: 89 45 fc             mov    %eax,-0x4\(%rbp\)
+  48: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  4b: 89 7d f8             mov    %edi,-0x8\(%rbp\)
+  4e: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  51: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  54: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  57: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  5a: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  5d: 5d                   pop    %rbp
+  5e: 5d                   pop    %rbp
+  5f: 90                   nop
+  60: eb 26                 jmp    88 <foo\+0x88>
+  62: eb 24                 jmp    88 <foo\+0x88>
+  64: eb 22                 jmp    88 <foo\+0x88>
+  66: 89 45 fc             mov    %eax,-0x4\(%rbp\)
+  69: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  6c: 89 7d f8             mov    %edi,-0x8\(%rbp\)
+  6f: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  72: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  75: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  78: 5d                   pop    %rbp
+  79: 5d                   pop    %rbp
+  7a: 48 39 c5             cmp    %rax,%rbp
+  7d: 74 03                 je     82 <foo\+0x82>
+  7f: 90                   nop
+  80: eb 06                 jmp    88 <foo\+0x88>
+  82: 8b 45 f4             mov    -0xc\(%rbp\),%eax
+  85: 89 45 fc             mov    %eax,-0x4\(%rbp\)
+  88: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  8e: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  94: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  9a: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  a0: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  a6: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  ac: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  b2: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  b8: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  be: 89 b5 50 fb ff ff     mov    %esi,-0x4b0\(%rbp\)
+  c4: eb c2                 jmp    88 <foo\+0x88>
+  c6: 5d                   pop    %rbp
+  c7: c3                   retq  
+#pass
diff --git a/gas/testsuite/gas/i386/x86-64-align-branch-2.s b/gas/testsuite/gas/i386/x86-64-align-branch-2.s
new file mode 100644
index 0000000000..54999f85b0
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-align-branch-2.s
@@ -0,0 +1,44 @@
+  .text
+  .globl  foo
+  .p2align  4
+foo:
+  movl  %eax, %fs:0x1
+  pushq  %rbp
+  pushq  %rbp
+  pushq  %rbp
+  pushq  %rbp
+  movq  %rsp, %rbp
+  movl  %esi, -12(%rbp)
+  movl  %esi, -12(%rbp)
+  movl  %esi, -12(%rbp)
+  movl  %esi, -12(%rbp)
+  movl  %esi, -12(%rbp)
+  jmp  *%rax
+  pushq  %rbp
+  pushq  %rbp
+  movl  %eax, %fs:0x1
+  movq  %rsp, %rbp
+  movl  %esi, -12(%rbp)
+  movl  %esi, -12(%rbp)
+  movl  %esi, -12(%rbp)
+  movl  %esi, -12(%rbp)
+  movl  %esi, -12(%rbp)
+  call *%rax
+  movl  %esi, -12(%rbp)
+  pushq  %rbp
+  pushq  %rbp
+  movl  %eax, %fs:0x1
+  movq  %rsp, %rbp
+  movl  %esi, -12(%rbp)
+  movl  %esi, -12(%rbp)
+  movl  %esi, -12(%rbp)
+  call  foo
+  movl  %esi, -12(%rbp)
+  pushq  %rbp
+  pushq  %rbp
+  pushq  %rbp
+  movl  %eax, %fs:0x1
+  movq  %rsp, %rbp
+  movl  %esi, -12(%rbp)
+  call  *foo
+  pushq  %rbp
diff --git a/gas/testsuite/gas/i386/x86-64-align-branch-2a.d b/gas/testsuite/gas/i386/x86-64-align-branch-2a.d
new file mode 100644
index 0000000000..aaf759d42e
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-align-branch-2a.d
@@ -0,0 +1,50 @@
+#source: x86-64-align-branch-2.s
+#as: -malign-branch-boundary=32 -malign-branch=fused+jcc+jmp
+#objdump: -dw
+
+.*: +file format .*
+
+Disassembly of section .text:
+
+0+ <foo>:
+   0: 64 89 04 25 01 00 00 00 mov    %eax,%fs:0x1
+   8: 55                   push   %rbp
+   9: 55                   push   %rbp
+   a: 55                   push   %rbp
+   b: 55                   push   %rbp
+   c: 48 89 e5             mov    %rsp,%rbp
+   f: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  12: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  15: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  18: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  1b: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  1e: ff e0                 jmpq   \*%rax
+  20: 55                   push   %rbp
+  21: 55                   push   %rbp
+  22: 64 89 04 25 01 00 00 00 mov    %eax,%fs:0x1
+  2a: 48 89 e5             mov    %rsp,%rbp
+  2d: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  30: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  33: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  36: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  39: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  3c: ff d0                 callq  \*%rax
+  3e: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  41: 55                   push   %rbp
+  42: 55                   push   %rbp
+  43: 64 89 04 25 01 00 00 00 mov    %eax,%fs:0x1
+  4b: 48 89 e5             mov    %rsp,%rbp
+  4e: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  51: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  54: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  57: e8 [0-9a-f ]+       callq  .*
+  5c: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  5f: 55                   push   %rbp
+  60: 55                   push   %rbp
+  61: 55                   push   %rbp
+  62: 64 89 04 25 01 00 00 00 mov    %eax,%fs:0x1
+  6a: 48 89 e5             mov    %rsp,%rbp
+  6d: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  70: ff 14 25 00 00 00 00 callq  \*0x0
+  77: 55                   push   %rbp
+#pass
diff --git a/gas/testsuite/gas/i386/x86-64-align-branch-2b.d b/gas/testsuite/gas/i386/x86-64-align-branch-2b.d
new file mode 100644
index 0000000000..720868e363
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-align-branch-2b.d
@@ -0,0 +1,50 @@
+#source: x86-64-align-branch-2.s
+#as: -malign-branch-boundary=32 -malign-branch=indirect
+#objdump: -dw
+
+.*: +file format .*
+
+Disassembly of section .text:
+
+0+ <foo>:
+   0: 64 64 64 89 04 25 01 00 00 00 fs fs mov %eax,%fs:0x1
+   a: 55                   push   %rbp
+   b: 55                   push   %rbp
+   c: 55                   push   %rbp
+   d: 55                   push   %rbp
+   e: 48 89 e5             mov    %rsp,%rbp
+  11: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  14: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  17: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  1a: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  1d: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  20: ff e0                 jmpq   \*%rax
+  22: 2e 2e 55             cs cs push %rbp
+  25: 55                   push   %rbp
+  26: 64 89 04 25 01 00 00 00 mov    %eax,%fs:0x1
+  2e: 48 89 e5             mov    %rsp,%rbp
+  31: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  34: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  37: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  3a: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  3d: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  40: ff d0                 callq  \*%rax
+  42: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  45: 55                   push   %rbp
+  46: 55                   push   %rbp
+  47: 64 89 04 25 01 00 00 00 mov    %eax,%fs:0x1
+  4f: 48 89 e5             mov    %rsp,%rbp
+  52: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  55: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  58: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  5b: e8 [0-9a-f ]+       callq  .*
+  60: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  63: 55                   push   %rbp
+  64: 55                   push   %rbp
+  65: 55                   push   %rbp
+  66: 64 89 04 25 01 00 00 00 mov    %eax,%fs:0x1
+  6e: 48 89 e5             mov    %rsp,%rbp
+  71: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  74: ff 14 25 00 00 00 00 callq  \*0x0
+  7b: 55                   push   %rbp
+#pass
diff --git a/gas/testsuite/gas/i386/x86-64-align-branch-2c.d b/gas/testsuite/gas/i386/x86-64-align-branch-2c.d
new file mode 100644
index 0000000000..fb87c49cd5
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-align-branch-2c.d
@@ -0,0 +1,50 @@
+#source: x86-64-align-branch-2.s
+#as: -malign-branch-boundary=32 -malign-branch=indirect+call
+#objdump: -dw
+
+.*: +file format .*
+
+Disassembly of section .text:
+
+0+ <foo>:
+   0: 64 64 64 89 04 25 01 00 00 00 fs fs mov %eax,%fs:0x1
+   a: 55                   push   %rbp
+   b: 55                   push   %rbp
+   c: 55                   push   %rbp
+   d: 55                   push   %rbp
+   e: 48 89 e5             mov    %rsp,%rbp
+  11: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  14: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  17: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  1a: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  1d: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  20: ff e0                 jmpq   \*%rax
+  22: 2e 2e 55             cs cs push %rbp
+  25: 55                   push   %rbp
+  26: 64 89 04 25 01 00 00 00 mov    %eax,%fs:0x1
+  2e: 48 89 e5             mov    %rsp,%rbp
+  31: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  34: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  37: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  3a: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  3d: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  40: ff d0                 callq  \*%rax
+  42: 2e 2e 2e 2e 2e 89 75 f4 cs cs cs cs mov %esi,%cs:-0xc\(%rbp\)
+  4a: 55                   push   %rbp
+  4b: 55                   push   %rbp
+  4c: 64 89 04 25 01 00 00 00 mov    %eax,%fs:0x1
+  54: 48 89 e5             mov    %rsp,%rbp
+  57: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  5a: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  5d: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  60: e8 [0-9a-f ]+       callq  .*
+  65: 2e 2e 2e 2e 2e 89 75 f4 cs cs cs cs mov %esi,%cs:-0xc\(%rbp\)
+  6d: 2e 2e 55             cs cs push %rbp
+  70: 55                   push   %rbp
+  71: 55                   push   %rbp
+  72: 64 89 04 25 01 00 00 00 mov    %eax,%fs:0x1
+  7a: 48 89 e5             mov    %rsp,%rbp
+  7d: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  80: ff 14 25 00 00 00 00 callq  \*0x0
+  87: 55                   push   %rbp
+#pass
diff --git a/gas/testsuite/gas/i386/x86-64-align-branch-3.d b/gas/testsuite/gas/i386/x86-64-align-branch-3.d
new file mode 100644
index 0000000000..18767a7045
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-align-branch-3.d
@@ -0,0 +1,32 @@
+#as: -malign-branch-boundary=32 -malign-branch=indirect+call
+#objdump: -dw
+
+.*: +file format .*
+
+Disassembly of section .text:
+
+0+ <foo>:
+   0: 64 89 04 25 01 00 00 00 mov    %eax,%fs:0x1
+   8: 55                   push   %rbp
+   9: 55                   push   %rbp
+   a: 55                   push   %rbp
+   b: 55                   push   %rbp
+   c: 48 89 e5             mov    %rsp,%rbp
+   f: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  12: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  15: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  18: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  1b: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  1e: e8 00 00 00 00       callq  23 <foo\+0x23>
+  23: 55                   push   %rbp
+  24: 55                   push   %rbp
+  25: 64 89 04 25 01 00 00 00 mov    %eax,%fs:0x1
+  2d: 48 89 e5             mov    %rsp,%rbp
+  30: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  33: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  36: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  39: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  3c: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  3f: ff 15 00 00 00 00     callq  \*0x0\(%rip\)        # 45 <foo\+0x45>
+  45: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+#pass
diff --git a/gas/testsuite/gas/i386/x86-64-align-branch-3.s b/gas/testsuite/gas/i386/x86-64-align-branch-3.s
new file mode 100644
index 0000000000..6787cdc36f
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-align-branch-3.s
@@ -0,0 +1,27 @@
+  .text
+  .globl  foo
+  .p2align  4
+foo:
+  movl  %eax, %fs:0x1
+  pushq  %rbp
+  pushq  %rbp
+  pushq  %rbp
+  pushq  %rbp
+  movq  %rsp, %rbp
+  movl  %esi, -12(%rbp)
+  movl  %esi, -12(%rbp)
+  movl  %esi, -12(%rbp)
+  movl  %esi, -12(%rbp)
+  movl  %esi, -12(%rbp)
+  call __tls_get_addr
+  pushq  %rbp
+  pushq  %rbp
+  movl  %eax, %fs:0x1
+  movq  %rsp, %rbp
+  movl  %esi, -12(%rbp)
+  movl  %esi, -12(%rbp)
+  movl  %esi, -12(%rbp)
+  movl  %esi, -12(%rbp)
+  movl  %esi, -12(%rbp)
+  call *__tls_get_addr@GOTPCREL(%rip)
+  movl  %esi, -12(%rbp)
diff --git a/gas/testsuite/gas/i386/x86-64-align-branch-4.s b/gas/testsuite/gas/i386/x86-64-align-branch-4.s
new file mode 100644
index 0000000000..9b546fe189
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-align-branch-4.s
@@ -0,0 +1,27 @@
+  .text
+  .globl  foo
+  .p2align  4
+foo:
+  movl  %eax, %fs:0x1
+  pushq  %rbp
+  pushq  %rbp
+  movq  %rsp, %rbp
+  movl  %esi, -12(%rbp)
+  movl  %esi, -12(%rbp)
+  movl  %esi, -12(%rbp)
+  movl  %esi, -12(%rbp)
+  movl  %esi, -12(%rbp)
+  movl  %esi, -12(%rbp)
+  ret
+  pushq  %rbp
+  movl  %eax, %fs:0x1
+  pushq  %rbp
+  pushq  %rbp
+  movl  %esi, -12(%rbp)
+  movl  %esi, -12(%rbp)
+  movl  %esi, -12(%rbp)
+  movl  %esi, -12(%rbp)
+  movl  %esi, -12(%rbp)
+  movl  %esi, -12(%rbp)
+  ret $30
+  pushq  %rbp
diff --git a/gas/testsuite/gas/i386/x86-64-align-branch-4a.d b/gas/testsuite/gas/i386/x86-64-align-branch-4a.d
new file mode 100644
index 0000000000..47318e832a
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-align-branch-4a.d
@@ -0,0 +1,33 @@
+#source: x86-64-align-branch-4.s
+#as: -malign-branch-boundary=32 -malign-branch=fused+jcc+jmp
+#objdump: -dw
+
+.*: +file format .*
+
+Disassembly of section .text:
+
+0+ <foo>:
+   0: 64 89 04 25 01 00 00 00 mov    %eax,%fs:0x1
+   8: 55                   push   %rbp
+   9: 55                   push   %rbp
+   a: 48 89 e5             mov    %rsp,%rbp
+   d: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  10: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  13: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  16: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  19: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  1c: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  1f: c3                   retq  
+  20: 55                   push   %rbp
+  21: 64 89 04 25 01 00 00 00 mov    %eax,%fs:0x1
+  29: 55                   push   %rbp
+  2a: 55                   push   %rbp
+  2b: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  2e: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  31: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  34: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  37: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  3a: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  3d: c2 1e 00             retq   \$0x1e
+  40: 55                   push   %rbp
+#pass
diff --git a/gas/testsuite/gas/i386/x86-64-align-branch-4b.d b/gas/testsuite/gas/i386/x86-64-align-branch-4b.d
new file mode 100644
index 0000000000..9a030dd246
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-align-branch-4b.d
@@ -0,0 +1,33 @@
+#source: x86-64-align-branch-4.s
+#as: -malign-branch-boundary=32 -malign-branch=ret
+#objdump: -dw
+
+.*: +file format .*
+
+Disassembly of section .text:
+
+0+ <foo>:
+   0: 64 64 89 04 25 01 00 00 00 fs mov %eax,%fs:0x1
+   9: 55                   push   %rbp
+   a: 55                   push   %rbp
+   b: 48 89 e5             mov    %rsp,%rbp
+   e: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  11: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  14: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  17: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  1a: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  1d: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  20: c3                   retq  
+  21: 2e 2e 55             cs cs push %rbp
+  24: 64 89 04 25 01 00 00 00 mov    %eax,%fs:0x1
+  2c: 55                   push   %rbp
+  2d: 55                   push   %rbp
+  2e: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  31: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  34: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  37: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  3a: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  3d: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+  40: c2 1e 00             retq   \$0x1e
+  43: 55                   push   %rbp
+#pass
diff --git a/gas/testsuite/gas/i386/x86-64-align-branch-5.d b/gas/testsuite/gas/i386/x86-64-align-branch-5.d
new file mode 100644
index 0000000000..3a16c1bef1
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-align-branch-5.d
@@ -0,0 +1,37 @@
+#source: align-branch-5.s
+#as: -malign-branch-boundary=32 -malign-branch=jcc+fused+jmp
+#objdump: -dw
+
+.*: +file format .*
+
+Disassembly of section .text:
+
+0+ <foo>:
+   0: c1 e9 02             shr    \$0x2,%ecx
+   3: c1 e9 02             shr    \$0x2,%ecx
+   6: c1 e9 02             shr    \$0x2,%ecx
+   9: 89 d1                 mov    %edx,%ecx
+   b: 31 c0                 xor    %eax,%eax
+   d: c1 e9 02             shr    \$0x2,%ecx
+  10: c1 e9 02             shr    \$0x2,%ecx
+  13: c1 e9 02             shr    \$0x2,%ecx
+  16: c1 e9 02             shr    \$0x2,%ecx
+  19: c1 e9 02             shr    \$0x2,%ecx
+  1c: c1 e9 02             shr    \$0x2,%ecx
+  1f: f6 c2 02             test   \$0x2,%dl
+  22: f3 ab                 rep stos %eax,%es:\(%rdi\)
+  24: 75 dd                 jne    3 <foo\+0x3>
+  26: 31 c0                 xor    %eax,%eax
+  28: c1 e9 02             shr    \$0x2,%ecx
+  2b: c1 e9 02             shr    \$0x2,%ecx
+  2e: c1 e9 02             shr    \$0x2,%ecx
+  31: 89 d1                 mov    %edx,%ecx
+  33: 31 c0                 xor    %eax,%eax
+  35: c1 e9 02             shr    \$0x2,%ecx
+  38: c1 e9 02             shr    \$0x2,%ecx
+  3b: c1 e9 02             shr    \$0x2,%ecx
+  3e: f6 c2 02             test   \$0x2,%dl
+  41: e8 00 00 00 00       callq  46 <foo\+0x46>
+  46: 75 e3                 jne    2b <foo\+0x2b>
+  48: 31 c0                 xor    %eax,%eax
+#pass
diff --git a/gas/testsuite/gas/i386/x86-64-align-branch-6.d b/gas/testsuite/gas/i386/x86-64-align-branch-6.d
new file mode 100644
index 0000000000..59a157c809
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-align-branch-6.d
@@ -0,0 +1,19 @@
+#source: align-branch-6.s
+#as: -mbranches-within-32B-boundaries -D
+#objdump: -dw
+#warning_output: align-branch-6.e
+
+.*: +file format .*
+
+Disassembly of section .text:
+
+0+ <_start>:
+ +[a-f0-9]+: 66 66 2e 0f 1f 84 00 00 00 00 00 data16 nopw %cs:0x0\(%rax,%rax,1\)
+ +[a-f0-9]+: 66 66 2e 0f 1f 84 00 00 00 00 00 data16 nopw %cs:0x0\(%rax,%rax,1\)
+ +[a-f0-9]+: 66 66 2e 0f 1f 84 00 00 00 00 00 data16 nopw %cs:0x0\(%rax,%rax,1\)
+ +[a-f0-9]+: 66 66 2e 0f 1f 84 00 00 00 00 00 data16 nopw %cs:0x0\(%rax,%rax,1\)
+ +[a-f0-9]+: 66 66 2e 0f 1f 84 00 00 00 00 00 data16 nopw %cs:0x0\(%rax,%rax,1\)
+ +[a-f0-9]+: 0f 1f 80 00 00 00 00 nopl   0x0\(%rax\)
+ +[a-f0-9]+: f2 73 bf             bnd jae 0 <_start>
+ +[a-f0-9]+: c3                   retq  
+#pass
diff --git a/gas/testsuite/gas/i386/x86-64-align-branch-7.d b/gas/testsuite/gas/i386/x86-64-align-branch-7.d
new file mode 100644
index 0000000000..9454d5317e
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-align-branch-7.d
@@ -0,0 +1,18 @@
+#as: -mbranches-within-32B-boundaries -malign-branch-prefix-size=4
+#objdump: -dw
+
+.*: +file format .*
+
+Disassembly of section .text:
+
+0+ <foo>:
+ +[a-f0-9]+: 2e 66 0f 3a 60 00 03 pcmpestrm \$0x3,%cs:\(%rax\),%xmm0
+ +[a-f0-9]+: 2e 2e 48 89 e5       cs cs mov %rsp,%rbp
+ +[a-f0-9]+: 89 7d f8             mov    %edi,-0x8\(%rbp\)
+ +[a-f0-9]+: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+ +[a-f0-9]+: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+ +[a-f0-9]+: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+ +[a-f0-9]+: 64 89 04 25 01 00 00 00 mov    %eax,%fs:0x1
+ +[a-f0-9]+: a8 04                 test   \$0x4,%al
+ +[a-f0-9]+: 70 dc                 jo     0 <foo>
+#pass
diff --git a/gas/testsuite/gas/i386/x86-64-align-branch-7.s b/gas/testsuite/gas/i386/x86-64-align-branch-7.s
new file mode 100644
index 0000000000..73f58077a1
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-align-branch-7.s
@@ -0,0 +1,14 @@
+  .text
+  .globl  foo
+  .p2align  4
+foo:
+.L1:
+  pcmpestrm $3, (%rax), %xmm0
+  movq  %rsp, %rbp
+  movl  %edi, -8(%rbp)
+  movl  %esi, -12(%rbp)
+  movl  %esi, -12(%rbp)
+  movl  %esi, -12(%rbp)
+  movl %eax, %fs:0x1
+  testb $0x4,%al
+  jo  .L1
diff --git a/gas/testsuite/gas/i386/x86-64-align-branch-8.d b/gas/testsuite/gas/i386/x86-64-align-branch-8.d
new file mode 100644
index 0000000000..bffabc1d1c
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-align-branch-8.d
@@ -0,0 +1,18 @@
+#as: -mbranches-within-32B-boundaries -malign-branch-prefix-size=4
+#objdump: -dw
+
+.*: +file format .*
+
+Disassembly of section .text:
+
+0+ <foo>:
+ +[a-f0-9]+: 2e c4 63 79 60 38 03 vpcmpestrm \$0x3,%cs:\(%rax\),%xmm15
+ +[a-f0-9]+: 2e 2e 48 89 e5       cs cs mov %rsp,%rbp
+ +[a-f0-9]+: 89 7d f8             mov    %edi,-0x8\(%rbp\)
+ +[a-f0-9]+: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+ +[a-f0-9]+: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+ +[a-f0-9]+: 89 75 f4             mov    %esi,-0xc\(%rbp\)
+ +[a-f0-9]+: 64 89 04 25 01 00 00 00 mov    %eax,%fs:0x1
+ +[a-f0-9]+: a8 04                 test   \$0x4,%al
+ +[a-f0-9]+: 70 dc                 jo     0 <foo>
+#pass
diff --git a/gas/testsuite/gas/i386/x86-64-align-branch-8.s b/gas/testsuite/gas/i386/x86-64-align-branch-8.s
new file mode 100644
index 0000000000..bab825c405
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-align-branch-8.s
@@ -0,0 +1,14 @@
+  .text
+  .globl  foo
+  .p2align  4
+foo:
+.L1:
+  vpcmpestrm $3, (%rax), %xmm15
+  movq  %rsp, %rbp
+  movl  %edi, -8(%rbp)
+  movl  %esi, -12(%rbp)
+  movl  %esi, -12(%rbp)
+  movl  %esi, -12(%rbp)
+  movl %eax, %fs:0x1
+  testb $0x4,%al
+  jo  .L1
diff --git a/ld/testsuite/ld-i386/align-branch-1.d b/ld/testsuite/ld-i386/align-branch-1.d
new file mode 100644
index 0000000000..9eb728728d
--- /dev/null
+++ b/ld/testsuite/ld-i386/align-branch-1.d
@@ -0,0 +1,25 @@
+#as: --32 -mbranches-within-32B-boundaries
+#ld: -melf_i386
+#objdump: -dw
+#notarget: i?86-*-nacl* x86_64-*-nacl*
+
+.*: +file format .*
+
+
+Disassembly of section .text:
+
+[a-f0-9]+ <_start>:
+ +[a-f0-9]+: 85 d2                 test   %edx,%edx
+ +[a-f0-9]+: 74 20                 je     8049024 <_start\+0x24>
+ +[a-f0-9]+: 85 d2                 test   %edx,%edx
+ +[a-f0-9]+: 74 1c                 je     8049024 <_start\+0x24>
+ +[a-f0-9]+: 85 ff                 test   %edi,%edi
+ +[a-f0-9]+: 74 18                 je     8049024 <_start\+0x24>
+ +[a-f0-9]+: 65 a1 00 00 00 00     mov    %gs:0x0,%eax
+ +[a-f0-9]+: 90                   nop
+ +[a-f0-9]+: 8d 74 26 00           lea    0x0\(%esi,%eiz,1\),%esi
+ +[a-f0-9]+: 3e 3e 3e 8b 90 fc ff ff ff ds ds mov %ds:-0x4\(%eax\),%edx
+ +[a-f0-9]+: 85 d2                 test   %edx,%edx
+ +[a-f0-9]+: 74 00                 je     8049024 <_start\+0x24>
+ +[a-f0-9]+: c3                   ret    
+#pass
diff --git a/ld/testsuite/ld-i386/align-branch-1.s b/ld/testsuite/ld-i386/align-branch-1.s
new file mode 100644
index 0000000000..48edffbe64
--- /dev/null
+++ b/ld/testsuite/ld-i386/align-branch-1.s
@@ -0,0 +1,19 @@
+ .text
+ .globl _start
+_start:
+ testl   %edx, %edx
+ je .L1
+ testl   %edx, %edx
+ je .L1
+ testl   %edi, %edi
+ je .L1
+ leal bar@tlsldm(%ebx), %eax
+ call ___tls_get_addr@PLT
+ movl bar@dtpoff(%eax), %edx
+ testl   %edx, %edx
+ je .L1
+.L1:
+ ret
+ .section ".tdata", "awT", @progbits
+bar:
+ .long 10
diff --git a/ld/testsuite/ld-i386/i386.exp b/ld/testsuite/ld-i386/i386.exp
index 3a1fd8b3cf..8fe047bba0 100644
--- a/ld/testsuite/ld-i386/i386.exp
+++ b/ld/testsuite/ld-i386/i386.exp
@@ -495,6 +495,7 @@ run_dump_test "pr23854"
 run_dump_test "pr23930"
 run_dump_test "pr24322a"
 run_dump_test "pr24322b"
+run_dump_test "align-branch-1"
 
 if { !([istarget "i?86-*-linux*"]
        || [istarget "i?86-*-gnu*"]
diff --git a/ld/testsuite/ld-x86-64/align-branch-1.d b/ld/testsuite/ld-x86-64/align-branch-1.d
new file mode 100644
index 0000000000..85679123d7
--- /dev/null
+++ b/ld/testsuite/ld-x86-64/align-branch-1.d
@@ -0,0 +1,21 @@
+#as: --64 -mbranches-within-32B-boundaries
+#ld: -melf_x86_64
+#objdump: -dw
+#notarget: i?86-*-nacl* x86_64-*-nacl*
+
+.*: +file format .*
+
+
+Disassembly of section .text:
+
+[a-f0-9]+ <_start>:
+ +[a-f0-9]+: 85 d2                 test   %edx,%edx
+ +[a-f0-9]+: 74 21                 je     401025 <_start\+0x25>
+ +[a-f0-9]+: 48 85 ff             test   %rdi,%rdi
+ +[a-f0-9]+: 74 1c                 je     401025 <_start\+0x25>
+ +[a-f0-9]+: 66 66 66 64 48 8b 04 25 00 00 00 00 data16 data16 data16 mov %fs:0x0,%rax
+ +[a-f0-9]+: 2e 2e 2e 2e 48 8b 98 fc ff ff ff cs cs cs mov %cs:-0x4\(%rax\),%rbx
+ +[a-f0-9]+: 48 85 db             test   %rbx,%rbx
+ +[a-f0-9]+: 74 00                 je     401025 <_start\+0x25>
+ +[a-f0-9]+: c3                   retq  
+#pass
diff --git a/ld/testsuite/ld-x86-64/align-branch-1.s b/ld/testsuite/ld-x86-64/align-branch-1.s
new file mode 100644
index 0000000000..5c60a37e44
--- /dev/null
+++ b/ld/testsuite/ld-x86-64/align-branch-1.s
@@ -0,0 +1,17 @@
+ .text
+ .globl _start
+_start:
+ testl   %edx, %edx
+ je .L1
+ testq   %rdi, %rdi
+ je .L1
+ leaq bar@tlsld(%rip), %rdi
+ call __tls_get_addr@PLT
+ movq bar@dtpoff(%rax), %rbx
+ testq   %rbx, %rbx
+ je .L1
+.L1:
+ ret
+ .section ".tdata", "awT", @progbits
+bar:
+ .long 10
diff --git a/ld/testsuite/ld-x86-64/x86-64.exp b/ld/testsuite/ld-x86-64/x86-64.exp
index b13cc7df0e..ab4822e2b4 100644
--- a/ld/testsuite/ld-x86-64/x86-64.exp
+++ b/ld/testsuite/ld-x86-64/x86-64.exp
@@ -460,6 +460,7 @@ run_dump_test "pr24721"
 run_dump_test "pr24721-x32"
 run_dump_test "pr24905"
 run_dump_test "pr24905-x32"
+run_dump_test "align-branch-1"
 
 if { ![istarget "x86_64-*-linux*"] && ![istarget "x86_64-*-nacl*"]} {
     return
--
2.21.0


Re: V3 [PATCH 2/4] i386: Align branches within a fixed boundary

Jan Beulich-2
In reply to this post by H.J. Lu-30
On 06.12.2019 20:10, H.J. Lu wrote:

> Add 3 command-line options to align branches within a fixed boundary
> with segment prefixes or NOPs:
>
> 1. -malign-branch-boundary=NUM aligns branches within a NUM-byte boundary.
> 2. -malign-branch=TYPE[+TYPE...] specifies types of branches to align.
> The supported branches are:
>   a. Conditional jump.
>   b. Fused conditional jump.
>   c. Unconditional jump.
>   d. Call.
>   e. Ret.
>   f. Indirect jump and call.
> 3. -malign-branch-prefix-size=NUM aligns branches with at most NUM segment
> prefixes per instruction.
>
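
To make the boundary rule concrete, here is a minimal sketch of the check these options imply. It is not code from the patch: branch_padding_needed() and its parameters are hypothetical names, and it assumes a branch is affected when it crosses or ends on the boundary, as the erratum describes. The numbers in the comments come from the expected dumps in this series: in x86-64-align-branch-1h.d (alignment disabled) the fused cmp/je pair starts at 0x1d and runs past 0x20, while in x86-64-align-branch-1i.d a 3-byte nopl sits at 0x1d and the pair starts at 0x20.

```c
#include <stdint.h>

/* Hypothetical helper, not taken from the patch: return how many padding
   bytes must precede a branch of SIZE bytes placed at ADDR so that it
   neither crosses nor ends on a BOUNDARY-byte boundary.  BOUNDARY must be
   a power of two and SIZE must be smaller than BOUNDARY.  */
static unsigned int
branch_padding_needed (uint64_t addr, unsigned int size,
                       unsigned int boundary)
{
  unsigned int offset = (unsigned int) (addr & (boundary - 1));

  /* The branch occupies offsets [offset, offset + size).  It is affected
     when the byte after its last byte reaches or passes the boundary.  */
  if (offset + size >= boundary)
    return boundary - offset;   /* Push the branch to the next boundary.  */
  return 0;
}

/* Examples taken from the test expectations in this series:
     cmp+je (3 + 2 bytes) at 0x1d  -> 3 bytes of padding (nopl at 0x1d)
     je (2 bytes) at 0x3f          -> 1 byte of padding (nop at 0x3f)
     jmp *%rax (2 bytes) at 0x1e   -> 2 bytes (two fs prefixes in 2b.d)  */
```

For a fused pair, the size passed in is the combined length of the cmp/test and the jcc, since the two must move together.
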
> 3 new rs_machine_dependent frag types are added:
>
> 1. BRANCH_PADDING.  A variable-size frag that inserts NOPs before a branch.
> 2. BRANCH_PREFIX.  A variable-size frag that adds segment prefixes to
> an instruction.  The prefix is chosen as follows:
>    a. Use the existing segment prefix if there is one.
>    b. Otherwise, use the CS segment prefix in 64-bit mode.
>    c. In 32-bit mode, use the SS segment prefix when the base register is
>    ESP/EBP and the DS segment prefix otherwise.
> 3. FUSED_JCC_PADDING.  A variable-size frag that inserts NOPs before a
> fused conditional jump.
>
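
The prefix selection rules for BRANCH_PREFIX can be sketched as below. This is only an illustration under stated assumptions: struct insn_info and choose_padding_prefix() are hypothetical and do not appear in the patch, which works on gas's internal instruction state instead. The prefix byte values are the architectural encodings (CS 0x2e, SS 0x36, DS 0x3e).

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural segment-override prefix bytes.  */
enum { CS_PREFIX = 0x2e, SS_PREFIX = 0x36, DS_PREFIX = 0x3e };

/* Hypothetical summary of the instruction that receives the padding
   prefixes; the real assembler keeps this in its own structures.  */
struct insn_info
{
  uint8_t existing_seg_prefix;  /* 0 when the insn has no segment prefix.  */
  bool code64;                  /* Assembling for 64-bit mode.  */
  bool base_is_esp_or_ebp;      /* Memory operand uses ESP/EBP as base.  */
};

/* Pick the prefix byte to repeat, following rules a-c quoted above.  */
static uint8_t
choose_padding_prefix (const struct insn_info *insn)
{
  if (insn->existing_seg_prefix != 0)
    return insn->existing_seg_prefix;   /* a. Reuse what is already there.  */
  if (insn->code64)
    return CS_PREFIX;                   /* b. CS in 64-bit mode.  */
  /* c. 32-bit mode: SS is already the default segment for ESP/EBP-based
     addressing and DS for the rest, so the override changes nothing.  */
  return insn->base_is_esp_or_ebp ? SS_PREFIX : DS_PREFIX;
}
```

The expected dumps show the same pattern: "mov %eax,%fs:0x1" gains extra 0x64 (fs) bytes, other 64-bit instructions gain 0x2e (cs), and the 32-bit ld test gains 0x3e (ds).
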
> The new rs_machine_dependent frags aren't inserted if the previous item
> is a prefix or a constant directive, which may be used to hardcode an
> instruction, since there is no clear instruction boundary.  Segment
> prefixes and NOP padding are disabled before relaxable TLS relocations
> and tls_get_addr calls to keep TLS instruction sequence unchanged.
>
> md_estimate_size_before_relax() and i386_generic_table_relax_frag() are
> used to handle BRANCH_PADDING, BRANCH_PREFIX and FUSED_JCC_PADDING frags.
> i386_generic_table_relax_frag() grows or shrinks sizes of segment prefix
> and NOP to align the next branch frag:
>
> 1. First try to add segment prefixes to instructions before a branch.
> 2. If there is no sufficient room to add segment prefixes, NOP will be
> inserted before a branch.
>
> * config/tc-i386.c (_i386_insn): Add has_gotpc_tls_reloc.
> (tls_get_addr): New.
> (last_insn): New.
> (align_branch_power): New.
> (align_branch_kind): New.
> (align_branch_bit): New.
> (align_branch): New.
> (MAX_FUSED_JCC_PADDING_SIZE): New.
> (align_branch_prefix_size): New.
> (BRANCH_PADDING): New.
> (BRANCH_PREFIX): New.
> (FUSED_JCC_PADDING): New.
> (i386_generate_nops): Support BRANCH_PADDING and FUSED_JCC_PADDING.
> (md_begin): Abort if align_branch_prefix_size >
> MAX_FUSED_JCC_PADDING_SIZE.
> (md_assemble): Set last_insn.
> (maybe_fused_with_jcc_p): New.
> (add_fused_jcc_padding_frag_p): New.
> (add_branch_prefix_frag_p): New.
> (add_branch_padding_frag_p): New.
> (output_insn): Generate a BRANCH_PADDING, FUSED_JCC_PADDING or
> BRANCH_PREFIX frag and terminate each frag to align branches.
> (output_disp): Set i.has_gotpc_tls_reloc to TRUE for GOTPC and
> relaxable TLS relocations.
> (output_imm): Likewise.
> (i386_next_non_empty_frag): New.
> (i386_next_jcc_frag): New.
> (i386_classify_machine_dependent_frag): New.
> (i386_branch_padding_size): New.
> (i386_generic_table_relax_frag): New.
> (md_estimate_size_before_relax): Handle BRANCH_PADDING,
> FUSED_JCC_PADDING and BRANCH_PREFIX frags.
> (md_convert_frag): Handle BRANCH_PADDING, BRANCH_PREFIX and
> FUSED_JCC_PADDING frags.
> (OPTION_MALIGN_BRANCH_BOUNDARY): New.
> (OPTION_MALIGN_BRANCH_PREFIX_SIZE): New.
> (OPTION_MALIGN_BRANCH): New.
> (md_longopts): Add -malign-branch-boundary=,
> -malign-branch-prefix-size= and -malign-branch=.
> (md_parse_option): Handle -malign-branch-boundary=,
> -malign-branch-prefix-size= and -malign-branch=.
> (md_show_usage): Display -malign-branch-boundary=,
> -malign-branch-prefix-size= and -malign-branch=.
> (i386_target_format): Set tls_get_addr.
> (i386_cons_align): New.
> * config/tc-i386.h (i386_cons_align): New.
> (md_cons_align): New.
> (i386_generic_table_relax_frag): New.
> (md_generic_table_relax_frag): New.
> (i386_tc_frag_data): Add u, padding_address, length,
> max_prefix_length, prefix_length, default_prefix, cmp_size,
> classified and branch_type.
> (TC_FRAG_INIT): Initialize u, padding_address, length,
> max_prefix_length, prefix_length, default_prefix, cmp_size,
> classified and branch_type.
> * doc/c-i386.texi: Document -malign-branch-boundary=,
> -malign-branch= and -malign-branch-prefix-size=.
> ---
>  gas/config/tc-i386.c | 1046 +++++++++++++++++++++++++++++++++++++++++-
>  gas/config/tc-i386.h |   31 ++
>  gas/doc/c-i386.texi  |   26 ++
>  3 files changed, 1100 insertions(+), 3 deletions(-)
>
> diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c
> index b62af34268..0ab6651f24 100644
> --- a/gas/config/tc-i386.c
> +++ b/gas/config/tc-i386.c
> @@ -368,6 +368,9 @@ struct _i386_insn
>      /* Has ZMM register operands.  */
>      bfd_boolean has_regzmm;
>  
> +    /* Has GOTPC or TLS relocation.  */
> +    bfd_boolean has_gotpc_tls_reloc;
> +
>      /* RM and SIB are the modrm byte and the sib byte where the
>         addressing modes of this insn are encoded.  */
>      modrm_byte rm;
> @@ -562,6 +565,8 @@ static enum flag_code flag_code;
>  static unsigned int object_64bit;
>  static unsigned int disallow_64bit_reloc;
>  static int use_rela_relocations = 0;
> +/* __tls_get_addr/___tls_get_addr symbol for TLS.  */
> +static const char *tls_get_addr;
>  
>  #if ((defined (OBJ_MAYBE_COFF) && defined (OBJ_MAYBE_AOUT)) \
>       || defined (OBJ_ELF) || defined (OBJ_MAYBE_ELF) \
> @@ -622,6 +627,21 @@ static int omit_lock_prefix = 0;
>     "lock addl $0, (%{re}sp)".  */
>  static int avoid_fence = 0;
>  
> +/* Type of the previous instruction.  */
> +static struct
> +  {
> +    segT seg;
> +    const char *file;
> +    const char *name;
> +    unsigned int line;
> +    enum last_insn_kind
> +      {
> +        last_insn_other = 0,
> +        last_insn_directive,
> +        last_insn_prefix
> +      } kind;
> +  } last_insn;
> +
>  /* 1 if the assembler should generate relax relocations.  */
>  
>  static int generate_relax_relocations
> @@ -635,6 +655,44 @@ static enum check_kind
>    }
>  sse_check, operand_check = check_warning;
>  
> +/* Non-zero if branches should be aligned within power of 2 boundary.  */
> +static int align_branch_power = 0;
> +
> +/* Types of branches to align.  */
> +enum align_branch_kind
> +  {
> +    align_branch_none = 0,
> +    align_branch_jcc = 1,
> +    align_branch_fused = 2,
> +    align_branch_jmp = 3,
> +    align_branch_call = 4,
> +    align_branch_indirect = 5,
> +    align_branch_ret = 6
> +  };
> +
> +/* Type bits of branches to align.  */
> +enum align_branch_bit
> +  {
> +    align_branch_jcc_bit = 1 << align_branch_jcc,
> +    align_branch_fused_bit = 1 << align_branch_fused,
> +    align_branch_jmp_bit = 1 << align_branch_jmp,
> +    align_branch_call_bit = 1 << align_branch_call,
> +    align_branch_indirect_bit = 1 << align_branch_indirect,
> +    align_branch_ret_bit = 1 << align_branch_ret
> +  };
> +
> +static unsigned int align_branch = (align_branch_jcc_bit
> +                                    | align_branch_fused_bit
> +                                    | align_branch_jmp_bit);
> +
> +/* The maximum padding size for fused jcc.  CMP like instruction can
> +   be 9 bytes and jcc can be 6 bytes.  Leave room just in case for
> +   prefixes.   */
> +#define MAX_FUSED_JCC_PADDING_SIZE 20
> +
> +/* The maximum number of prefixes added for an instruction.  */
> +static unsigned int align_branch_prefix_size = 5;
> +
>  /* Optimization:
>     1. Clear the REX_W bit with register operand if possible.
>     2. Above plus use 128bit vector instruction to clear the full vector
> @@ -738,12 +796,19 @@ int x86_cie_data_alignment;
>  /* Interface to relax_segment.
>     There are 3 major relax states for 386 jump insns because the
>     different types of jumps add different sizes to frags when we're
> -   figuring out what sort of jump to choose to reach a given label.  */
> +   figuring out what sort of jump to choose to reach a given label.
> +
> +   BRANCH_PADDING, BRANCH_PREFIX and FUSED_JCC_PADDING are used to align
> +   branches which are handled by md_estimate_size_before_relax() and
> +   i386_generic_table_relax_frag().  */
>  
>  /* Types.  */
>  #define UNCOND_JUMP 0
>  #define COND_JUMP 1
>  #define COND_JUMP86 2
> +#define BRANCH_PADDING 3
> +#define BRANCH_PREFIX 4
> +#define FUSED_JCC_PADDING 5
>  
>  /* Sizes.  */
>  #define CODE16 1
> @@ -1384,6 +1449,12 @@ i386_generate_nops (fragS *fragP, char *where, offsetT count, int limit)
>      case rs_fill_nop:
>      case rs_align_code:
>        break;
> +    case rs_machine_dependent:
> +      /* Allow NOP padding for jumps and calls.  */
> +      if (TYPE_FROM_RELAX_STATE (fragP->fr_subtype) == BRANCH_PADDING
> +          || TYPE_FROM_RELAX_STATE (fragP->fr_subtype) == FUSED_JCC_PADDING)
> +        break;
> +      /* Fall through.  */
>      default:
>        return;
>      }
> @@ -1528,7 +1599,7 @@ i386_generate_nops (fragS *fragP, char *where, offsetT count, int limit)
>    return;
>   }
>      }
> -  else
> +  else if (fragP->fr_type != rs_machine_dependent)
>      fragP->fr_var = count;
>  
>    if ((count / max_single_nop_size) > max_number_of_nops)
> @@ -3011,6 +3082,11 @@ md_begin (void)
>        x86_dwarf2_return_column = 8;
>        x86_cie_data_alignment = -4;
>      }
> +
> +  /* NB: FUSED_JCC_PADDING frag must have sufficient room so that it
> +     can be turned into BRANCH_PREFIX frag.  */
> +  if (align_branch_prefix_size > MAX_FUSED_JCC_PADDING_SIZE)
> +    abort ();
>  }
>  
>  void
> @@ -4536,6 +4612,17 @@ md_assemble (char *line)
>  
>    /* We are ready to output the insn.  */
>    output_insn ();
> +
> +  last_insn.seg = now_seg;
> +
> +  if (i.tm.opcode_modifier.isprefix)
> +    {
> +      last_insn.kind = last_insn_prefix;
> +      last_insn.name = i.tm.name;
> +      last_insn.file = as_where (&last_insn.line);
> +    }
> +  else
> +    last_insn.kind = last_insn_other;
>  }
>  
>  static char *
> @@ -8193,11 +8280,206 @@ encoding_length (const fragS *start_frag, offsetT start_off,
>    return len - start_off + (frag_now_ptr - frag_now->fr_literal);
>  }
>  
> +/* Return 1 for test, and, cmp, add, sub, inc and dec which may
> +   be macro-fused with conditional jumps.  */
> +
> +static int
> +maybe_fused_with_jcc_p (void)
> +{
> +  /* No RIP address.  */
> +  if (i.base_reg && i.base_reg->reg_num == RegIP)
> +    return 0;
> +
> +  /* No VEX/EVEX encoding.  */
> +  if (is_any_vex_encoding (&i.tm))
> +    return 0;
> +
> +  /* and, add, sub with destination register.  */
> +  if ((i.tm.base_opcode >= 0x20 && i.tm.base_opcode <= 0x25)
> +      || i.tm.base_opcode <= 5
> +      || (i.tm.base_opcode >= 0x28 && i.tm.base_opcode <= 0x2d)
> +      || ((i.tm.base_opcode | 3) == 0x83
> +          && ((i.tm.extension_opcode | 1) == 0x5

One more minor suggestion to reduce the number of branches
resulting from this code: just like the "| 1" here, couldn't you use

  if (((i.tm.base_opcode | 8) >= 0x28 && (i.tm.base_opcode | 8) <= 0x2d)

to cover both the 0x20...0x25 and the 0x28...0x2d range?
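
A quick way to convince oneself that the folded test matches the two
range tests in the patch is to compare them exhaustively. This is only
a sketch: the helper names below are made up for illustration and are
not part of tc-i386.c, and the opcode is assumed to be an unsigned
value.

```c
#include <stdbool.h>

/* Hypothetical helper: the "and" (0x20...0x25) and "sub" (0x28...0x2d)
   range tests as written in the patch.  */
static bool
is_and_or_sub (unsigned int opcode)
{
  return ((opcode >= 0x20 && opcode <= 0x25)
          || (opcode >= 0x28 && opcode <= 0x2d));
}

/* Hypothetical helper: the folded form.  Setting bit 3 maps
   0x20...0x25 onto 0x28...0x2d, so one range test covers both.  */
static bool
is_and_or_sub_folded (unsigned int opcode)
{
  return (opcode | 8) >= 0x28 && (opcode | 8) <= 0x2d;
}

/* Count the opcodes in [0, 0xffffff] on which the two forms disagree.  */
static unsigned int
count_range_mismatches (void)
{
  unsigned int opcode, mismatches = 0;

  for (opcode = 0; opcode <= 0xffffff; opcode++)
    if (is_and_or_sub (opcode) != is_and_or_sub_folded (opcode))
      mismatches++;
  return mismatches;
}
```

count_range_mismatches () is expected to return 0. Opcodes 0x26 and
0x27 stay excluded, since setting bit 3 turns them into 0x2e and 0x2f.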

> +              || i.tm.extension_opcode == 0x0)))
> +    return (i.types[1].bitfield.class == Reg
> +            || i.types[1].bitfield.instance == Accum);
> +
> +  /* test, cmp with any register.  */
> +  if ((i.tm.base_opcode | 1) == 0x85
> +      || (i.tm.base_opcode | 1) == 0xa9
> +      || ((i.tm.base_opcode | 1) == 0xf7
> +          && i.tm.extension_opcode == 0)
> +      || (i.tm.base_opcode >= 0x38 && i.tm.base_opcode <= 0x3d)
> +      || ((i.tm.base_opcode | 3) == 0x83
> +          && (i.tm.extension_opcode == 0x7)))
> +    return (i.types[0].bitfield.class == Reg
> +            || i.types[0].bitfield.instance == Accum
> +            || i.types[1].bitfield.class == Reg
> +            || i.types[1].bitfield.instance == Accum);
> +
> +  /* inc, dec with any register.   */
> +  if ((i.tm.cpu_flags.bitfield.cpuno64
> +       && (i.tm.base_opcode | 0xf) == 0x4f)
> +      || ((i.tm.base_opcode | 1) == 0xff
> +          && (i.tm.extension_opcode | 1) == 0x1))

Maybe

          && i.tm.extension_opcode <= 0x1)

?
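
For reference, the two forms agree only as long as the field is
unsigned; with a signed type, negative values would also satisfy
"<= 0x1". A small sketch, assuming an unsigned extension opcode and
using helper names invented for this example:

```c
#include <stdbool.h>

/* Form used in the patch: set bit 0, then compare against 1.  */
static bool
is_inc_dec_ext (unsigned int extension_opcode)
{
  return (extension_opcode | 1) == 0x1;
}

/* Simpler form; equivalent because the operand is unsigned.  */
static bool
is_inc_dec_ext_simple (unsigned int extension_opcode)
{
  return extension_opcode <= 0x1;
}

/* Count the values in [0, 0xffff] on which the two forms disagree.  */
static unsigned int
count_ext_mismatches (void)
{
  unsigned int ext, mismatches = 0;

  for (ext = 0; ext <= 0xffff; ext++)
    if (is_inc_dec_ext (ext) != is_inc_dec_ext_simple (ext))
      mismatches++;
  return mismatches;
}
```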

> +/* Return 1 if a BRANCH_PREFIX frag should be generated.  */
> +
> +static int
> +add_branch_prefix_frag_p (void)
> +{
> +  /* NB: Don't work with COND_JUMP86 without i386.  Don't add prefix
> +     to PadLock instructions since they include prefixes in opcode.  */
> +  if (!align_branch_power
> +      || !align_branch_prefix_size
> +      || now_seg == absolute_section
> +      || i.tm.cpu_flags.bitfield.cpupadlock

Didn't you confirm you'd take care here of insns other than just
the PadLock ones that include prefixes in their base_opcode? (I still
don't see what's wrong in this case, as you only mean to add
segment overrides, but that's a separate aspect.)

> @@ -8473,9 +8815,105 @@ output_insn (void)
>    if (j > 15)
>      as_warn (_("instruction length of %u bytes exceeds the limit of 15"),
>       j);
> +  else if (fragP)
> +    {
> +      /* NB: Don't add prefix with GOTPC relocation since
> +         output_disp() above depends on the fixed encoding
> +         length.  Can't add prefix with TLS relocation since
> +         it breaks TLS linker optimization.  */
> +      unsigned int max = i.has_gotpc_tls_reloc ? 0 : 15 - j;
> +      /* Prefix count on the current instruction.  */
> +      unsigned int count = i.vex.length;
> +      unsigned int k;
> +      for (k = 0; k < ARRAY_SIZE (i.prefix); k++)
> +        /* REX byte is encoded in VEX/EVEX prefix.  */
> +        if (i.prefix[k] && (k != REX_PREFIX || !i.vex.length))
> +          count++;
> +
> +      /* Count SSE prefix.  */
> +      if (!i.vex.length)
> +        switch (i.tm.opcode_length)
> +          {
> +          case 3:
> +            if (((i.tm.base_opcode >> 16) & 0xff) == 0xf)
> +              {
> +                count++;
> +                switch ((i.tm.base_opcode >> 8) & 0xff)
> +                  {
> +                  case 0x38:
> +                  case 0x3a:
> +                    count++;
> +                    break;
> +                  default:
> +                    break;
> +                  }
> +              }
> +            break;
> +          case 2:
> +            if (((i.tm.base_opcode >> 8) & 0xff) == 0xf)
> +              count++;

In particular (but not only) for this case the "SSE" in the comment
looks sufficiently misleading. How about "extended opcode maps"?
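
To illustrate why "extended opcode maps" fits better, here is a
stand-alone sketch of just the lines quoted above. The function name
and its two parameters are invented for this example; it mirrors only
the quoted fragment, not the full logic in output_insn.

```c
/* Hypothetical stand-alone version of the quoted switch: count the
   leading opcode bytes that select an extended opcode map, i.e. 0x0f
   alone, or 0x0f followed by 0x38 or 0x3a.  BASE_OPCODE holds the
   opcode bytes with the first byte in the most significant position
   and OPCODE_LENGTH is the number of opcode bytes.  */
static unsigned int
count_opcode_map_bytes (unsigned int base_opcode,
                        unsigned int opcode_length)
{
  unsigned int count = 0;

  switch (opcode_length)
    {
    case 3:
      if (((base_opcode >> 16) & 0xff) == 0x0f)
        {
          /* 0x0f escape byte.  */
          count++;
          switch ((base_opcode >> 8) & 0xff)
            {
            case 0x38:
            case 0x3a:
              /* Second escape byte of the 0x0f38/0x0f3a maps.  */
              count++;
              break;
            default:
              break;
            }
        }
      break;
    case 2:
      if (((base_opcode >> 8) & 0xff) == 0x0f)
        count++;
      break;
    default:
      break;
    }
  return count;
}
```

With this, 0x0faf (two-byte imul) counts one byte and 0x0f38f0 counts
two, and neither is an SSE insn.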

Jan

Re: V3 [PATCH 4/4] i386: Add tests for -malign-branch-boundary and -malign-branch

Egeyar Bagcioglu
In reply to this post by H.J. Lu-30
Hi HJ,

The call instructions in the 2 test cases below are on 32 byte
boundaries while -malign-branch-boundary=32 and
-malign-branch=indirect+call are set. It seems to me that this is
exactly when those test cases should fail. Am I missing something here
or is there a mistake? If the former, can you please explain to me why
these are correct?

Thanks
Egeyar


On 12/6/19 8:11 PM, H.J. Lu wrote:

> Add tests for -malign-branch-boundary, -malign-branch and
> -mbranches-within-32B-boundaries.
>
>
> diff --git a/gas/testsuite/gas/i386/align-branch-3.d b/gas/testsuite/gas/i386/align-branch-3.d
> new file mode 100644
> index 0000000000..da31b6f503
> --- /dev/null
> +++ b/gas/testsuite/gas/i386/align-branch-3.d
> @@ -0,0 +1,33 @@
> +#as: -malign-branch-boundary=32 -malign-branch=indirect+call
> +#objdump: -dw
> +
> +.*: +file format .*
> +
> +Disassembly of section .text:
> +
> +0+ <foo>:
> +   0: 64 a3 01 00 00 00     mov    %eax,%fs:0x1
> +   6: 55                   push   %ebp
> +   7: 55                   push   %ebp
> +   8: 55                   push   %ebp
> +   9: 55                   push   %ebp
> +   a: 89 e5                 mov    %esp,%ebp
> +   c: 89 75 f4             mov    %esi,-0xc\(%ebp\)
> +   f: 89 75 f4             mov    %esi,-0xc\(%ebp\)
> +  12: 89 75 f4             mov    %esi,-0xc\(%ebp\)
> +  15: 89 75 f4             mov    %esi,-0xc\(%ebp\)
> +  18: 89 75 f4             mov    %esi,-0xc\(%ebp\)
> +  1b: 89 75 f4             mov    %esi,-0xc\(%ebp\)
> +  1e: e8 fc ff ff ff       call   1f <foo\+0x1f>
> +  23: 55                   push   %ebp
> +  24: 55                   push   %ebp
> +  25: 64 a3 01 00 00 00     mov    %eax,%fs:0x1
> +  2b: 89 e5                 mov    %esp,%ebp
> +  2d: 89 75 f4             mov    %esi,-0xc\(%ebp\)
> +  30: 89 75 f4             mov    %esi,-0xc\(%ebp\)
> +  33: 89 75 f4             mov    %esi,-0xc\(%ebp\)
> +  36: 89 75 f4             mov    %esi,-0xc\(%ebp\)
> +  39: 89 75 f4             mov    %esi,-0xc\(%ebp\)
> +  3c: ff 91 00 00 00 00     call   \*0x0\(%ecx\)
> +  42: 89 75 f4             mov    %esi,-0xc\(%ebp\)
> +#pass
> diff --git a/gas/testsuite/gas/i386/align-branch-3.s b/gas/testsuite/gas/i386/align-branch-3.s
> new file mode 100644
> index 0000000000..e3e6c447c4
> --- /dev/null
> +++ b/gas/testsuite/gas/i386/align-branch-3.s
> @@ -0,0 +1,28 @@
> +  .text
> +  .globl  foo
> +  .p2align  4
> +foo:
> +  movl  %eax, %fs:0x1
> +  pushl  %ebp
> +  pushl  %ebp
> +  pushl  %ebp
> +  pushl  %ebp
> +  movl  %esp, %ebp
> +  movl  %esi, -12(%ebp)
> +  movl  %esi, -12(%ebp)
> +  movl  %esi, -12(%ebp)
> +  movl  %esi, -12(%ebp)
> +  movl  %esi, -12(%ebp)
> +  movl  %esi, -12(%ebp)
> +  call ___tls_get_addr
> +  pushl  %ebp
> +  pushl  %ebp
> +  movl  %eax, %fs:0x1
> +  movl  %esp, %ebp
> +  movl  %esi, -12(%ebp)
> +  movl  %esi, -12(%ebp)
> +  movl  %esi, -12(%ebp)
> +  movl  %esi, -12(%ebp)
> +  movl  %esi, -12(%ebp)
> +  call *___tls_get_addr@GOT(%ecx)
> +  movl  %esi, -12(%ebp)
>
> diff --git a/gas/testsuite/gas/i386/x86-64-align-branch-3.d b/gas/testsuite/gas/i386/x86-64-align-branch-3.d
> new file mode 100644
> index 0000000000..18767a7045
> --- /dev/null
> +++ b/gas/testsuite/gas/i386/x86-64-align-branch-3.d
> @@ -0,0 +1,32 @@
> +#as: -malign-branch-boundary=32 -malign-branch=indirect+call
> +#objdump: -dw
> +
> +.*: +file format .*
> +
> +Disassembly of section .text:
> +
> +0+ <foo>:
> +   0: 64 89 04 25 01 00 00 00 mov    %eax,%fs:0x1
> +   8: 55                   push   %rbp
> +   9: 55                   push   %rbp
> +   a: 55                   push   %rbp
> +   b: 55                   push   %rbp
> +   c: 48 89 e5             mov    %rsp,%rbp
> +   f: 89 75 f4             mov    %esi,-0xc\(%rbp\)
> +  12: 89 75 f4             mov    %esi,-0xc\(%rbp\)
> +  15: 89 75 f4             mov    %esi,-0xc\(%rbp\)
> +  18: 89 75 f4             mov    %esi,-0xc\(%rbp\)
> +  1b: 89 75 f4             mov    %esi,-0xc\(%rbp\)
> +  1e: e8 00 00 00 00       callq  23 <foo\+0x23>
> +  23: 55                   push   %rbp
> +  24: 55                   push   %rbp
> +  25: 64 89 04 25 01 00 00 00 mov    %eax,%fs:0x1
> +  2d: 48 89 e5             mov    %rsp,%rbp
> +  30: 89 75 f4             mov    %esi,-0xc\(%rbp\)
> +  33: 89 75 f4             mov    %esi,-0xc\(%rbp\)
> +  36: 89 75 f4             mov    %esi,-0xc\(%rbp\)
> +  39: 89 75 f4             mov    %esi,-0xc\(%rbp\)
> +  3c: 89 75 f4             mov    %esi,-0xc\(%rbp\)
> +  3f: ff 15 00 00 00 00     callq  \*0x0\(%rip\)        # 45 <foo\+0x45>
> +  45: 89 75 f4             mov    %esi,-0xc\(%rbp\)
> +#pass
> diff --git a/gas/testsuite/gas/i386/x86-64-align-branch-3.s b/gas/testsuite/gas/i386/x86-64-align-branch-3.s
> new file mode 100644
> index 0000000000..6787cdc36f
> --- /dev/null
> +++ b/gas/testsuite/gas/i386/x86-64-align-branch-3.s
> @@ -0,0 +1,27 @@
> +  .text
> +  .globl  foo
> +  .p2align  4
> +foo:
> +  movl  %eax, %fs:0x1
> +  pushq  %rbp
> +  pushq  %rbp
> +  pushq  %rbp
> +  pushq  %rbp
> +  movq  %rsp, %rbp
> +  movl  %esi, -12(%rbp)
> +  movl  %esi, -12(%rbp)
> +  movl  %esi, -12(%rbp)
> +  movl  %esi, -12(%rbp)
> +  movl  %esi, -12(%rbp)
> +  call __tls_get_addr
> +  pushq  %rbp
> +  pushq  %rbp
> +  movl  %eax, %fs:0x1
> +  movq  %rsp, %rbp
> +  movl  %esi, -12(%rbp)
> +  movl  %esi, -12(%rbp)
> +  movl  %esi, -12(%rbp)
> +  movl  %esi, -12(%rbp)
> +  movl  %esi, -12(%rbp)
> +  call *__tls_get_addr@GOTPCREL(%rip)
> +  movl  %esi, -12(%rbp)
>


Re: V3 [PATCH 4/4] i386: Add tests for -malign-branch-boundary and -malign-branch

H.J. Lu-30
On Mon, Dec 9, 2019 at 7:29 AM Egeyar Bagcioglu
<[hidden email]> wrote:
>
> Hi HJ,
>
> The call instructions in the 2 test cases below are on 32 byte
> boundaries while -malign-branch-boundary=32 and
> -malign-branch=indirect+call are set. It seems to me that this is
> exactly when those test cases should fail. Am I missing something here
> or is there a mistake? If the former, can you please explain to me why
> these are correct?

Since

call ___tls_get_addr
call *___tls_get_addr@GOT(%ecx)

may be changed by the linker, the assembler won't align them.
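
For comparison, this is roughly the padding an ordinary call at those
offsets would have needed. It is a minimal sketch, not the code in
tc-i386.c: the function name and parameters are made up, and it
assumes a branch is affected when it crosses a boundary or ends on
one.

```c
#include <assert.h>

/* Hypothetical helper: return the number of padding bytes needed in
   front of a branch of SIZE bytes at OFFSET so that it neither
   crosses nor ends on a (1 << ALIGN_POWER)-byte boundary.  */
static unsigned int
branch_padding_size (unsigned int offset, unsigned int size,
                     unsigned int align_power)
{
  unsigned int boundary = 1U << align_power;
  /* Position of the first branch byte within its aligned block.  */
  unsigned int in_block = offset & (boundary - 1);

  /* The branch has to fit strictly inside one block.  */
  assert (size > 0 && size < boundary);

  if (in_block + size >= boundary)
    /* Push the branch to the start of the next block.  */
    return boundary - in_block;
  return 0;
}
```

With -malign-branch-boundary=32 (align_power 5), the 5-byte call at
0x1e would need 2 bytes of padding, the 6-byte indirect call at 0x3c
would need 4, and the one at 0x3f in the x86-64 test would need 1.
The expected dumps show none of it because of the TLS exception.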



--
H.J.

Re: V3 [PATCH 2/4] i386: Align branches within a fixed boundary

H.J. Lu-30
In reply to this post by Jan Beulich-2
On Mon, Dec 9, 2019 at 6:54 AM Jan Beulich <[hidden email]> wrote:

>
> On 06.12.2019 20:10,  H.J. Lu  wrote:
> > Add 3 command-line options to align branches within a fixed boundary
> > with segment prefixes or NOPs:
> >
> > 1. -malign-branch-boundary=NUM aligns branches within NUM byte boundary.
> > 2. -malign-branch=TYPE[+TYPE...] specifies types of branches to align.
> > The supported branches are:
> >   a. Conditional jump.
> >   b. Fused conditional jump.
> >   c. Unconditional jump.
> >   d. Call.
> >   e. Ret.
> >   f. Indirect jump and call.
> > 3. -malign-branch-prefix-size=NUM aligns branches with NUM segment
> > prefixes per instruction.
> >
> > 3 new rs_machine_dependent frag types are added:
> >
> > 1. BRANCH_PADDING.  The variable size frag to insert NOP before branch.
> > 2. BRANCH_PREFIX.  The variable size frag to insert segment prefixes to
> > an instruction.  The choices of prefixes are:
> >    a. Use the existing segment prefix if there is one.
> >    b. Use CS segment prefix in 64-bit mode.
> >    c. In 32-bit mode, use SS segment prefix with ESP/EBP base register
> >    and use DS segment prefix without ESP/EBP base register.
> > 3. FUSED_JCC_PADDING.  The variable size frag to insert NOP before fused
> > conditional jump.
> >
> > The new rs_machine_dependent frags aren't inserted if the previous item
> > is a prefix or a constant directive, which may be used to hardcode an
> > instruction, since there is no clear instruction boundary.  Segment
> > prefixes and NOP padding are disabled before relaxable TLS relocations
> > and tls_get_addr calls to keep TLS instruction sequence unchanged.
> >
> > md_estimate_size_before_relax() and i386_generic_table_relax_frag() are
> > used to handle BRANCH_PADDING, BRANCH_PREFIX and FUSED_JCC_PADDING frags.
> > i386_generic_table_relax_frag() grows or shrinks sizes of segment prefix
> > and NOP to align the next branch frag:
> >
> > 1. First try to add segment prefixes to instructions before a branch.
> > 2. If there is no sufficient room to add segment prefixes, NOP will be
> > inserted before a branch.
> >
> >       * config/tc-i386.c (_i386_insn): Add has_gotpc_tls_reloc.
> >       (tls_get_addr): New.
> >       (last_insn): New.
> >       (align_branch_power): New.
> >       (align_branch_kind): New.
> >       (align_branch_bit): New.
> >       (align_branch): New.
> >       (MAX_FUSED_JCC_PADDING_SIZE): New.
> >       (align_branch_prefix_size): New.
> >       (BRANCH_PADDING): New.
> >       (BRANCH_PREFIX): New.
> >       (FUSED_JCC_PADDING): New.
> >       (i386_generate_nops): Support BRANCH_PADDING and FUSED_JCC_PADDING.
> >       (md_begin): Abort if align_branch_prefix_size >
> >       MAX_FUSED_JCC_PADDING_SIZE.
> >       (md_assemble): Set last_insn.
> >       (maybe_fused_with_jcc_p): New.
> >       (add_fused_jcc_padding_frag_p): New.
> >       (add_branch_prefix_frag_p): New.
> >       (add_branch_padding_frag_p): New.
> >       (output_insn): Generate a BRANCH_PADDING, FUSED_JCC_PADDING or
> >       BRANCH_PREFIX frag and terminate each frag to align branches.
> >       (output_disp): Set i.has_gotpc_tls_reloc to TRUE for GOTPC and
> >       relaxable TLS relocations.
> >       (output_imm): Likewise.
> >       (i386_next_non_empty_frag): New.
> >       (i386_next_jcc_frag): New.
> >       (i386_classify_machine_dependent_frag): New.
> >       (i386_branch_padding_size): New.
> >       (i386_generic_table_relax_frag): New.
> >       (md_estimate_size_before_relax): Handle BRANCH_PADDING,
> >       FUSED_JCC_PADDING and BRANCH_PREFIX frags.
> >       (md_convert_frag): Handle BRANCH_PADDING, BRANCH_PREFIX and
> >       FUSED_JCC_PADDING frags.
> >       (OPTION_MALIGN_BRANCH_BOUNDARY): New.
> >       (OPTION_MALIGN_BRANCH_PREFIX_SIZE): New.
> >       (OPTION_MALIGN_BRANCH): New.
> >       (md_longopts): Add -malign-branch-boundary=,
> >       -malign-branch-prefix-size= and -malign-branch=.
> >       (md_parse_option): Handle -malign-branch-boundary=,
> >       -malign-branch-prefix-size= and -malign-branch=.
> >       (md_show_usage): Display -malign-branch-boundary=,
> >       -malign-branch-prefix-size= and -malign-branch=.
> >       (i386_target_format): Set tls_get_addr.
> >       (i386_cons_align): New.
> >       * config/tc-i386.h (i386_cons_align): New.
> >       (md_cons_align): New.
> >       (i386_generic_table_relax_frag): New.
> >       (md_generic_table_relax_frag): New.
> >       (i386_tc_frag_data): Add u, padding_address, length,
> >       max_prefix_length, prefix_length, default_prefix, cmp_size,
> >       classified and branch_type.
> >       (TC_FRAG_INIT): Initialize u, padding_address, length,
> >       max_prefix_length, prefix_length, default_prefix, cmp_size,
> >       classified and branch_type.
> >       * doc/c-i386.texi: Document -malign-branch-boundary=,
> >       -malign-branch= and -malign-branch-prefix-size=.
> > ---
> >  gas/config/tc-i386.c | 1046 +++++++++++++++++++++++++++++++++++++++++-
> >  gas/config/tc-i386.h |   31 ++
> >  gas/doc/c-i386.texi  |   26 ++
> >  3 files changed, 1100 insertions(+), 3 deletions(-)
> >
> > diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c
> > index b62af34268..0ab6651f24 100644
> > --- a/gas/config/tc-i386.c
> > +++ b/gas/config/tc-i386.c
> > @@ -368,6 +368,9 @@ struct _i386_insn
> >      /* Has ZMM register operands.  */
> >      bfd_boolean has_regzmm;
> >
> > +    /* Has GOTPC or TLS relocation.  */
> > +    bfd_boolean has_gotpc_tls_reloc;
> > +
> >      /* RM and SIB are the modrm byte and the sib byte where the
> >         addressing modes of this insn are encoded.  */
> >      modrm_byte rm;
> > @@ -562,6 +565,8 @@ static enum flag_code flag_code;
> >  static unsigned int object_64bit;
> >  static unsigned int disallow_64bit_reloc;
> >  static int use_rela_relocations = 0;
> > +/* __tls_get_addr/___tls_get_addr symbol for TLS.  */
> > +static const char *tls_get_addr;
> >
> >  #if ((defined (OBJ_MAYBE_COFF) && defined (OBJ_MAYBE_AOUT)) \
> >       || defined (OBJ_ELF) || defined (OBJ_MAYBE_ELF) \
> > @@ -622,6 +627,21 @@ static int omit_lock_prefix = 0;
> >     "lock addl $0, (%{re}sp)".  */
> >  static int avoid_fence = 0;
> >
> > +/* Type of the previous instruction.  */
> > +static struct
> > +  {
> > +    segT seg;
> > +    const char *file;
> > +    const char *name;
> > +    unsigned int line;
> > +    enum last_insn_kind
> > +      {
> > +     last_insn_other = 0,
> > +     last_insn_directive,
> > +     last_insn_prefix
> > +      } kind;
> > +  } last_insn;
> > +
> >  /* 1 if the assembler should generate relax relocations.  */
> >
> >  static int generate_relax_relocations
> > @@ -635,6 +655,44 @@ static enum check_kind
> >    }
> >  sse_check, operand_check = check_warning;
> >
> > +/* Non-zero if branches should be aligned within power of 2 boundary.  */
> > +static int align_branch_power = 0;
> > +
> > +/* Types of branches to align.  */
> > +enum align_branch_kind
> > +  {
> > +    align_branch_none = 0,
> > +    align_branch_jcc = 1,
> > +    align_branch_fused = 2,
> > +    align_branch_jmp = 3,
> > +    align_branch_call = 4,
> > +    align_branch_indirect = 5,
> > +    align_branch_ret = 6
> > +  };
> > +
> > +/* Type bits of branches to align.  */
> > +enum align_branch_bit
> > +  {
> > +    align_branch_jcc_bit = 1 << align_branch_jcc,
> > +    align_branch_fused_bit = 1 << align_branch_fused,
> > +    align_branch_jmp_bit = 1 << align_branch_jmp,
> > +    align_branch_call_bit = 1 << align_branch_call,
> > +    align_branch_indirect_bit = 1 << align_branch_indirect,
> > +    align_branch_ret_bit = 1 << align_branch_ret
> > +  };
> > +
> > +static unsigned int align_branch = (align_branch_jcc_bit
> > +                                 | align_branch_fused_bit
> > +                                 | align_branch_jmp_bit);
> > +
> > +/* The maximum padding size for fused jcc.  CMP like instruction can
> > +   be 9 bytes and jcc can be 6 bytes.  Leave room just in case for
> > +   prefixes.   */
> > +#define MAX_FUSED_JCC_PADDING_SIZE 20
> > +
> > +/* The maximum number of prefixes added for an instruction.  */
> > +static unsigned int align_branch_prefix_size = 5;
> > +
> >  /* Optimization:
> >     1. Clear the REX_W bit with register operand if possible.
> >     2. Above plus use 128bit vector instruction to clear the full vector
> > @@ -738,12 +796,19 @@ int x86_cie_data_alignment;
> >  /* Interface to relax_segment.
> >     There are 3 major relax states for 386 jump insns because the
> >     different types of jumps add different sizes to frags when we're
> > -   figuring out what sort of jump to choose to reach a given label.  */
> > +   figuring out what sort of jump to choose to reach a given label.
> > +
> > +   BRANCH_PADDING, BRANCH_PREFIX and FUSED_JCC_PADDING are used to align
> > +   branches which are handled by md_estimate_size_before_relax() and
> > +   i386_generic_table_relax_frag().  */
> >
> >  /* Types.  */
> >  #define UNCOND_JUMP 0
> >  #define COND_JUMP 1
> >  #define COND_JUMP86 2
> > +#define BRANCH_PADDING 3
> > +#define BRANCH_PREFIX 4
> > +#define FUSED_JCC_PADDING 5
> >
> >  /* Sizes.  */
> >  #define CODE16       1
> > @@ -1384,6 +1449,12 @@ i386_generate_nops (fragS *fragP, char *where, offsetT count, int limit)
> >      case rs_fill_nop:
> >      case rs_align_code:
> >        break;
> > +    case rs_machine_dependent:
> > +      /* Allow NOP padding for jumps and calls.  */
> > +      if (TYPE_FROM_RELAX_STATE (fragP->fr_subtype) == BRANCH_PADDING
> > +       || TYPE_FROM_RELAX_STATE (fragP->fr_subtype) == FUSED_JCC_PADDING)
> > +     break;
> > +      /* Fall through.  */
> >      default:
> >        return;
> >      }
> > @@ -1528,7 +1599,7 @@ i386_generate_nops (fragS *fragP, char *where, offsetT count, int limit)
> >         return;
> >       }
> >      }
> > -  else
> > +  else if (fragP->fr_type != rs_machine_dependent)
> >      fragP->fr_var = count;
> >
> >    if ((count / max_single_nop_size) > max_number_of_nops)
> > @@ -3011,6 +3082,11 @@ md_begin (void)
> >        x86_dwarf2_return_column = 8;
> >        x86_cie_data_alignment = -4;
> >      }
> > +
> > +  /* NB: FUSED_JCC_PADDING frag must have sufficient room so that it
> > +     can be turned into BRANCH_PREFIX frag.  */
> > +  if (align_branch_prefix_size > MAX_FUSED_JCC_PADDING_SIZE)
> > +    abort ();
> >  }
> >
> >  void
> > @@ -4536,6 +4612,17 @@ md_assemble (char *line)
> >
> >    /* We are ready to output the insn.  */
> >    output_insn ();
> > +
> > +  last_insn.seg = now_seg;
> > +
> > +  if (i.tm.opcode_modifier.isprefix)
> > +    {
> > +      last_insn.kind = last_insn_prefix;
> > +      last_insn.name = i.tm.name;
> > +      last_insn.file = as_where (&last_insn.line);
> > +    }
> > +  else
> > +    last_insn.kind = last_insn_other;
> >  }
> >
> >  static char *
> > @@ -8193,11 +8280,206 @@ encoding_length (const fragS *start_frag, offsetT start_off,
> >    return len - start_off + (frag_now_ptr - frag_now->fr_literal);
> >  }
> >
> > +/* Return 1 for test, and, cmp, add, sub, inc and dec which may
> > +   be macro-fused with conditional jumps.  */
> > +
> > +static int
> > +maybe_fused_with_jcc_p (void)
> > +{
> > +  /* No RIP address.  */
> > +  if (i.base_reg && i.base_reg->reg_num == RegIP)
> > +    return 0;
> > +
> > +  /* No VEX/EVEX encoding.  */
> > +  if (is_any_vex_encoding (&i.tm))
> > +    return 0;
> > +
> > +  /* and, add, sub with destination register.  */
> > +  if ((i.tm.base_opcode >= 0x20 && i.tm.base_opcode <= 0x25)
> > +      || i.tm.base_opcode <= 5
> > +      || (i.tm.base_opcode >= 0x28 && i.tm.base_opcode <= 0x2d)
> > +      || ((i.tm.base_opcode | 3) == 0x83
> > +       && ((i.tm.extension_opcode | 1) == 0x5
>
> One more minor suggestion to reduce the number of branches
> resulting from this code: just like the "| 1" here, couldn't you use
>
>   if (((i.tm.base_opcode | 8) >= 0x28 && (i.tm.base_opcode | 8) <= 0x2d)

We are working on a followup patch to separate "and" from add/sub.
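
For reference, the folding suggested above can be checked in isolation. This is only a sketch and not part of the patch: the helper names are hypothetical, and the exhaustive loop covers one-byte opcode values only. It shows that setting bit 3 maps 0x20..0x25 onto 0x28..0x2d, so one range comparison covers both the "and" and the "sub" opcodes.

```c
#include <assert.h>

/* Hypothetical helpers, not part of tc-i386.c.  */

/* The two explicit range checks: and (0x20..0x25), sub (0x28..0x2d).  */
static int
in_and_sub_ranges_plain (unsigned int op)
{
  return (op >= 0x20 && op <= 0x25) || (op >= 0x28 && op <= 0x2d);
}

/* The folded form: setting bit 3 maps 0x20..0x25 onto 0x28..0x2d.  */
static int
in_and_sub_ranges_folded (unsigned int op)
{
  return (op | 8) >= 0x28 && (op | 8) <= 0x2d;
}

/* Return 1 if both forms agree for every one-byte opcode value.  */
static int
forms_agree (void)
{
  unsigned int op;

  for (op = 0; op <= 0xff; op++)
    if (in_and_sub_ranges_plain (op) != in_and_sub_ranges_folded (op))
      return 0;
  return 1;
}
```

The folded form trades two comparisons for one OR, which is the branch reduction the suggestion is after. It also ties "and" and "sub" together, so it would no longer apply once the two are handled separately.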

> > +           || i.tm.extension_opcode == 0x0)))
> > +    return (i.types[1].bitfield.class == Reg
> > +         || i.types[1].bitfield.instance == Accum);
> > +
> > +  /* test, cmp with any register.  */
> > +  if ((i.tm.base_opcode | 1) == 0x85
> > +      || (i.tm.base_opcode | 1) == 0xa9
> > +      || ((i.tm.base_opcode | 1) == 0xf7
> > +       && i.tm.extension_opcode == 0)
> > +      || (i.tm.base_opcode >= 0x38 && i.tm.base_opcode <= 0x3d)
> > +      || ((i.tm.base_opcode | 3) == 0x83
> > +       && (i.tm.extension_opcode == 0x7)))
> > +    return (i.types[0].bitfield.class == Reg
> > +         || i.types[0].bitfield.instance == Accum
> > +         || i.types[1].bitfield.class == Reg
> > +         || i.types[1].bitfield.instance == Accum);
> > +
> > +  /* inc, dec with any register.   */
> > +  if ((i.tm.cpu_flags.bitfield.cpuno64
> > +       && (i.tm.base_opcode | 0xf) == 0x4f)
> > +      || ((i.tm.base_opcode | 1) == 0xff
> > +       && (i.tm.extension_opcode | 1) == 0x1))
>
> Maybe
>
>           && i.tm.extension_opcode <= 0x1)
>
> ?

Will change it.
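
A quick way to convince oneself that the two spellings are interchangeable is sketched below. It assumes the extension opcode is held in an unsigned field; the helper names are hypothetical and EXT merely stands in for i.tm.extension_opcode.

```c
#include <assert.h>

/* Hypothetical helpers; EXT stands in for i.tm.extension_opcode and is
   assumed to be unsigned.  */

/* Form used in the patch: accept 0 (inc) and 1 (dec).  */
static int
is_inc_dec_ext_or_form (unsigned int ext)
{
  return (ext | 1) == 0x1;
}

/* Suggested form: the same set, written as a plain comparison.  */
static int
is_inc_dec_ext_le_form (unsigned int ext)
{
  return ext <= 0x1;
}

/* Return 1 if both forms agree for every value in 0..LIMIT.  */
static int
ext_forms_agree (unsigned int limit)
{
  unsigned int ext;

  for (ext = 0; ext <= limit; ext++)
    if (is_inc_dec_ext_or_form (ext) != is_inc_dec_ext_le_form (ext))
      return 0;
  return 1;
}
```

With a signed field the comparison form would also accept negative values, so the equivalence depends on the unsigned assumption stated above.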

> > +/* Return 1 if a BRANCH_PREFIX frag should be generated.  */
> > +
> > +static int
> > +add_branch_prefix_frag_p (void)
> > +{
> > +  /* NB: Don't work with COND_JUMP86 without i386.  Don't add prefix
> > +     to PadLock instructions since they include prefixes in opcode.  */
> > +  if (!align_branch_power
> > +      || !align_branch_prefix_size
> > +      || now_seg == absolute_section
> > +      || i.tm.cpu_flags.bitfield.cpupadlock
>
> Didn't you confirm you'd take care here of insns other than just
> the PadLock ones that include prefixes in their base_opcode? (I still

I added the check for i.tm.opcode_length.   PadLock is the only
exception.

> don't see what's wrong in this case, as you only mean to add
> segment overrides, but that's a separate aspect.)
>
> > @@ -8473,9 +8815,105 @@ output_insn (void)
> >         if (j > 15)
> >           as_warn (_("instruction length of %u bytes exceeds the limit of 15"),
> >                    j);
> > +       else if (fragP)
> > +         {
> > +           /* NB: Don't add prefix with GOTPC relocation since
> > +              output_disp() above depends on the fixed encoding
> > +              length.  Can't add prefix with TLS relocation since
> > +              it breaks TLS linker optimization.  */
> > +           unsigned int max = i.has_gotpc_tls_reloc ? 0 : 15 - j;
> > +           /* Prefix count on the current instruction.  */
> > +           unsigned int count = i.vex.length;
> > +           unsigned int k;
> > +           for (k = 0; k < ARRAY_SIZE (i.prefix); k++)
> > +             /* REX byte is encoded in VEX/EVEX prefix.  */
> > +             if (i.prefix[k] && (k != REX_PREFIX || !i.vex.length))
> > +               count++;
> > +
> > +           /* Count SSE prefix.  */
> > +           if (!i.vex.length)
> > +             switch (i.tm.opcode_length)
> > +               {
> > +               case 3:
> > +                 if (((i.tm.base_opcode >> 16) & 0xff) == 0xf)
> > +                   {
> > +                     count++;
> > +                     switch ((i.tm.base_opcode >> 8) & 0xff)
> > +                       {
> > +                       case 0x38:
> > +                       case 0x3a:
> > +                         count++;
> > +                         break;
> > +                       default:
> > +                         break;
> > +                       }
> > +                   }
> > +                 break;
> > +               case 2:
> > +                 if (((i.tm.base_opcode >> 8) & 0xff) == 0xf)
> > +                   count++;
>
> In particular (but not only) for this case the "SSE" in the comment
> looks sufficiently misleading. How about "extended opcode maps"?
>

Will make the change.
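
As an illustration of why "extended opcode maps" is the better wording, the quoted switch can be pulled out into a stand-alone function. This is a sketch that mirrors only the lines quoted above: the function name is hypothetical, the default arms are added here, and whatever follows the quoted fragment in the real patch is not modeled. What gets counted are the 0x0f escape byte and the 0x38/0x3a second escape byte, not SSE prefixes.

```c
#include <assert.h>

/* Hypothetical stand-alone version of the quoted switch.  Count the
   escape bytes (0x0f, and 0x38 or 0x3a after it) found in BASE_OPCODE
   for a template whose opcode length is OPCODE_LENGTH.  */
static unsigned int
opcode_map_byte_count (unsigned int base_opcode, unsigned int opcode_length)
{
  unsigned int count = 0;

  switch (opcode_length)
    {
    case 3:
      if (((base_opcode >> 16) & 0xff) == 0xf)
        {
          /* 0x0f escape byte.  */
          count++;
          switch ((base_opcode >> 8) & 0xff)
            {
            case 0x38:
            case 0x3a:
              /* Second escape byte selecting the 0f38/0f3a maps.  */
              count++;
              break;
            default:
              break;
            }
        }
      break;
    case 2:
      if (((base_opcode >> 8) & 0xff) == 0xf)
        count++;
      break;
    default:
      break;
    }
  return count;
}
```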



--
H.J.

Re: V3 [PATCH 2/4] i386: Align branches within a fixed boundary

Jan Beulich-2
On 10.12.2019 17:19,  H.J. Lu  wrote:

> On Mon, Dec 9, 2019 at 6:54 AM Jan Beulich <[hidden email]> wrote:
>> On 06.12.2019 20:10,  H.J. Lu  wrote:
>>> +/* Return 1 if a BRANCH_PREFIX frag should be generated.  */
>>> +
>>> +static int
>>> +add_branch_prefix_frag_p (void)
>>> +{
>>> +  /* NB: Don't work with COND_JUMP86 without i386.  Don't add prefix
>>> +     to PadLock instructions since they include prefixes in opcode.  */
>>> +  if (!align_branch_power
>>> +      || !align_branch_prefix_size
>>> +      || now_seg == absolute_section
>>> +      || i.tm.cpu_flags.bitfield.cpupadlock
>>
>> Didn't you confirm you'd take care here of insns other than just
>> the PadLock ones that include prefixes in their base_opcode? (I still
>
> I added the check for i.tm.opcode_length.   PadLock is the only
> exception.

I don't see any such check, nor how PadLock would be an exception.
The comment also talks about PadLock only. Please clarify.

Jan

Re: V3 [PATCH 2/4] i386: Align branches within a fixed boundary

H.J. Lu-30
On Tue, Dec 10, 2019 at 8:29 AM Jan Beulich <[hidden email]> wrote:

>
> On 10.12.2019 17:19,  H.J. Lu  wrote:
> > On Mon, Dec 9, 2019 at 6:54 AM Jan Beulich <[hidden email]> wrote:
> >> On 06.12.2019 20:10,  H.J. Lu  wrote:
> >>> +/* Return 1 if a BRANCH_PREFIX frag should be generated.  */
> >>> +
> >>> +static int
> >>> +add_branch_prefix_frag_p (void)
> >>> +{
> >>> +  /* NB: Don't work with COND_JUMP86 without i386.  Don't add prefix
> >>> +     to PadLock instructions since they include prefixes in opcode.  */
> >>> +  if (!align_branch_power
> >>> +      || !align_branch_prefix_size
> >>> +      || now_seg == absolute_section
> >>> +      || i.tm.cpu_flags.bitfield.cpupadlock
> >>
> >> Didn't you confirm you'd take care here of insns other than just
> >> the PadLock ones that include prefixes in their base_opcode? (I still
> >
> > I added the check for i.tm.opcode_length.   PadLock is the only
> > exception.
>
> I don't see any such check, nor how PadLock would be an exception.
> The comment also talks about PadLock only. Please clarify.
>
> Jan

There are

      /* Since the VEX/EVEX prefix contains the implicit prefix, we
         don't need the explicit prefix.  */
      if (!i.tm.opcode_modifier.vex && !i.tm.opcode_modifier.evex)
        {
          switch (i.tm.opcode_length)
            {
            case 3:
              if (i.tm.base_opcode & 0xff000000)
                {
                  prefix = (i.tm.base_opcode >> 24) & 0xff;
                  if (!i.tm.cpu_flags.bitfield.cpupadlock
                      || prefix != REPE_PREFIX_OPCODE
                      || (i.prefix[REP_PREFIX] != REPE_PREFIX_OPCODE))
                    add_prefix (prefix);
                }
              break;

If it isn't a padlock instruction, a prefix will be added.

--
H.J.

Re: V3 [PATCH 2/4] i386: Align branches within a fixed boundary

Jan Beulich-2
On 10.12.2019 18:26, H.J. Lu wrote:

> On Tue, Dec 10, 2019 at 8:29 AM Jan Beulich <[hidden email]> wrote:
>>
>> On 10.12.2019 17:19,  H.J. Lu  wrote:
>>> On Mon, Dec 9, 2019 at 6:54 AM Jan Beulich <[hidden email]> wrote:
>>>> On 06.12.2019 20:10,  H.J. Lu  wrote:
>>>>> +/* Return 1 if a BRANCH_PREFIX frag should be generated.  */
>>>>> +
>>>>> +static int
>>>>> +add_branch_prefix_frag_p (void)
>>>>> +{
>>>>> +  /* NB: Don't work with COND_JUMP86 without i386.  Don't add prefix
>>>>> +     to PadLock instructions since they include prefixes in opcode.  */
>>>>> +  if (!align_branch_power
>>>>> +      || !align_branch_prefix_size
>>>>> +      || now_seg == absolute_section
>>>>> +      || i.tm.cpu_flags.bitfield.cpupadlock
>>>>
>>>> Didn't you confirm you'd take care here of insns other than just
>>>> the PadLock ones that include prefixes in their base_opcode? (I still
>>>
>>> I added the check for i.tm.opcode_length.   PadLock is the only
>>> exception.
>>
>> I don't see any such check, nor how PadLock would be an exception.
>> The comment also talks about PadLock only. Please clarify.
>
> There are
>
>       /* Since the VEX/EVEX prefix contains the implicit prefix, we
>          don't need the explicit prefix.  */
>       if (!i.tm.opcode_modifier.vex && !i.tm.opcode_modifier.evex)
>         {
>           switch (i.tm.opcode_length)
>             {
>             case 3:
>               if (i.tm.base_opcode & 0xff000000)
>                 {
>                   prefix = (i.tm.base_opcode >> 24) & 0xff;
>                   if (!i.tm.cpu_flags.bitfield.cpupadlock
>                       || prefix != REPE_PREFIX_OPCODE
>                       || (i.prefix[REP_PREFIX] != REPE_PREFIX_OPCODE))
>                     add_prefix (prefix);
>                 }
>               break;
>
> If it isn't a padlock instruction, a prefix will be added.

But you've noticed the ||-s in there? Certain prefixes will be added
here even for PadLock insns. AIUI the logic here is to prevent adding
the same prefix twice, as explicitly prefixing these insns with REPE
appears to be a legal way of coding them. IOW I'm still unconvinced
that your new special case is needed, all the more so since the logic
here is entirely about REPE, which you don't mean to add anywhere.
Could you perhaps give a concrete example of what would go wrong
when this extra conditional isn't there?
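
To make the point about the ||-s concrete, the quoted condition can be written as a stand-alone predicate. This is a sketch only: the function and parameter names are hypothetical, and REPE_PREFIX_OPCODE is defined locally with the value 0xf3 that the REPE prefix byte has.

```c
#include <assert.h>

/* REPE prefix byte, defined locally for this sketch.  */
#define REPE_PREFIX_OPCODE 0xf3

/* Hypothetical stand-alone version of the quoted condition.
   IS_PADLOCK stands in for i.tm.cpu_flags.bitfield.cpupadlock,
   PREFIX for the prefix byte taken from base_opcode and
   EXPLICIT_REP for i.prefix[REP_PREFIX].  Return 1 if add_prefix
   would be called.  */
static int
should_add_implicit_prefix (int is_padlock, unsigned int prefix,
                            unsigned int explicit_rep)
{
  return (!is_padlock
          || prefix != REPE_PREFIX_OPCODE
          || explicit_rep != REPE_PREFIX_OPCODE);
}
```

The predicate is false in exactly one case: a PadLock insn whose implicit prefix is REPE and which already carries an explicit REPE. In every other case, including PadLock insns without an explicit REPE, the prefix is added.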

Jan

Re: V3 [PATCH 2/4] i386: Align branches within a fixed boundary

H.J. Lu-30
On Tue, Dec 10, 2019 at 11:52 PM Jan Beulich <[hidden email]> wrote:

>
> On 10.12.2019 18:26, H.J. Lu wrote:
> > On Tue, Dec 10, 2019 at 8:29 AM Jan Beulich <[hidden email]> wrote:
> >>
> >> On 10.12.2019 17:19,  H.J. Lu  wrote:
> >>> On Mon, Dec 9, 2019 at 6:54 AM Jan Beulich <[hidden email]> wrote:
> >>>> On 06.12.2019 20:10,  H.J. Lu  wrote:
> >>>>> +/* Return 1 if a BRANCH_PREFIX frag should be generated.  */
> >>>>> +
> >>>>> +static int
> >>>>> +add_branch_prefix_frag_p (void)
> >>>>> +{
> >>>>> +  /* NB: Don't work with COND_JUMP86 without i386.  Don't add prefix
> >>>>> +     to PadLock instructions since they include prefixes in opcode.  */
> >>>>> +  if (!align_branch_power
> >>>>> +      || !align_branch_prefix_size
> >>>>> +      || now_seg == absolute_section
> >>>>> +      || i.tm.cpu_flags.bitfield.cpupadlock
> >>>>
> >>>> Didn't you confirm you'd take care here of insns other than just
> >>>> the PadLock ones that include prefixes in their base_opcode? (I still
> >>>
> >>> I added the check for i.tm.opcode_length.   PadLock is the only
> >>> exception.
> >>
> >> I don't see any such check, nor how PadLock would be an exception.
> >> The comment also talks about PadLock only. Please clarify.
> >
> > There are
> >
> >       /* Since the VEX/EVEX prefix contains the implicit prefix, we
> >          don't need the explicit prefix.  */
> >       if (!i.tm.opcode_modifier.vex && !i.tm.opcode_modifier.evex)
> >         {
> >           switch (i.tm.opcode_length)
> >             {
> >             case 3:
> >               if (i.tm.base_opcode & 0xff000000)
> >                 {
> >                   prefix = (i.tm.base_opcode >> 24) & 0xff;
> >                   if (!i.tm.cpu_flags.bitfield.cpupadlock
> >                       || prefix != REPE_PREFIX_OPCODE
> >                       || (i.prefix[REP_PREFIX] != REPE_PREFIX_OPCODE))
> >                     add_prefix (prefix);
> >                 }
> >               break;
> >
> > If it isn't a padlock instruction, a prefix will be added.
>
> But you've noticed the ||-s in there? Certain prefixes will be added
> here even for PadLock insns. AIUI the logic here is to prevent adding
> the same prefix twice, as explicitly prefixing these insns with REPE
> appears to be a legal way of coding them. IOW I'm still unconvinced
> that your new special case is needed, all the more so since the logic
> here is entirely about REPE, which you don't mean to add anywhere.
> Could you perhaps give a concrete example of what would go wrong
> when this extra conditional isn't there?
>

There are special checks for i.tm.cpu_flags.bitfield.cpupadlock.  I
want to be conservative and not change these instructions.

--
H.J.

Re: V3 [PATCH 4/4] i386: Add tests for -malign-branch-boundary and -malign-branch

Alan Modra-3
In reply to this post by H.J. Lu-30
i686-pc-elf  +FAIL: ld-i386/align-branch-1

i386-darwin  +FAIL: gas/i386/align-branch-1a
i386-darwin  +FAIL: gas/i386/align-branch-1b
i386-darwin  +FAIL: gas/i386/align-branch-1c
i386-darwin  +FAIL: gas/i386/align-branch-1d
i386-darwin  +FAIL: gas/i386/align-branch-1e
i386-darwin  +FAIL: gas/i386/align-branch-1f
i386-darwin  +FAIL: gas/i386/align-branch-1g
i386-darwin  +FAIL: gas/i386/align-branch-1h
i386-darwin  +FAIL: gas/i386/align-branch-1i
i386-darwin  +FAIL: gas/i386/align-branch-5
i386-darwin  +FAIL: gas/i386/x86-64-align-branch-1a
i386-darwin  +FAIL: gas/i386/x86-64-align-branch-1b
i386-darwin  +FAIL: gas/i386/x86-64-align-branch-1c
i386-darwin  +FAIL: gas/i386/x86-64-align-branch-1d
i386-darwin  +FAIL: gas/i386/x86-64-align-branch-1e
i386-darwin  +FAIL: gas/i386/x86-64-align-branch-1f
i386-darwin  +FAIL: gas/i386/x86-64-align-branch-1g
i386-darwin  +FAIL: gas/i386/x86-64-align-branch-1h
i386-darwin  +FAIL: gas/i386/x86-64-align-branch-1i
i386-darwin  +FAIL: gas/i386/x86-64-align-branch-2a
i386-darwin  +FAIL: gas/i386/x86-64-align-branch-2b
i386-darwin  +FAIL: gas/i386/x86-64-align-branch-2c
i386-darwin  +FAIL: gas/i386/x86-64-align-branch-5

--
Alan Modra
Australia Development Lab, IBM

Re: V3 [PATCH 4/4] i386: Add tests for -malign-branch-boundary and -malign-branch

Hongtao Liu
On Mon, Dec 16, 2019 at 10:34 AM Alan Modra <[hidden email]> wrote:

>
> i686-pc-elf  +FAIL: ld-i386/align-branch-1
>
> i386-darwin  +FAIL: gas/i386/align-branch-1a
> i386-darwin  +FAIL: gas/i386/align-branch-1b
> i386-darwin  +FAIL: gas/i386/align-branch-1c
> i386-darwin  +FAIL: gas/i386/align-branch-1d
> i386-darwin  +FAIL: gas/i386/align-branch-1e
> i386-darwin  +FAIL: gas/i386/align-branch-1f
> i386-darwin  +FAIL: gas/i386/align-branch-1g
> i386-darwin  +FAIL: gas/i386/align-branch-1h
> i386-darwin  +FAIL: gas/i386/align-branch-1i
> i386-darwin  +FAIL: gas/i386/align-branch-5
> i386-darwin  +FAIL: gas/i386/x86-64-align-branch-1a
> i386-darwin  +FAIL: gas/i386/x86-64-align-branch-1b
> i386-darwin  +FAIL: gas/i386/x86-64-align-branch-1c
> i386-darwin  +FAIL: gas/i386/x86-64-align-branch-1d
> i386-darwin  +FAIL: gas/i386/x86-64-align-branch-1e
> i386-darwin  +FAIL: gas/i386/x86-64-align-branch-1f
> i386-darwin  +FAIL: gas/i386/x86-64-align-branch-1g
> i386-darwin  +FAIL: gas/i386/x86-64-align-branch-1h
> i386-darwin  +FAIL: gas/i386/x86-64-align-branch-1i
> i386-darwin  +FAIL: gas/i386/x86-64-align-branch-2a
> i386-darwin  +FAIL: gas/i386/x86-64-align-branch-2b
> i386-darwin  +FAIL: gas/i386/x86-64-align-branch-2c
> i386-darwin  +FAIL: gas/i386/x86-64-align-branch-5
>
> --
> Alan Modra
> Australia Development Lab, IBM

I'll take a look.

--
BR,
Hongtao

Re: V3 [PATCH 4/4] i386: Add tests for -malign-branch-boundary and -malign-branch

Hongtao Liu
On Fri, Dec 20, 2019 at 10:56 AM Hongtao Liu <[hidden email]> wrote:
>
> On Mon, Dec 16, 2019 at 10:34 AM Alan Modra <[hidden email]> wrote:
> >
> > i686-pc-elf  +FAIL: ld-i386/align-branch-1
> >

This fails because the expected address is hard-coded in the regexp:
---cut from gas.log-------
regexp_diff match failure
regexp "^ +[a-f0-9]+:   74 20                   je     8049024 <_start\+0x24>$"
line   " 80480a2:       74 20                   je     80480c4 <_start+0x24>"
----end of cut--------------

> > i386-darwin  +FAIL: gas/i386/align-branch-1a
> > i386-darwin  +FAIL: gas/i386/align-branch-1b
> > i386-darwin  +FAIL: gas/i386/align-branch-1c
> > i386-darwin  +FAIL: gas/i386/align-branch-1d
> > i386-darwin  +FAIL: gas/i386/align-branch-1e
> > i386-darwin  +FAIL: gas/i386/align-branch-1f
> > i386-darwin  +FAIL: gas/i386/align-branch-1g
> > i386-darwin  +FAIL: gas/i386/align-branch-1h
> > i386-darwin  +FAIL: gas/i386/align-branch-1i
> > i386-darwin  +FAIL: gas/i386/align-branch-5
> > i386-darwin  +FAIL: gas/i386/x86-64-align-branch-1a
> > i386-darwin  +FAIL: gas/i386/x86-64-align-branch-1b
> > i386-darwin  +FAIL: gas/i386/x86-64-align-branch-1c
> > i386-darwin  +FAIL: gas/i386/x86-64-align-branch-1d
> > i386-darwin  +FAIL: gas/i386/x86-64-align-branch-1e
> > i386-darwin  +FAIL: gas/i386/x86-64-align-branch-1f
> > i386-darwin  +FAIL: gas/i386/x86-64-align-branch-1g
> > i386-darwin  +FAIL: gas/i386/x86-64-align-branch-1h
> > i386-darwin  +FAIL: gas/i386/x86-64-align-branch-1i
> > i386-darwin  +FAIL: gas/i386/x86-64-align-branch-2a
> > i386-darwin  +FAIL: gas/i386/x86-64-align-branch-2b
> > i386-darwin  +FAIL: gas/i386/x86-64-align-branch-2c
> > i386-darwin  +FAIL: gas/i386/x86-64-align-branch-5
> >
These fail because:
1. Darwin uses labels as jump targets, so the output format differs from the original *.d files
------cut from gas.log-----------
regexp_diff match failure
regexp "^  22:  74 5e                   je     82 <foo\+0x82>$"
line   "  22:   74 5e                   je     82 <.L_2>"
------end of cut------------------

2. Darwin cannot assemble call *foo in 64-bit mode
------cut from gas.log-----------
/export/users2/liuhongt/binutils/gas/testsuite/gas/i386/x86-64-align-branch-2.s:
Assembler messages:
/export/users2/liuhongt/binutils/gas/testsuite/gas/i386/x86-64-align-branch-2.s:43:
Error: cannot represent relocation type BFD_RELOC_X86_64_32S
------end of cut-------------------

> > --
> > Alan Modra
> > Australia Development Lab, IBM
>
> I'll take a look.
>
> --
> BR,
> Hongtao

So, I updated the testcases.


--
BR,
Hongtao

0001-fix-for-i386-daerwin-and-i696-pc-elf-testsuite-fail.patch (85K)

Re: V3 [PATCH 4/4] i386: Add tests for -malign-branch-boundary and -malign-branch

H.J. Lu-30
On Tue, Dec 24, 2019 at 11:08 PM Hongtao Liu <[hidden email]> wrote:

>
> On Fri, Dec 20, 2019 at 10:56 AM Hongtao Liu <[hidden email]> wrote:
> >
> > On Mon, Dec 16, 2019 at 10:34 AM Alan Modra <[hidden email]> wrote:
> > >
> > > i686-pc-elf  +FAIL: ld-i386/align-branch-1
> > >
>
> This fails because the expected address is hard-coded in the regexp:
> ---cut from gas.log-------
> regexp_diff match failure
> regexp "^ +[a-f0-9]+:   74 20                   je     8049024 <_start\+0x24>$"
> line   " 80480a2:       74 20                   je     80480c4 <_start+0x24>"
> ----end of cut--------------
>
> > > i386-darwin  +FAIL: gas/i386/align-branch-1a
> > > i386-darwin  +FAIL: gas/i386/align-branch-1b
> > > i386-darwin  +FAIL: gas/i386/align-branch-1c
> > > i386-darwin  +FAIL: gas/i386/align-branch-1d
> > > i386-darwin  +FAIL: gas/i386/align-branch-1e
> > > i386-darwin  +FAIL: gas/i386/align-branch-1f
> > > i386-darwin  +FAIL: gas/i386/align-branch-1g
> > > i386-darwin  +FAIL: gas/i386/align-branch-1h
> > > i386-darwin  +FAIL: gas/i386/align-branch-1i
> > > i386-darwin  +FAIL: gas/i386/align-branch-5
> > > i386-darwin  +FAIL: gas/i386/x86-64-align-branch-1a
> > > i386-darwin  +FAIL: gas/i386/x86-64-align-branch-1b
> > > i386-darwin  +FAIL: gas/i386/x86-64-align-branch-1c
> > > i386-darwin  +FAIL: gas/i386/x86-64-align-branch-1d
> > > i386-darwin  +FAIL: gas/i386/x86-64-align-branch-1e
> > > i386-darwin  +FAIL: gas/i386/x86-64-align-branch-1f
> > > i386-darwin  +FAIL: gas/i386/x86-64-align-branch-1g
> > > i386-darwin  +FAIL: gas/i386/x86-64-align-branch-1h
> > > i386-darwin  +FAIL: gas/i386/x86-64-align-branch-1i
> > > i386-darwin  +FAIL: gas/i386/x86-64-align-branch-2a
> > > i386-darwin  +FAIL: gas/i386/x86-64-align-branch-2b
> > > i386-darwin  +FAIL: gas/i386/x86-64-align-branch-2c
> > > i386-darwin  +FAIL: gas/i386/x86-64-align-branch-5
> > >
> These fail because:
> 1. Darwin uses labels as jump targets, so the output format differs from the original *.d files
> ------cut from gas.log-----------
> regexp_diff match failure
> regexp "^  22:  74 5e                   je     82 <foo\+0x82>$"
> line   "  22:   74 5e                   je     82 <.L_2>"
> ------end of cut------------------
>
> 2. Darwin cannot assemble call *foo in 64-bit mode
> ------cut from gas.log-----------
> /export/users2/liuhongt/binutils/gas/testsuite/gas/i386/x86-64-align-branch-2.s:
> Assembler messages:
> /export/users2/liuhongt/binutils/gas/testsuite/gas/i386/x86-64-align-branch-2.s:43:
> Error: cannot represent relocation type BFD_RELOC_X86_64_32S
> ------end of cut-------------------
>
> > > --
> > > Alan Modra
> > > Australia Development Lab, IBM
> >
> > I'll take a look.
> >
> > --
> > BR,
> > Hongtao
>
> So, I updated the testcases.
>
This is the patch I am checking in.

Thanks.

--
H.J.

0001-x86-Updated-align-branch-tests-for-Darwin-and-i686-p.patch (87K)

Re: V3 [PATCH 4/4] i386: Add tests for -malign-branch-boundary and -malign-branch

Jan Beulich-2
On 14.01.2020 18:02,  H.J. Lu  wrote:

> --- a/ld/testsuite/ld-i386/align-branch-1.d
> +++ b/ld/testsuite/ld-i386/align-branch-1.d
> @@ -10,16 +10,16 @@ Disassembly of section .text:
>  
>  [a-f0-9]+ <_start>:
>   +[a-f0-9]+: 85 d2                 test   %edx,%edx
> - +[a-f0-9]+: 74 20                 je     8049024 <_start\+0x24>
> + +[a-f0-9]+: 74 20                 je     +[a-f0-9]+ <_start\+0x24>
>   +[a-f0-9]+: 85 d2                 test   %edx,%edx
> - +[a-f0-9]+: 74 1c                 je     8049024 <_start\+0x24>
> + +[a-f0-9]+: 74 1c                 je     +[a-f0-9]+ <_start\+0x24>
>   +[a-f0-9]+: 85 ff                 test   %edi,%edi
> - +[a-f0-9]+: 74 18                 je     8049024 <_start\+0x24>
> + +[a-f0-9]+: 74 18                 je     +[a-f0-9]+ <_start\+0x24>
>   +[a-f0-9]+: 65 a1 00 00 00 00     mov    %gs:0x0,%eax
>   +[a-f0-9]+: 90                   nop
>   +[a-f0-9]+: 8d 74 26 00           lea    0x0\(%esi,%eiz,1\),%esi
>   +[a-f0-9]+: 3e 3e 3e 8b 90 fc ff ff ff ds ds mov %ds:-0x4\(%eax\),%edx
>   +[a-f0-9]+: 85 d2                 test   %edx,%edx
> - +[a-f0-9]+: 74 00                 je     8049024 <_start\+0x24>
> + +[a-f0-9]+: 74 00                 je     +[a-f0-9]+ <_start\+0x24>

Are the + characters before the [ ones really needed here?
They certainly look oddly placed.
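
The effect of that + can be tried outside the testsuite. The sketch below assumes that POSIX extended regular expressions behave like the Tcl regexps used by the .d files for these particular constructs. The helper and string names are hypothetical, and only the operand part of the line is modeled, with five plain spaces after "je" standing in for whatever whitespace objdump emits.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Return 1 if LINE matches the POSIX extended regexp PATTERN, 0 if it
   does not, and -1 if PATTERN fails to compile.  */
static int
line_matches (const char *pattern, const char *line)
{
  regex_t re;
  int rc;

  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return -1;
  rc = regexec (&re, line, 0, NULL, 0);
  regfree (&re);
  return rc == 0;
}

/* Operand part of the line reported for i686-pc-elf.  */
static const char actual[] = "je     80480c4 <_start+0x24>";

/* Old expectation with the hard-coded address.  */
static const char hard_coded[] = "je     8049024 <_start\\+0x24>$";

/* Updated expectation; the + after the spaces quantifies the last space.  */
static const char with_plus[] = "je     +[a-f0-9]+ <_start\\+0x24>$";

/* Same expectation without that quantifier.  */
static const char without_plus[] = "je     [a-f0-9]+ <_start\\+0x24>$";
```

Under these assumptions the hard-coded pattern rejects the i686-pc-elf line while both wildcard patterns accept it. The + only matters if the amount of whitespace before the address can vary.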

Jan

Re: V3 [PATCH 4/4] i386: Add tests for -malign-branch-boundary and -malign-branch

H.J. Lu-30
On Tue, Jan 14, 2020 at 9:08 AM Jan Beulich <[hidden email]> wrote:

>
> On 14.01.2020 18:02,  H.J. Lu  wrote:
> > --- a/ld/testsuite/ld-i386/align-branch-1.d
> > +++ b/ld/testsuite/ld-i386/align-branch-1.d
> > @@ -10,16 +10,16 @@ Disassembly of section .text:
> >
> >  [a-f0-9]+ <_start>:
> >   +[a-f0-9]+: 85 d2                   test   %edx,%edx
> > - +[a-f0-9]+: 74 20                   je     8049024 <_start\+0x24>
> > + +[a-f0-9]+: 74 20                   je     +[a-f0-9]+ <_start\+0x24>
> >   +[a-f0-9]+: 85 d2                   test   %edx,%edx
> > - +[a-f0-9]+: 74 1c                   je     8049024 <_start\+0x24>
> > + +[a-f0-9]+: 74 1c                   je     +[a-f0-9]+ <_start\+0x24>
> >   +[a-f0-9]+: 85 ff                   test   %edi,%edi
> > - +[a-f0-9]+: 74 18                   je     8049024 <_start\+0x24>
> > + +[a-f0-9]+: 74 18                   je     +[a-f0-9]+ <_start\+0x24>
> >   +[a-f0-9]+: 65 a1 00 00 00 00       mov    %gs:0x0,%eax
> >   +[a-f0-9]+: 90                      nop
> >   +[a-f0-9]+: 8d 74 26 00             lea    0x0\(%esi,%eiz,1\),%esi
> >   +[a-f0-9]+: 3e 3e 3e 8b 90 fc ff ff ff      ds ds mov %ds:-0x4\(%eax\),%edx
> >   +[a-f0-9]+: 85 d2                   test   %edx,%edx
> > - +[a-f0-9]+: 74 00                   je     8049024 <_start\+0x24>
> > + +[a-f0-9]+: 74 00                   je     +[a-f0-9]+ <_start\+0x24>
>
> Are the + characters before the [ ones really needed here?
> They certainly look oddly placed.
>

Feel free to remove them.

Thanks.

--
H.J.