Re: [discuss] small challenge for instruction selection

H.J. Lu-27
On Wed, Jun 15, 2005 at 05:29:07PM -0700, Siddha, Suresh B wrote:

> On Tue, Jun 14, 2005 at 11:15:33PM +0200, Andi Kleen wrote:
> > > movl $0x80706050,0x40302010(%rdi)
> > > ret $0xb0a0
> > >
> > > Is 3 bytes overhead with 8+2 bytes contiguous.
> >
> > Nice. Thanks Zachary. Any other calls? :)
>
> with 2 bytes overhead.
>
> // mov    %eax,0x8070605040302010
>         __asm__ __volatile__ ( ".byte 0xa3; .quad  0x8070605040302010");
>         __asm__ __volatile__ ( "ret $0xa090");
>
> Assembler is not generating the intended code when I use the mnemonic form
> for the first asm stmt. Disassembly is fine though.
>

I opened a bug:

http://sources.redhat.com/bugzilla/show_bug.cgi?id=1013


H.J.
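
The two byte counts quoted above can be checked by writing the encodings out by hand. The following is a minimal sketch, not anything posted in the thread: the helper names (put_le, encode_movl_ret, encode_moffs_ret) are made up for illustration, and the opcode bytes are the standard x86-64 encodings for the instructions quoted (C7 /0 for mov imm32 to r/m32, A3 for mov %eax to moffs, C2 for ret imm16).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Append the low n bytes of v to buf in little-endian order (hypothetical helper). */
static void put_le(uint8_t *buf, size_t *len, uint64_t v, int n)
{
    assert(n >= 1 && n <= 8);
    for (int i = 0; i < n; i++)
        buf[(*len)++] = (uint8_t)(v >> (8 * i));
}

/* movl $imm, disp(%rdi); ret $pop -- opcode bytes C7, 87 and C2 are overhead. */
static size_t encode_movl_ret(uint8_t *buf, uint32_t disp, uint32_t imm, uint16_t pop)
{
    size_t len = 0;
    buf[len++] = 0xC7;          /* mov imm32 -> r/m32 */
    buf[len++] = 0x87;          /* ModRM: mod=10 (disp32), reg=/0, rm=111 (%rdi) */
    put_le(buf, &len, disp, 4); /* 4 payload bytes */
    put_le(buf, &len, imm, 4);  /* 4 more, contiguous with the displacement */
    buf[len++] = 0xC2;          /* ret imm16 */
    put_le(buf, &len, pop, 2);  /* 2 payload bytes */
    return len;                 /* 13 bytes: 3 overhead + 8 + 2 payload */
}

/* mov %eax, addr (moffs64 form); ret $pop -- only A3 and C2 are overhead. */
static size_t encode_moffs_ret(uint8_t *buf, uint64_t addr, uint16_t pop)
{
    size_t len = 0;
    buf[len++] = 0xA3;          /* mov %eax -> moffs (64-bit address in 64-bit mode) */
    put_le(buf, &len, addr, 8); /* 8 payload bytes */
    buf[len++] = 0xC2;          /* ret imm16 */
    put_le(buf, &len, pop, 2);  /* 2 payload bytes */
    return len;                 /* 12 bytes: 2 overhead + 8 + 2 payload */
}
```

Called with the constants from the thread, the first helper fills 13 bytes and the second 12, which matches the "3 bytes overhead" and "2 bytes overhead" figures for the same 8+2 bytes of payload.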

Jan Beulich
>>> "H. J. Lu" <[hidden email]> 16.06.05 03:11:08 >>>
On Wed, Jun 15, 2005 at 05:29:07PM -0700, Siddha, Suresh B wrote:

> On Tue, Jun 14, 2005 at 11:15:33PM +0200, Andi Kleen wrote:
> > > movl $0x80706050,0x40302010(%rdi)
> > > ret $0xb0a0
> > >
> > > Is 3 bytes overhead with 8+2 bytes contiguous.
> >
> > Nice. Thanks Zachary. Any other calls? :)
>
> with 2 bytes overhead.
>
> // mov    %eax,0x8070605040302010
>         __asm__ __volatile__ ( ".byte 0xa3; .quad  0x8070605040302010");
>         __asm__ __volatile__ ( "ret $0xa090");
>
> Assembler is not generating the intended code when I use the mnemonic form
> for the first asm stmt. Disassembly is fine though.
>
>
>I opened a bug:
>
>http://sources.redhat.com/bugzilla/show_bug.cgi?id=1013 

I don't think that's a bug: include/opcode/i386.h explicitly disallows this mov form in 64-bit mode; movabs is to be used here instead. Otherwise symbolic operands would be ambiguous: the same mov could be encoded with either a 64-bit displacement or a sign-extended 32-bit one, and the programmer would have no way to indicate which one to choose. So you have to use movabs to make clear that you want a 64-bit displacement, and plain mov when you want a 32-bit one (which is, very reasonably, the default).

Jan
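
To make the size difference between the two choices concrete, here is a minimal sketch of both encodings of "store %eax to an absolute address". It is an illustration only: fits_simm32 and encode_store_eax are hypothetical names, and the force_abs64 flag merely stands in for the programmer writing movabs instead of mov. The 32-bit form is assumed to use the usual ModRM+SIB absolute-address encoding (89 04 25 disp32).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if addr can be produced by sign-extending a 32-bit displacement. */
static bool fits_simm32(uint64_t addr)
{
    return addr <= 0x7fffffffull || addr >= 0xffffffff80000000ull;
}

/* Encode "store %eax to addr"; returns the number of bytes written. */
static size_t encode_store_eax(uint8_t *buf, uint64_t addr, bool force_abs64)
{
    size_t len = 0;
    int disp_bytes;

    if (force_abs64) {
        buf[len++] = 0xA3;          /* moffs64 form, what movabs selects */
        disp_bytes = 8;
    } else {
        assert(fits_simm32(addr));  /* this sketch aborts if the address does not fit */
        buf[len++] = 0x89;          /* mov r32 -> r/m32 */
        buf[len++] = 0x04;          /* ModRM: mod=00, reg=%eax, rm=100 (SIB follows) */
        buf[len++] = 0x25;          /* SIB: no index, base=101 (disp32 only) */
        disp_bytes = 4;
    }
    for (int i = 0; i < disp_bytes; i++)
        buf[len++] = (uint8_t)(addr >> (8 * i));
    return len;
}
```

For an address such as 0x12345678 both forms are valid, at 7 and 9 bytes respectively. For a symbol whose final value is unknown at assembly time, the assembler cannot tell which one is meant, which is the ambiguity described above.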

H.J. Lu-27
On Thu, Jun 16, 2005 at 01:06:13AM -0600, Jan Beulich wrote:

> >>> "H. J. Lu" <[hidden email]> 16.06.05 03:11:08 >>>
> On Wed, Jun 15, 2005 at 05:29:07PM -0700, Siddha, Suresh B wrote:
> > On Tue, Jun 14, 2005 at 11:15:33PM +0200, Andi Kleen wrote:
> > > > movl $0x80706050,0x40302010(%rdi)
> > > > ret $0xb0a0
> > > >
> > > > Is 3 bytes overhead with 8+2 bytes contiguous.
> > >
> > > Nice. Thanks Zachary. Any other calls? :)
> >
> > with 2 bytes overhead.
> >
> > // mov    %eax,0x8070605040302010
> >         __asm__ __volatile__ ( ".byte 0xa3; .quad  0x8070605040302010");
> >         __asm__ __volatile__ ( "ret $0xa090");
> >
> > Assembler is not generating the intended code when I use the mnemonic form
> > for the first asm stmt. Disassembly is fine though.
> >
> >
> >I opened a bug:
> >
> >http://sources.redhat.com/bugzilla/show_bug.cgi?id=1013 
>
> I don't think that's a bug: include/opcode/i386.h explicitly disallows this mov form in 64-bit mode; movabs is to be used here instead. Otherwise symbolic operands would be ambiguous: the same mov could be encoded with either a 64-bit displacement or a sign-extended 32-bit one, and the programmer would have no way to indicate which one to choose. So you have to use movabs to make clear that you want a 64-bit displacement, and plain mov when you want a 32-bit one (which is, very reasonably, the default).
>

If you look at i386.h closely, there are

/* In the 64bit mode the short form mov immediate is redefined to have
   64bit displacement value.  */
{ "mov",   2,   0xa0, X, CpuNo64,bwl_Suf|D|W,                   { Disp16|Disp32, Acc, 0 } },
{ "mov",   2,   0x88, X, 0,      bwlq_Suf|D|W|Modrm,            { Reg, Reg|AnyMem, 0} },
/* In the 64bit mode the short form mov immediate is redefined to have
   64bit displacement value.  */
{ "mov",   2,   0xb0, X, 0,      bwl_Suf|W|ShortForm,           { EncImm, Reg8|Reg16|Reg32, 0 } },
{ "mov",   2,   0xc6, 0, 0,      bwlq_Suf|W|Modrm,              { EncImm, Reg|AnyMem, 0 } },
{ "mov",   2,   0xb0, X, Cpu64,  q_Suf|W|ShortForm,             { Imm64, Reg64, 0 } },
...
{ "movabs",2,   0xa0, X, Cpu64, bwlq_Suf|D|W,                   { Disp64, Acc, 0 } },
{ "movabs",2,   0xb0, X, Cpu64, q_Suf|W|ShortForm,              { Imm64, Reg64, 0 } },

I think there is an oversight. We have

{ "mov",   2,   0xb0, X, Cpu64,  q_Suf|W|ShortForm,             { Imm64, Reg64, 0 } },
...
{ "movabs",2,   0xb0, X, Cpu64, q_Suf|W|ShortForm,              { Imm64, Reg64, 0 } },

But we just missed

{ "mov",2,   0xa0, X, Cpu64, bwlq_Suf|D|W,                   { Disp64, Acc, 0 } },

I will see what I can do.


H.J.

Jan Beulich
In reply to this post by H.J. Lu-27
>If you look at i386.h closely, there are
>
>/* In the 64bit mode the short form mov immediate is redefined to have
>   64bit displacement value.  */
>{ "mov",   2,   0xa0, X, CpuNo64,bwl_Suf|D|W,                   { Disp16|Disp32, Acc, 0 } },
>{ "mov",   2,   0x88, X, 0,      bwlq_Suf|D|W|Modrm,            { Reg, Reg|AnyMem, 0} },
>/* In the 64bit mode the short form mov immediate is redefined to have
>   64bit displacement value.  */
>{ "mov",   2,   0xb0, X, 0,      bwl_Suf|W|ShortForm,           { EncImm, Reg8|Reg16|Reg32, 0 } },
>{ "mov",   2,   0xc6, 0, 0,      bwlq_Suf|W|Modrm,              { EncImm, Reg|AnyMem, 0 } },
>{ "mov",   2,   0xb0, X, Cpu64,  q_Suf|W|ShortForm,             { Imm64, Reg64, 0 } },
>...
>{ "movabs",2,   0xa0, X, Cpu64, bwlq_Suf|D|W,                   { Disp64, Acc, 0 } },
>{ "movabs",2,   0xb0, X, Cpu64, q_Suf|W|ShortForm,              { Imm64, Reg64, 0 } },
>
>I think there is an oversight. We have
>
>{ "mov",   2,   0xb0, X, Cpu64,  q_Suf|W|ShortForm,             { Imm64, Reg64, 0 } },
>...
>{ "movabs",2,   0xb0, X, Cpu64, q_Suf|W|ShortForm,              { Imm64, Reg64, 0 } },
>
>But we just missed
>
>{ "mov",2,   0xa0, X, Cpu64, bwlq_Suf|D|W,                   { Disp64, Acc, 0 } },
>
>I will see what I can do.

Then it could as well be

{ "mov",   2,   0xa0, X, 0,bwl_Suf|D|W,                   { Disp16|Disp32|Disp64, Acc, 0 } }

at the top of the table. But as I tried to outline before, that'd (depending on its placement) either hide or be hidden by

{ "mov",   2,   0x88, X, 0,      bwlq_Suf|D|W|Modrm,            { Reg, Reg|AnyMem, 0} }

resulting either in the same behavior as now, or in every mov to/from the accumulator that has no base and/or index register being encoded with a 64-bit displacement, which needlessly increases code size for the common case.

A couple of years back I already tried to do what you're trying now, but had to give up for the reasons outlined. It would be very nice if you can make it work somehow without ill side effects...

Jan
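
The "hide or be hidden" effect can be shown with a toy model of an ordered template table. This is a deliberately simplified sketch and not the assembler's actual matching code: it assumes the first template that accepts an operand wins, and the struct, the operand classes (DISP32, DISP64, ANYMEM as plain bit flags) and the two tables are all hypothetical stand-ins for the entries quoted above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical operand classes, loosely named after the flags in the table. */
enum { DISP32 = 1, DISP64 = 2, ANYMEM = 4 };

struct tmpl {
    unsigned opcode;   /* base opcode of the template */
    unsigned accepts;  /* memory operand classes this template takes */
};

/* Assumed rule: return the opcode of the FIRST template accepting the operand. */
static int first_match(const struct tmpl *table, size_t n, unsigned operand)
{
    assert(table != NULL);
    for (size_t i = 0; i < n; i++)
        if (table[i].accepts & operand)
            return (int)table[i].opcode;
    return -1;
}

/* An absolute address with no base/index fits all three classes. */
static const unsigned ABS_OPERAND = DISP32 | DISP64 | ANYMEM;

/* Placement 1: the combined 0xa0 entry sits before the ModRM entry. */
static const struct tmpl moffs_first[] = {
    { 0xa0, DISP32 | DISP64 },
    { 0x88, ANYMEM },
};

/* Placement 2: the ModRM entry comes first, as in the current table. */
static const struct tmpl modrm_first[] = {
    { 0x88, ANYMEM },
    { 0xa0, DISP32 | DISP64 },
};
```

Under that assumption, first_match(moffs_first, 2, ABS_OPERAND) yields 0xa0, so every absolute accumulator mov takes the long form, while first_match(modrm_first, 2, ABS_OPERAND) yields 0x88 and the 0xa0 entry is never reached. Those are the two outcomes described above.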