[PATCH roland/arm-strlen] Make armv6t2 strlen work in ARM mode too.

Roland McGrath
I tested that this has no effect (assembled code wholly unchanged) on
arm-linux-gnueabihf.  I tested that the ARM-mode support actually works by
hacking in "#define NO_THUMB" at the top and verifying no failures from
'make check subdirs=string'.

Incidentally, assembly writers really ought to write more comments!  For
example, I deduced the only plausible reason for using an explicit bne.w
and added a comment about it, but it is exactly the sort of non-obvious
subtle microoptimization that desperately needed clear comments in the
first place.


OK for trunk?


Thanks,
Roland


ports/ChangeLog.arm
2013-08-30  Roland McGrath  <[hidden email]>

        * sysdeps/arm/armv6t2/strlen.S: Include <arm-features.h> first thing.
        [NO_THUMB]: Adapt code for ARM mode.

--- a/ports/sysdeps/arm/armv6t2/strlen.S
+++ b/ports/sysdeps/arm/armv6t2/strlen.S
@@ -21,6 +21,7 @@
 
  */
 
+#include <arm-features.h>               /* This might #define NO_THUMB.  */
 #include <sysdep.h>
 
 #ifdef __ARMEB__
@@ -31,9 +32,24 @@
 #define S2HI lsl
 #endif
 
- /* This code requires Thumb.  */
+#ifndef NO_THUMB
+/* This code is best on Thumb.  */
  .thumb
- .syntax unified
+#else
+/* Using bne.w explicitly is desirable in Thumb mode because it helps
+   align the following label without a nop.  In ARM mode there is no
+   such difference.  */
+.macro bne.w label
+ bne \label
+.endm
+
+/* This clobbers the condition codes, which the real Thumb cbnz instruction
+   does not do.  But it doesn't matter for any of the uses here.  */
+.macro cbnz reg, label
+ cmp \reg, #0
+ bne \label
+.endm
+#endif
 
 /* Parameters and result.  */
 #define srcin r0
@@ -130,9 +146,16 @@ ENTRY(strlen)
  tst tmp1, #4
  pld [src, #64]
  S2HI tmp2, const_m1, tmp2
+#ifdef NO_THUMB
+ mvn tmp1, tmp2
+ orr data1a, data1a, tmp1
+ itt ne
+ orrne data1b, data1b, tmp1
+#else
  orn data1a, data1a, tmp2
  itt ne
  ornne data1b, data1b, tmp2
+#endif
  movne data1a, const_m1
  mov const_0, #0
  b .Lstart_realigned
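
Note on the NO_THUMB hunk above: ORN (OR with complemented operand) is a
Thumb-2 instruction with no ARM-mode encoding, so the ARM path splits it
into MVN followed by ORR, using tmp1 as scratch.  Because the MVN is the
non-flag-setting form, the result of the earlier "tst tmp1, #4" still
governs the following "ne" conditionals.  The C below is only an
illustrative sketch of that equivalence; the function names orn_thumb2 and
orn_arm are hypothetical and do not appear in strlen.S.

```c
#include <stdint.h>

/* Thumb-2 form: "orn rd, rn, rm" computes rn | ~rm in one instruction.
   (Hypothetical helper name, for illustration only.)  */
static uint32_t orn_thumb2(uint32_t rn, uint32_t rm)
{
    return rn | ~rm;
}

/* ARM-mode form used under NO_THUMB: complement into a scratch
   register first, then OR it in.
   (Hypothetical helper name, for illustration only.)  */
static uint32_t orn_arm(uint32_t rn, uint32_t rm)
{
    uint32_t tmp = ~rm;   /* mvn tmp1, tmp2 */
    return rn | tmp;      /* orr data1a, data1a, tmp1 */
}
```

Both helpers return the same value for every input, which is consistent
with the report that the string tests pass with NO_THUMB defined.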

Re: [PATCH roland/arm-strlen] Make armv6t2 strlen work in ARM mode too.

Joseph Myers
On Fri, 30 Aug 2013, Roland McGrath wrote:

> I tested that this has no effect (assembled code wholly unchanged) on
> arm-linux-gnueabihf.  I tested that the ARM-mode support actually works by
> hacking in "#define NO_THUMB" at the top and verifying no failures from
> 'make check subdirs=string'.
>
> Incidentally, assembly writers really ought to write more comments!  For
> example, I deduced the only plausible reason for using an explicit bne.w
> and added a comment about it, but it is exactly the sort of non-obvious
> subtle microoptimization that desperately needed clear comments in the
> first place.
>
>
> OK for trunk?

OK.

--
Joseph S. Myers
[hidden email]