hppa build broken

Steve Ellcey

I was wondering if someone could check in this patch.  Without it, I
cannot build the latest binutils on the hppa SOM target.  It looks like
som.c was missed when Fred Fish added NAME##_find_inliner_info.

I verified that I could build again on hppa1.1-hp-hpux11.00 after adding
this patch.
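
As background, here is a minimal sketch of why one missing backend macro breaks the whole build. It assumes the target vector is filled in by a macro that pastes the backend prefix onto a fixed list of names. `JUMP_TABLE_SYMBOLS`, `struct target_vec`, `som_vec_sketch`, and `generic_nosymbols_find_inliner_info` are hypothetical stand-ins for illustration, not the real BFD declarations.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a generic "no symbols" helper.  */
static bool
generic_nosymbols_find_inliner_info (const char **file, unsigned int *line)
{
  *file = NULL;
  *line = 0;
  return false;
}

/* Hypothetical, cut-down target vector with a single entry.  */
struct target_vec
{
  bool (*find_inliner_info) (const char **, unsigned int *);
};

/* Token pasting: NAME##_find_inliner_info becomes som_find_inliner_info
   when NAME is som.  If the backend never defines that name, the
   initializer below fails to compile.  */
#define JUMP_TABLE_SYMBOLS(NAME) NAME##_find_inliner_info

/* The one-line fix: map the backend name to the generic helper.  */
#define som_find_inliner_info generic_nosymbols_find_inliner_info

static const struct target_vec som_vec_sketch =
{
  JUMP_TABLE_SYMBOLS (som)
};
```

Removing the `#define som_find_inliner_info` line from this sketch reproduces the kind of failure described above: the pasted name no longer resolves to anything.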

Steve Ellcey
[hidden email]


bfd/ChangeLog

2005-06-14  Steve Ellcey  <[hidden email]>

        * som.c (som_find_inliner_info): New.


*** src.orig/bfd/som.c Tue Jun 14 08:23:14 2005
--- src/bfd/som.c Tue Jun 14 08:22:57 2005
*************** som_bfd_link_split_section (bfd *abfd AT
*** 6246,6251 ****
--- 6246,6252 ----
  #define som_bfd_merge_private_bfd_data _bfd_generic_bfd_merge_private_bfd_data
  #define som_bfd_copy_private_header_data _bfd_generic_bfd_copy_private_header_data
  #define som_bfd_set_private_flags _bfd_generic_bfd_set_private_flags
+ #define som_find_inliner_info _bfd_nosymbols_find_inliner_info
 
  const bfd_target som_vec =
  {

Re: hppa build broken

John David Anglin-4
> 2005-06-14  Steve Ellcey  <[hidden email]>
>
> * som.c (som_find_inliner_info): New.

Ok.  At the moment, I don't see a reason to provide a som
implementation.

Dave
--
J. David Anglin                                  [hidden email]
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)

Re: hppa build broken

John David Anglin-4
In reply to this post by Steve Ellcey
> 2005-06-14  Steve Ellcey  <[hidden email]>
>
> * som.c (som_find_inliner_info): New.

I've installed the change.

On a different subject, the recent changes to fix the handling
of .block appear to have exposed a hpux problem.  The block1.s
hppa/parse test takes about 15 minutes real time on my A550.  The
user and system times are only a few seconds.  It's a bit strange
but the test doesn't time out, apparently because the assembler
process spends all its time sleeping.  Under linux, the real time
is a few seconds.

Dave
--
J. David Anglin                                  [hidden email]
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)

Re: hppa build broken

Alan Modra
On Tue, Jun 14, 2005 at 08:44:32PM -0400, John David Anglin wrote:
> On a different subject, the recent changes to fix the handling
> of .block appear to have exposed a hpux problem.  The block1.s
> hppa/parse test takes about 15 minutes real time on my A550.  The
> user and system times are only a few seconds.  It's a bit strange
> but the test doesn't time out, apparently because the assembler
> process spends all its time sleeping.  Under linux, the real time
> is a few seconds.

You might like to see whether increasing the buffer size at write.c:1146
helps poor HPUX file I/O.  4k would probably be a better size.

--
Alan Modra
IBM OzLabs - Linux Technology Centre

Re: hppa build broken

John David Anglin-4
> You might like to see whether increasing the buffer size at write.c:1146
> helps poor HPUX file I/O.  4k would probably be a better size.

Doesn't help:

4k:
real    16m13.524s
user    0m1.250s
sys     0m21.310s

64k:
real    16m18.884s
user    0m0.230s
sys     0m21.430s

vmstat indicates ~ 300 xfer/sec for the drive.  At 4k per transfer,
that's about 1000s or 16m.  Why linux would be so much faster isn't
clear, although the drive and interface used there are faster.
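
The estimate can be reproduced with a few lines of C. This is only a sketch: it assumes the output is about 1 GiB, as in the `.block 0x3fffffff` test case that appears elsewhere in this thread, and the function name `estimate_seconds` is made up for illustration.

```c
#include <stdint.h>

/* Estimated wall-clock seconds to write total_bytes when the drive
   completes xfers_per_sec transfers of bytes_per_xfer bytes each.  */
static double
estimate_seconds (uint64_t total_bytes, uint64_t bytes_per_xfer,
                  double xfers_per_sec)
{
  /* Round up so a partial final transfer still counts.  */
  uint64_t xfers = (total_bytes + bytes_per_xfer - 1) / bytes_per_xfer;
  return (double) xfers / xfers_per_sec;
}
```

With those assumptions, `estimate_seconds (0x3fffffff, 4096, 300.0)` comes to roughly 870 seconds, or about 14.5 minutes, which is the same order as the 16 minutes measured.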

I'd also previously tried changing pa_block to output 8MB blocks
followed by the residual (i.e., using the old way but limiting
memory usage to 8MB).  While the user and sys times went down, the
real time actually got worse.  The old code suffers from the same
problem.

Dave
--
J. David Anglin                                  [hidden email]
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)

Re: hppa build broken

Steve Ellcey
Dave,

How about this patch to pa_block.  It seems to work well for me
and causes no regressions.  I got an XPASS on hppa/parse/block1.s.  The
bigger I make BFRAG_SIZE, the faster it runs, but the more memory it
takes.  On my HP-UX PA workstation I could run block1.s in about 2.5
minutes with the patch vs. 30+ minutes without the patch.

Steve Ellcey
[hidden email]


gas/ChangeLog

2005-06-15  Steve Ellcey  <[hidden email]>

        * config/tc-hppa.c (pa_block): Use bigger blocks to write zero.


*** src.orig/gas/config/tc-hppa.c Wed Jun 15 11:52:46 2005
--- src/gas/config/tc-hppa.c Wed Jun 15 13:40:08 2005
*************** pa_align (bytes)
*** 5933,5938 ****
--- 5933,5940 ----
 
  /* Handle a .BLOCK type pseudo-op.  */
 
+ #define BFRAG_SIZE (1024*1024)
+
  static void
  pa_block (z)
       int z ATTRIBUTE_UNUSED;
*************** pa_block (z)
*** 5954,5961 ****
    else
      {
        /* Always fill with zeros, that's what the HP assembler does.  */
!       char *p = frag_var (rs_fill, 1, 1, 0, NULL, temp_size, NULL);
!       *p = 0;
      }
 
    pa_undefine_label ();
--- 5956,5967 ----
    else
      {
        /* Always fill with zeros, that's what the HP assembler does.  */
!       int bcount = temp_size / BFRAG_SIZE;
!       int extra = temp_size - (bcount * BFRAG_SIZE);
!       char *p = frag_var (rs_fill, BFRAG_SIZE, BFRAG_SIZE, 0, NULL, bcount, NULL);
!       memset (p, 0, BFRAG_SIZE);
!       p = frag_var (rs_fill, extra, extra, 0, NULL, 1, NULL);
!       memset (p, 0, extra);
      }
 
    pa_undefine_label ();

Re: hppa build broken

John David Anglin-4
> How about this patch to pa_block.  It seems to work well for me
> and causes no regressions.  I got an XPASS on hppa/parse/block1.s.  The
> bigger I make BFRAG_SIZE, the faster it runs, but the more memory it
> takes.  On my HP-UX PA workstation I could run block1.s in about 2.5
> minutes with the patch vs. 30+ minutes without the patch.

That's definitely a nice improvement in speed.  I think allocating
1 MB is ok on today's machines.

> !       int bcount = temp_size / BFRAG_SIZE;
> !       int extra = temp_size - (bcount * BFRAG_SIZE);
> !       char *p = frag_var (rs_fill, BFRAG_SIZE, BFRAG_SIZE, 0, NULL, bcount, NULL);
> !       memset (p, 0, BFRAG_SIZE);
> !       p = frag_var (rs_fill, extra, extra, 0, NULL, 1, NULL);
> !       memset (p, 0, extra);

I'm not convinced that this is correct.  Both bcount and extra could
be zero.  0 is the default value for .block.  Thus, I don't think
the first frag_var should be created when bcount is zero.  Also, bcount
and extra probably should be unsigned int's.
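
The edge cases are easy to see once the split is isolated. The sketch below uses unsigned arithmetic as suggested; `struct block_split` and `split_block` are hypothetical names, not part of tc-hppa.c.

```c
#define BFRAG_SIZE (1024 * 1024)

/* Hypothetical helper type: result of splitting a .block size.  */
struct block_split
{
  unsigned int bcount;  /* number of full BFRAG_SIZE chunks */
  unsigned int extra;   /* leftover bytes, always < BFRAG_SIZE */
};

/* Split temp_size into full chunks plus a remainder.  */
static struct block_split
split_block (unsigned int temp_size)
{
  struct block_split s;

  s.bcount = temp_size / BFRAG_SIZE;
  s.extra = temp_size % BFRAG_SIZE;
  return s;
}
```

`split_block (0)` gives zero for both fields, which is the `.block 0` case in question, while `split_block (0x3fffffff)` gives 1023 full chunks and 1048575 extra bytes.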

Dave
--
J. David Anglin                                  [hidden email]
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)

Re: hppa build broken

Steve Ellcey
> I'm not convinced that this is correct.  Both bcount and extra could
> be zero.  0 is the default value for .block.  Thus, I don't think
> the first frag_var should be created when bcount is zero.  Also, bcount
> and extra probably should be unsigned int's.

Hm, didn't think about a size of zero.  Here is an updated patch.  I am
not sure if I need to call frag_var at all if I have a temp_size of
zero, but I put it in.  The regression tests worked with or without the
check for "(bcount == 0 && extra == 0)" but it seemed safer to put it
in and duplicate the existing behaviour.

Steve Ellcey
[hidden email]


gas/ChangeLog

2005-06-15  Steve Ellcey  <[hidden email]>

        * config/tc-hppa.c (pa_block): Use bigger blocks to write zero.


*** src.orig/gas/config/tc-hppa.c Wed Jun 15 11:52:46 2005
--- src/gas/config/tc-hppa.c Wed Jun 15 15:43:44 2005
*************** pa_align (bytes)
*** 5933,5938 ****
--- 5933,5940 ----
 
  /* Handle a .BLOCK type pseudo-op.  */
 
+ #define BFRAG_SIZE (1024*1024)
+
  static void
  pa_block (z)
       int z ATTRIBUTE_UNUSED;
*************** pa_block (z)
*** 5954,5961 ****
    else
      {
        /* Always fill with zeros, that's what the HP assembler does.  */
!       char *p = frag_var (rs_fill, 1, 1, 0, NULL, temp_size, NULL);
!       *p = 0;
      }
 
    pa_undefine_label ();
--- 5956,5978 ----
    else
      {
        /* Always fill with zeros, that's what the HP assembler does.  */
!       unsigned int bcount = temp_size / BFRAG_SIZE;
!       unsigned int extra = temp_size - (bcount * BFRAG_SIZE);
!       if (bcount > 0)
!         {
!           char *p = frag_var (rs_fill, BFRAG_SIZE, BFRAG_SIZE, 0, NULL, bcount, NULL);
!           memset (p, 0, BFRAG_SIZE);
!         }
!       if (extra > 0)
!         {
!           char *p = frag_var (rs_fill, extra, extra, 0, NULL, 1, NULL);
!           memset (p, 0, extra);
!         }
!       if (bcount == 0 && extra == 0)
!         {
!           char *p = frag_var (rs_fill, 1, 1, 0, NULL, 0, NULL);
!           memset (p, 0, 1);
!         }
      }
 
    pa_undefine_label ();

Re: hppa build broken

John David Anglin-4
> !       if (extra > 0)

I believe that the above can be changed to

        if (extra > 0 || bcount == 0)

> !         {
> !           char *p = frag_var (rs_fill, extra, extra, 0, NULL, 1, NULL);
> !           memset (p, 0, extra);
> !         }

and the following code deleted.

> !       if (bcount == 0 && extra == 0)
> !         {
> !           char *p = frag_var (rs_fill, 1, 1, 0, NULL, 0, NULL);
> !           memset (p, 0, 1);
> !         }

It's ok to call memset with extra == 0.  This should duplicate the
behavior prior to Alan's change.  Need to check that

        .block 0

actually produces a 0 sized block.  I'll try this when I get home
tonight.
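
A rough way to check that the simplification keeps the output size unchanged is to model each fill frag as contributing `var * offset` bytes. That model is an assumption about how an `rs_fill` frag is sized, and `fill_bytes`, `total_three_branch`, and `total_two_branch` are hypothetical names used only in this sketch.

```c
#define BFRAG_SIZE (1024 * 1024)

/* Assumed model of an rs_fill frag: a pattern of var bytes repeated
   offset times, contributing var * offset bytes of output.  */
static unsigned long long
fill_bytes (unsigned int var, unsigned int offset)
{
  return (unsigned long long) var * offset;
}

/* Total bytes emitted by the version with a separate zero-size branch.  */
static unsigned long long
total_three_branch (unsigned int temp_size)
{
  unsigned int bcount = temp_size / BFRAG_SIZE;
  unsigned int extra = temp_size - (bcount * BFRAG_SIZE);
  unsigned long long total = 0;

  if (bcount > 0)
    total += fill_bytes (BFRAG_SIZE, bcount);
  if (extra > 0)
    total += fill_bytes (extra, 1);
  if (bcount == 0 && extra == 0)
    total += fill_bytes (1, 0);
  return total;
}

/* Total bytes emitted with the condition folded into the second branch.  */
static unsigned long long
total_two_branch (unsigned int temp_size)
{
  unsigned int bcount = temp_size / BFRAG_SIZE;
  unsigned int extra = temp_size - (bcount * BFRAG_SIZE);
  unsigned long long total = 0;

  if (bcount > 0)
    total += fill_bytes (BFRAG_SIZE, bcount);
  if (extra > 0 || bcount == 0)
    total += fill_bytes (extra, 1);
  return total;
}
```

Under this model both functions return `temp_size` for every input, including zero, so the two forms differ only in which frag call represents the empty block.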

Dave
--
J. David Anglin                                  [hidden email]
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)

Re: hppa build broken

John David Anglin-4
In reply to this post by Steve Ellcey
Steve,

> Hm, didn't think about a size of zero.  Here is an updated patch.  I am
> not sure if I need to call frag_var at all if I have a temp_size of
> zero, but I put it in.  The regression tests worked with or without the
> check for "(bcount == 0 && extra == 0)" but it seemed safer to put it
> in and duplicate the existing behaviour.

Does this help with performance on your workstation?  I've checked
that it does the right thing but it doesn't seem to help with
performance on my A550.  I'm updating the installed SCSI patches
but I'm not too hopeful that it will help.  I tried two different
drives using different controllers.

Dave
--
J. David Anglin                                  [hidden email]
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)

Index: config/tc-hppa.c
===================================================================
RCS file: /cvs/src/src/gas/config/tc-hppa.c,v
retrieving revision 1.121
diff -u -3 -p -r1.121 tc-hppa.c
--- config/tc-hppa.c 10 Jun 2005 05:46:48 -0000 1.121
+++ config/tc-hppa.c 16 Jun 2005 02:14:47 -0000
@@ -5933,6 +5933,8 @@ pa_align (bytes)
 
 /* Handle a .BLOCK type pseudo-op.  */
 
+#define BFRAG_SIZE (1024*1024)
+
 static void
 pa_block (z)
      int z ATTRIBUTE_UNUSED;
@@ -5953,9 +5955,20 @@ pa_block (z)
     }
   else
     {
+      unsigned int bcount = temp_size / BFRAG_SIZE;
+      unsigned int extra = temp_size - (bcount * BFRAG_SIZE);
+
       /* Always fill with zeros, that's what the HP assembler does.  */
-      char *p = frag_var (rs_fill, 1, 1, 0, NULL, temp_size, NULL);
-      *p = 0;
+      if (bcount > 0)
+        {
+          char *p = frag_var (rs_fill, BFRAG_SIZE, BFRAG_SIZE, 0, NULL, bcount, NULL);
+          memset (p, 0, BFRAG_SIZE);
+        }
+      if (extra > 0 || bcount == 0)
+        {
+          char *p = frag_var (rs_fill, extra, extra, 0, NULL, 1, NULL);
+          memset (p, 0, extra);
+        }
     }
 
   pa_undefine_label ();

Re: hppa build broken

John David Anglin-4
In reply to this post by Steve Ellcey
> Does this help with performance on your workstation?  I've checked
> that it does the right thing but it doesn't seem to help with
> performance on my A550.

There are still problems with the algorithm.  The enclosed little
test program writes junk in just over a minute:

real    1m3.716s
user    0m0.010s
sys     0m6.450s

Dave
--
J. David Anglin                                  [hidden email]
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)

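#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#define FRAG_SIZE (1024 * 1024)
int
main (void)
{
  char buf[FRAG_SIZE];
  int fd, i;

  /* O_CREAT requires a mode argument.  */
  fd = open ("junk", O_WRONLY|O_CREAT, 0644);
  if (fd == -1)
    abort ();
  memset (buf, 0, FRAG_SIZE);
  /* Write 1024 chunks of 1 MB each, 1 GB in total.  */
  for (i = 1024; i; i--)
    {
      size_t cnt = FRAG_SIZE;
      ssize_t n;
      char *p = buf;

      /* write may be partial, so loop until the whole chunk is out.  */
      while (cnt)
        {
          n = write (fd, p, cnt);
          if (n == -1)
            abort ();
          cnt -= n;
          p += n;
        }
    }
  close (fd);
  return 0;
}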

Re: hppa build broken

Steve Ellcey
In reply to this post by John David Anglin-4
> Does this help with performance on your workstation?  I've checked
> that it does the right thing but it doesn't seem to help with
> performance on my A550.  I'm updating the installed SCSI patches
> but I'm not too hopeful that it will help.  I tried two different
> drives using different controllers.

Hm, I am surprised you are not seeing a performance improvement.
I have been testing with the test case:

        .data
bar:
        .block 0x3fffffff

On a 9000/785 workstation running 11.00.  I see a large improvement in
the time the assembler takes to run (2 to 3 minutes with my patch).

I no longer think my patch is the way to fix this though, I think Alan's
change is the right thing to do in pa_block() and that this problem
should be fixed in gas/write.c.  In write_contents(), there is code that
uses a buffer (char buf[256]) to write out data.  This means that our
zeros are being written out 256 bytes at a time.  I increased 256 to 1M
(1048576) and got the same performance improvement that I had with my
earlier patch.
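
To illustrate the effect of the staging buffer size, here is a standalone sketch that writes a run of zeros through a buffer of a chosen size and counts how many output calls that takes. It is not the code from gas/write.c: `write_zeros` is a hypothetical helper, and it counts `fwrite` calls rather than system calls, since stdio adds its own buffering underneath.

```c
#include <stdio.h>
#include <stdlib.h>

/* Write total zero bytes to fp using a zeroed staging buffer of
   bufsize bytes.  Returns the number of fwrite calls made, or -1 on
   error.  */
static long
write_zeros (FILE *fp, size_t total, size_t bufsize)
{
  char *buf;
  long calls = 0;

  if (bufsize == 0)
    return -1;
  buf = calloc (1, bufsize);
  if (buf == NULL)
    return -1;
  while (total > 0)
    {
      /* The last chunk may be shorter than the buffer.  */
      size_t n = total < bufsize ? total : bufsize;

      if (fwrite (buf, 1, n, fp) != n)
        {
          free (buf);
          return -1;
        }
      total -= n;
      calls++;
    }
  free (buf);
  return calls;
}
```

For the 0x3fffffff-byte test case, a 256-byte buffer means 4194304 calls, while a 1 MB buffer needs only 1024.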

Steve Ellcey
[hidden email]