Rd: [RFC] dl-procinfo and HWCAP_IMPORTANT support for powerpc

Rd: [RFC] dl-procinfo and HWCAP_IMPORTANT support for powerpc

Brian Grayson-2
Steve Munroe wrote:

> Christoph Hellwig <[hidden email]> wrote on 12/15/2005 06:36:01 AM:
>
> > > We have made a deliberate decision not to identify older processor
> > > generations (POWER3, G4). They are no longer in production and/or are
> > > adequately covered by the current glibc implementation. They will continue
> > > to be supported by current base/default implementation (as defined by gcc
> > > options -mcpu=powerpc for 32-bit and -mcpu=powerpc64 for 64-bit).
> >
> > That's not true.  G4, G3 and even G2-based microarchitectures are still
> > in production at motorola.  Not to mention things like the 8xx.
>
> Did you miss the "or are adequately covered by the current glibc
> implementation" part?
>
> Note: I did not say they are not supported, I did say that I (personally)
> was not going to do anything extra for them. I have my hands full keeping
> on top of the current IBM products (along with all my other jobs).
>
> But you should be supporting my efforts on this and the related --with-cpu
> support. I have done all the heavy lifting to enable the general
> mechanism. If you are supportive of this effort, then please say so.

  First of all, yes, we appreciate what you've done!

> If you are vitally interested in supporting G2, G3, G4 then you will need
> to:

  Freescale (formerly Motorola SPS) is interested in supporting
all of these and more.  The e200, e300 (G2), e500, and e600
(G4) cores are all supported by gcc's -mcpu= options
(e.g., -mcpu=8540), so I think the main thing is to make sure
that the proper bits are allocated now.  e200 has a feature
that would require another HWCAP bit (compressed code, aka
"VLE"), for example.

  By the way, putting on my architecture hat, I noticed that
the current allocation is all in-order from MSB down, going
from computation mode (32/64) to ISA features (601 down to
EFP-double) to system features (unified cache and no timebase),
to chip microarch, with no breaks for future additions.  Would
it be feasible to provide a break between NO_TB and POWER4
(perhaps number the uarchs from LSB up?), so that if a new ISA
extension is done, the bit encoding makes sense?  Something
like "isel" seems worthy of addition, since that is
UISA-visible and can have substantial performance impact.

  Also, is the 32-bit mode "true" 32-bit mode, and not the 970
32-bit mode?  That is, is a binary marked with PPC_FEATURE_32
_guaranteed_ to run on a 601 without any program exceptions?  We
have heard rumors that AIX and Apple, even in 32-bit mode, take
advantage of the fact that 64-bit instructions can still be
executed on a 970 in 32-bit mode.

> 1) Negotiate with Paul Mackerras to get bits allocated in AT_HWCAP from
> the kernel.

  We will work on doing this.

> 2) Verify that gcc has the -mcpu= support you need. If not you need to
> negotiate with David Edelsohn to get any -mcpu= support and __ARCH_
> macro's you might need.

  Done, or in progress, for all of our production parts (e200,
e300, e500, e600).

> 3) Provide detail analysis and justification of why the (widely used)
> processor/chip that you want to support is not well served by the current
> (default) implementation.

  We have seen some significant speedups on certain benchmarks
(including SPEC) using some of the optimized libraries
internally.  If deeper justification is required, we can work on that.

> 4) Provide optimized implementations of high use functions where needed.

  Some of this has already been done (libcfsl_e500,
libmotovec), and more will be done in the future.

> Then we can work together to update dl-procinfo to support what you come
> up with.

  Thanks for working on this.

  Brian Grayson
--
Brian Grayson, SysPerf (System Performance, Modeling, and Simulation)
[hidden email]
Somerset Design Center
Freescale Semiconductor
Austin, TX

Re: Rd: [RFC] dl-procinfo and HWCAP_IMPORTANT support for powerpc

Steve Munroe
Brian Grayson <[hidden email]> wrote on 12/21/2005 04:26:25 PM:

> Steve Munroe wrote:
> > Christoph Hellwig <[hidden email]> wrote on 12/15/2005 06:36:01 AM:
> >
> > > > We have made a deliberate decision not to identify older processor
> > > > generations (POWER3, G4). They are no longer in production and/or are
> > > > adequately covered by the current glibc implementation. They will continue
> > > > to be supported by current base/default implementation (as defined by gcc
> > > > options -mcpu=powerpc for 32-bit and -mcpu=powerpc64 for 64-bit).
> > >
> > > That's not true.  G4, G3 and even G2-based microarchitectures are still
> > > in production at motorola.  Not to mention things like the 8xx.
> >
> > Did you miss the "or are adequately covered by the current glibc
> > implementation" part?
> >
> > Note: I did not say they are not supported, I did say that I (personally)
> > was not going to do anything extra for them. I have my hands full keeping
> > on top of the current IBM products (along with all my other jobs).
> >
> > But you should be supporting my efforts on this and the related --with-cpu
> > support. I have done all the heavy lifting to enable the general
> > mechanism. If you are supportive of this effort, then please say so.
>
>   First of all, yes, we appreciate what you've done!
>
> > If you are vitally interested in supporting G2, G3, G4 then you will need
> > to:
>
>   Freescale (formerly Motorola SPS) is interested in supporting
> all of these and more.  The e200, e300 (G2), e500, and e600
> (G4) cores are all supported by gcc's -mcpu= options
> (e.g., -mcpu=8540), so I think the main thing is to make sure
> that the proper bits are allocated now.  e200 has a feature
> that would require another HWCAP bit (compressed code, aka
> "VLE"), for example.
>
Whoa, let's not throw in the kitchen sink with this. First, gcc -mcpu support
is necessary but not sufficient for mainline glibc support. Also, we need to
distinguish between building glibc (which only involves --with-cpu) and
selecting from multiple versions of libc.so at runtime (which does require
AT_HWCAP bits).

You don't need HWCAP_IMPORTANT support to build an SDK for an embedded
processor. You can use --with-cpu= to select any chip optimization you have
written and implicitly assert -mcpu= for the build. Plus it is a separate
negotiation with the glibc maintainers for what code is included in
mainline, ports, or elsewhere.

HWCAP_IMPORTANT support and AT_HWCAP bits are required to allow distros to
ship fat rpms which include more than one copy of libc.so (libm, librt,
libpthread...) and select the best match to the processor chip at run time.
So this feature is specific to building a single binary distribution for
general use (i.e., not embedded).

So adding popular desktop and laptop cpu_types (G4, G3) is reasonable,
because this may be of general interest to distros. But naming every chip
in the inventory is not. There is a limit to the number of variants that
anyone is willing to build ... and that is also a separate negotiation.

Note: I did not ask for AT_HWCAP bits for power, power2, power3 etc. I am
focused on current and future production starting with PowerPC Architecture
Version 2.0 (power4).

Finally, BOOKe processors are not covered by this. I understand that the
common intersection of BOOKe and PowerPC (powerpc32-nofpu) is usable in
restricted circumstances, but the full BOOKe (e500) ISA is not compatible
with the PowerPC ISA. So BOOKe requires a separate ABI and a separate libc
port. That is beyond the scope of this proposal.

>   By the way, putting on my architecture hat, I noticed that
> the current allocation is all in-order from MSB down, going
> from computation mode (32/64) to ISA features (601 down to
> EFP-double) to system features (unified cache and no timebase),
> to chip microarch, with no breaks for future additions.  Would
> it be feasible to provide a break between NO_TB and POWER4
> (perhaps number the uarchs from LSB up?), so that if a new ISA
> extension is done, the bit encoding makes sense?  Something
> like "isel" seems worthy of addition, since that is
> UISA-visible and can have substantial performance impact.
>
It might be too late for that, as the change is already upstream in 2.6.15.

>   Also, is the 32-bit mode "true" 32-bit mode, and not the 970
> 32-bit mode?  That is, is a binary marked with PPC_FEATURE_32
> _guaranteed_ to run on a 601 without any program exceptions?  We
> have heard rumors that AIX and Apple, even in 32-bit mode, take
> advantage of the fact that 64-bit instructions can still be
> executed on a 970 in 32-bit mode.
>
No. PPC_FEATURE_32 is set for any chip that supports at least -mcpu=common.
This includes 601 and all 64-bit implementations. 601 is a special case
because it is not compliant with the full PowerPC Architecture Version 1.0
spec (it has deviations, like not supporting the timebase). So the kernel
sets both PPC_FEATURE_32 and PPC_FEATURE_601_INST for this case. The kernel
sets both PPC_FEATURE_32 and PPC_FEATURE_64 for 64-bit implementations.
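
The cases described above can be sketched as a small decoder. This is only an illustration, not kernel or glibc code: the bit values are assumptions recalled from the powerpc kernel headers of that era, the flag names follow the spelling used in this message, and `describe_mode` is a hypothetical helper.

```python
# Bit values are assumptions recalled from the powerpc kernel headers of the
# 2.6.15 era; the names follow the spelling used in this thread.
PPC_FEATURE_32 = 0x80000000
PPC_FEATURE_64 = 0x40000000
PPC_FEATURE_601_INST = 0x20000000


def describe_mode(hwcap):
    """Classify an AT_HWCAP word according to the cases described above."""
    if not hwcap & PPC_FEATURE_32:
        return "no 32-bit PowerPC support reported"
    if hwcap & PPC_FEATURE_64:
        # 64-bit implementations report both the 32 and the 64 bit.
        return "64-bit implementation (also runs 32-bit code)"
    if hwcap & PPC_FEATURE_601_INST:
        # The 601 reports the 32 bit plus its own marker bit.
        return "601 (32-bit, with deviations from the architecture)"
    return "32-bit implementation"
```

Under these assumptions, PPC_FEATURE_32 alone says nothing about whether the chip is a 601 or a 970; a loader has to look at the other bits to tell them apart.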

> > 1) Negotiate with Paul Mackerras to get bits allocated in AT_HWCAP from
> > the kernel.
>
>   We will work on doing this.
>
> > 2) Verify that gcc has the -mcpu= support you need. If not you need to
> > negotiate with David Edelsohn to get any -mcpu= support and __ARCH_
> > macro's you might need.
>
>   Done, or in progress, for all of our production parts (e200,
> e300, e500, e600).
>
See comment above about BOOKe ...

> > 3) Provide detail analysis and justification of why the (widely used)
> > processor/chip that you want to support is not well served by the current
> > (default) implementation.
>
>   We have seen some significant speedups on certain benchmarks
> (including SPEC) using some of the optimized libraries
> internally.  If deeper justification is required, we can work on that.
>
> > 4) Provide optimized implementations of high use functions where needed.
>
>   Some of this has already been done (libcfsl_e500,
> libmotovec), and more will be done in the future.
>
This proposal is about providing optimized implementations of the existing
libc API and library set.  This implies signing the copyrights over to FSF
and porting appropriate code into the glibc (and power-cpu add-on)
framework.

However, arbitrary and chip-specific expansion of the API will likely be
rejected by the maintainers. Any expansion of the API will require
separate negotiation with the maintainers and community.

> > Then we can work together to update dl-procinfo to support what you come
> > up with.
>
>   Thanks for working on this.
>

You are welcome!

Steven J. Munroe
Linux on Power Toolchain Architect
IBM Corporation, Linux Technology Center


Re: Rd: [RFC] dl-procinfo and HWCAP_IMPORTANT support for powerpc

Benjamin Herrenschmidt
In reply to this post by Brian Grayson-2

>   By the way, putting on my architecture hat, I noticed that
> the current allocation is all in-order from MSB down, going
> from computation mode (32/64) to ISA features (601 down to
> EFP-double) to system features (unified cache and no timebase),
> to chip microarch, with no breaks for future additions.  Would
> it be feasible to provide a break between NO_TB and POWER4
> (perhaps number the uarchs from LSB up?), so that if a new ISA
> extension is done, the bit encoding makes sense?  Something
> like "isel" seems worthy of addition, since that is
> UISA-visible and can have substantial performance impact.

I think it was a mistake to add the microarch like POWER4 in there. It
should have been a separate entity, possibly the ELF_PLATFORM string.
The change was done in a rush but not properly thought out (and that's
partially my fault too).

>   Also, is the 32-bit mode "true" 32-bit mode, and not the 970
> 32-bit mode?  That is, is a binary marked with PPC_FEATURE_32
> _guaranteed_ to run on a 601 without any program exceptions?  We
> have heard rumors that AIX and Apple, even in 32-bit mode, take
> advantage of the fact that 64-bit instructions can still be
> executed on a 970 in 32-bit mode.

They aren't supposed to do so unless they know for sure the processor
can (via the microarch for example). Also, doing so on Linux is risky as
currently, signals don't save/restore the top 32 bits of registers for
32-bit processes.

> > 1) Negotiate with Paul Mackerras to get bits allocated in AT_HWCAP from
> > the kernel.
>
>   We will work on doing this.

As I wrote above, I think it's the wrong approach, we'll quickly run out
of bits if we continue putting the microarchitecture there, especially
with all the embedded ones lurking at the door.

We need, I think, to come up with a consistent naming scheme for
ELF_PLATFORM instead, possibly a doublet so we can separate the
processor architecture/family and the actual processor model if we feel
that is necessary.
 

> > 2) Verify that gcc has the -mcpu= support you need. If not you need to
> > negotiate with David Edelsohn to get any -mcpu= support and __ARCH_
> > macro's you might need.
>
>   Done, or in progress, for all of our production parts (e200,
> e300, e500, e600).
>
> > 3) Provide detail analysis and justification of why the (widely used)
> > processor/chip that you want to support is not well served by the current
> > (default) implementation.
>
>   We have seen some significant speedups on certain benchmarks
> (including SPEC) using some of the optimized libraries
> internally.  If deeper justification is required, we can work on that.
>
> > 4) Provide optimized implementations of high use functions where needed.
>
>   Some of this has already been done (libcfsl_e500,
> libmotovec), and more will be done in the future.

In addition, it might be worth using the vdso for some very
processor-specific things, thus getting the benefit automatically without
requiring a different libc. I've been pondering putting implementations
of memcpy and spinlocks in there... food for thought at this point
though.

Ben.



Re: Rd: [RFC] dl-procinfo and HWCAP_IMPORTANT support for powerpc

Tom Gall

Food for thought ... but if there are good reasons for having perfect
detail as to the processor one is on, why not just send on the PVR and be done
with it?

On Fri, 23 Dec 2005, Benjamin Herrenschmidt wrote:
 

> >   By the way, putting on my architecture hat, I noticed that
> > the current allocation is all in-order from MSB down, going
> > from computation mode (32/64) to ISA features (601 down to
> > EFP-double) to system features (unified cache and no timebase),
> > to chip microarch, with no breaks for future additions.  Would
> > it be feasible to provide a break between NO_TB and POWER4
> > (perhaps number the uarchs from LSB up?), so that if a new ISA
> > extension is done, the bit encoding makes sense?  Something
> > like "isel" seems worthy of addition, since that is
> > UISA-visible and can have substantial performance impact.
>
> I think it was a mistake to add the microarch like POWER4 in there. It
> should have been a separate entity, possibly the ELF_PLATFORM string.
> The change was done in a rush but not properly thought out (and that's
> partially my fault too).
>

Perhaps we ought to compile a list of the performance impacts one might
want to consider, so we can keep discussing what might (and might not) be
reason enough for inclusion, and make sure that whatever design we end up
with is going to fit.

<big snip>
 

> > > 4) Provide optimized implementations of high use functions where needed.
> >
> >   Some of this has already been done (libcfsl_e500,
> > libmotovec), and more will be done in the future.
>
> In addition, it might be worth using the vdso for some very processor
> specific things, thus getting the benefit automatically without
> requiring a different libc. I've been pondering putting implementations
> of memcpy and spinlocks in there... food for thoughts at this point
> though.

Personally I think this is good food for thought....  sorta dark chocolate
in the cupboard ... yearning to be eaten ... :-)
 
Regards.

Tom

Re: Rd: [RFC] dl-procinfo and HWCAP_IMPORTANT support for powerpc

Benjamin Herrenschmidt
On Thu, 2005-12-22 at 20:13 -0600, Tom Gall wrote:
> Food for thought .. but if there are good reasons for having perfect
> detail as to the processor one is on, why not just send on the PVR and be done
> with it.

Because you may not need the actual revision :) Anyway, the idea of
using a doublet is just something that came to mind while I was typing,
not a properly thought out thing. I still think the micro architecture
should be out of the HWCAP or we'll just overflow the field.

> > I think it was a mistake to add the microarch like POWER4 in there. It
> > should have been a separate entity, possibly the ELF_PLATFORM string.
> > The change was done in a rush but not properly thought out (and that's
> > partially my fault too).
> >
>
> Perhaps we ought to compile the performance impacts one might want to
> consider so we can continue discussions on what all might (and might not)
> be reason to make sure that whatever design we end up with is going to
> fit.

Yes, well, my main worry is running out of HWCAP bits very soon if we
add microarchitectures there. Especially if we start having the embedded
stuff in.

> > In addition, it might be worth using the vdso for some very processor
> > specific things, thus getting the benefit automatically without
> > requiring a different libc. I've been pondering putting implementations
> > of memcpy and spinlocks in there... food for thoughts at this point
> > though.
>
> Personally I think this is good food for thought....  sorta dark chocolate
> in the cubboard ... yearning to be eaten ... :-)

Yah, good idea :)

Ben.



Re: Rd: [RFC] dl-procinfo and HWCAP_IMPORTANT support for powerpc

Steve Munroe


Benjamin Herrenschmidt <[hidden email]> wrote on 12/22/2005 08:06:07 PM:

> On Thu, 2005-12-22 at 20:13 -0600, Tom Gall wrote:
> > Food for thought .. but if there are good reasons for having perfect
> > detail as to the processor one is on, why not just send on the PVR and be done
> > with it.
>
> Because you may not need the actual revision :) Anyway, the idea of
> using a doublet is just something that came to mind while I was typing,
> not a properly thought out thing. I still think the micro architecture
> should be out of the HWCAP or we'll just overflow the field.
>
A doublet will not work for the intended purpose. The dl-procinfo requires
a simple string that can be inserted into the library search path. Also by
prior agreement these strings must match the <cpu_type> strings used by gcc
for --with-cpu= and -mcpu=.

If you want to pass the full detail of PVR we should define a different Aux
Vector entry. We have several requests to get at the PVR from large
MiddleWare applications, so we should provide it, but separate from
AT_PLATFORM.
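
For reference, a process on Linux can inspect its own auxiliary vector by reading `/proc/self/auxv`, which is where a separate PVR entry would show up if one were defined. A minimal sketch, assuming native-word (key, value) pairs and the `AT_HWCAP` and `AT_PLATFORM` numbers from `elf.h`; `parse_auxv` and `read_hwcap` are hypothetical helpers, and no PVR entry is assumed to exist:

```python
import struct

# Aux vector keys as defined in elf.h.
AT_NULL = 0
AT_PLATFORM = 15
AT_HWCAP = 16


def parse_auxv(raw, fmt="@LL"):
    """Decode raw aux vector bytes into a {key: value} dict.

    Assumes each entry is a pair of native unsigned longs and that the
    buffer length is a whole number of entries.
    """
    entries = {}
    for key, value in struct.iter_unpack(fmt, raw):
        if key == AT_NULL:  # terminator entry
            break
        entries[key] = value
    return entries


def read_hwcap(path="/proc/self/auxv"):
    """Return this process's AT_HWCAP word, or None if it is absent."""
    with open(path, "rb") as f:
        return parse_auxv(f.read()).get(AT_HWCAP)
```

Note that the AT_PLATFORM value is a pointer to a string in the process's address space, so only numeric entries such as AT_HWCAP are directly usable from a dump like this.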

> > > I think it was a mistake to add the microarch like POWER4 in there. It
> > > should have been a separate entity, possibly the ELF_PLATFORM string.
> > > The change was done in a rush but not properly thought out (and that's
> > > partially my fault too).
> > >
> >
> > Perhaps we ought to compile the performance impacts one might want to
> > consider so we can continue discussions on what all might (and might not)
> > be reason to make sure that whatever design we end up with is going to
> > fit.
>
> Yes, well, my main worry is running out of HWCAP bits very soon if we
> add microarchitectures there. Especially if we start having the embedded
> stuff in.
>
I think the embedded guys are under the misimpression that they need this
(dl_procinfo) support. They don't.

The dl_procinfo mechanism is intended for "retail" distributions for
"general purpose" computers, like Apple G3 vs G4 vs G5. In the Intel space
they only support 2 values, "i686" and "x86_64". There is a practical limit
(2, 3, maybe 4) to the number of <cpu_types> a distro is willing to
build/test. For power we are starting with power4 because it is the
first implementation of the Version 2.0 PowerPC Architecture (and the
Weakly Consistent Storage Model).

If you are building an SDK or runtime for an embedded processor you can use
--with-cpu= to build an optimized version of glibc.

> > > In addition, it might be worth using the vdso for some very processor
> > > specific things, thus getting the benefit automatically without
> > > requiring a different libc. I've been pondering putting implementations
> > > of memcpy and spinlocks in there... food for thoughts at this point
> > > though.
> >

Steven J. Munroe
Linux on Power Toolchain Architect
IBM Corporation, Linux Technology Center



Re: Rd: [RFC] dl-procinfo and HWCAP_IMPORTANT support for powerpc

Benjamin Herrenschmidt

> The dl_procinfo mechanism is intended to for "retail" distributions for
> "general purpose" computers. Like Apple G3 vs G4 vs G5. In the Intel space
> they only support 2 values "i686" and "x86_64". There is practical limit
> (2, 3, maybe 4) to the number of <cpu_types> a distro is willing to
> build/test. For power we are starting with power4 because that it is the
> first implementation of the Version 2.0 PowerPC Architecture (and the
> Weakly Consistent Storage Model).
>
> If you are building a SDK or runtime for an embedded processor you can use
> --with-cpu= to build a optimized version of glibc.

Even then, ignoring embedded if you think that's a good idea (I know
some embedded folks who will not agree here), I still think we are
asking for trouble around the corner. I mean, you know how many
micro-architectures we have, we should probably at least add the
commonly used freescale ones (g4 typically), then there is cell, and
things are still evolving.

I really think it's not a good idea to mix the actual microarchitecture
of the processor with the feature bits.




Re: Rd: [RFC] dl-procinfo and HWCAP_IMPORTANT support for powerpc

Steve Munroe

Benjamin Herrenschmidt <[hidden email]> wrote on 12/24/2005 12:47:22 AM:

>
> > The dl_procinfo mechanism is intended to for "retail" distributions for
> > "general purpose" computers. Like Apple G3 vs G4 vs G5. In the Intel space
> > they only support 2 values "i686" and "x86_64". There is practical limit
> > (2, 3, maybe 4) to the number of <cpu_types> a distro is willing to
> > build/test. For power we are starting with power4 because that it is the
> > first implementation of the Version 2.0 PowerPC Architecture (and the
> > Weakly Consistent Storage Model).
> >
> > If you are building a SDK or runtime for an embedded processor you can use
> > --with-cpu= to build a optimized version of glibc.
>
> Even then, ignoring embedded if you think that's a good idea (I know
> some embedded folks who will not agree here), I still think we are
> asking for trouble around the corner. I mean, you know how many
> micro-architectures we have, we should probably at least add the
> commonly used freescale ones (g4 typically), then there is cell, and
> things are still evolving.
>

I am NOT ignoring embedded. The dl-procinfo mechanism is just part of a
comprehensive proposal which starts with the --with-cpu= support. This
(--with-cpu=) mechanism can support an arbitrary number of <cpu_type>
implementations.

The dl-procinfo mechanism supports a specific delivery model (fat rpms with
multiple implementations of libraries). This mechanism is about "delivery
to the customer" associated with general purpose distros. So the Apple G3,
G4's ... should be added to the dl-procinfo mechanism.

However, it is my understanding that embedded processors don't (can't) use
the general purpose distros. An embedded Linux SDK can be very specific to
the chip and doesn't need the fat rpm mechanism. So IMHO the dl-procinfo
mechanism would never be used in practice for the embedded space (they will
build the single libc.so they need and ship that with the SDK).

So the key to resolving this issue is to understand how the embedded folks
deliver the SDK to their customers. If I am misinformed, then someone who
actually works on Linux for the embedded space should speak up.

> I really think it's not a good idea to mix the actual microarchitecture
> of the processor with the feature bits.
>

Fine. I asked for the AT_HWCAP bits because it was implemented, AT_PLATFORM
was not. It was just easier to extend an existing mechanism than define a
new one. Also in the current dl-procinfo mechanism, AT_HWCAP allows for
multiple modifiers (like ALTIVEC combined with POWER4 or G4). AT_PLATFORM
does not. This allows for "commoning up" based on ISA features independent
of microarch.

Steven J. Munroe
Linux on Power Toolchain Architect
IBM Corporation, Linux Technology Center


Re: Rd: [RFC] dl-procinfo and HWCAP_IMPORTANT support for powerpc

Benjamin Herrenschmidt
On Mon, 2005-12-26 at 10:09 -0600, Steve Munroe wrote:

> I am NOT ignoring embedded. The dl-procinfo mechanism is just part of a
> comprehensive proposal which starts will the --with-cpu= support. This
> (--with-cpu=) mechanism can support an arbitrary number of <cpu_type>
> implementations.
>
> The dl-procinfo mechanism supports a specific delivery model (fat rpms with
> multiple implementations of libraries). This mechanism is about "delivery
> to the customer" associated with general purpose distro's. So the Apple G3,
> G4's ... should be added to the dl-procinfo mechanism.

 .../...

Heh, there is no need to bring up the buzzwords ! :)

I know what you want and I agree, I was just trying to point out a specific
technical detail of the implementation, which I think might corner us in
the future if we aren't careful.

> However it is my understanding that embedded processors don't (can't) use
> the general purpose distro's. A embedded Linux SDK can be very specific to
> the chip and doesn't need the fat rpm mechanism. So IMHO the dl-procinfo
> mechanism would never be used in practice for the embedded space (they will
> build the single libc.so they need and ship that with the SDK).

It depends. Been there, done that. You can have a line of products using
different ppc embedded processors and wanting to have your core userland
identical for maintenance/upgrade reasons. In fact, "embedded" covers
pretty much any situation, which is why I'm keen on making sure our
solution is scalable enough.

> So the key to resolving this issue is to understand how the embedded folks
> deliver the SDK to their customers. If I am misinformed, then some one who
> actually works on Linux for the embedded space, should speak up.

They all do differently. Anyway, there is no need to focus on embedded
now, it's just one piece of the puzzle.

> Fine. I asked for the AT_HWCAP bits because it was implemented, AT_PLATFORM
> was not. It was just easier to extend an existing mechanism then define a
> new one.

Euh...yes, but AT_HWCAP was implemented for feature bits. We only just
added some microarchitectures to it, afaik, upon your request. It's all
new enough that I don't see the problem in discussing it and possibly
coming up with a better alternative.

>  Also in the current dl-procinfo mechanism, AT_HWCAP allows for
> multiple modifiers (like ALTIVEC combined with POWER4 or G4). AT_PLATFORM
> does not. This allows for "commoning up" based on ISA features independent
> of microarch.

Yes, but that isn't incompatible. For example, a 970 could/would be a
POWER4 microarch with the altivec feature bit set. In general, features
like altivec _have_ to be independent bits anyway since LPAR environments
might make them unavailable even if the processor we are running on at a
given point in time supports them.
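
To illustrate the point about independent bits, here is a minimal sketch in which the microarchitecture and AltiVec are separate flags filtered through an "important" mask. The bit values are assumptions recalled from the 2.6.15-era powerpc kernel headers, and `CAP_NAMES` and `important_caps` are hypothetical names, not glibc's.

```python
# Bit values and names are assumptions for illustration, not copied from
# the kernel or glibc sources.
PPC_FEATURE_HAS_ALTIVEC = 0x10000000
PPC_FEATURE_POWER4 = 0x00080000

# Ordered from most to least significant bit.
CAP_NAMES = [
    (PPC_FEATURE_HAS_ALTIVEC, "altivec"),
    (PPC_FEATURE_POWER4, "power4"),
]


def important_caps(hwcap, important_mask):
    """Return the names of the capabilities that survive the mask."""
    selected = hwcap & important_mask
    return [name for bit, name in CAP_NAMES if selected & bit]
```

With both bits in the mask, a 970 reporting both flags yields "altivec" and "power4", while the same chip in a partition that hides AltiVec yields only "power4". That distinction would be lost if AltiVec were implied by the microarchitecture name.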

Ben.



Re: Rd: [RFC] dl-procinfo and HWCAP_IMPORTANT support for powerpc

Steve Munroe


Benjamin Herrenschmidt <[hidden email]> wrote on 12/26/2005 04:04:46 PM:

> On Mon, 2005-12-26 at 10:09 -0600, Steve Munroe wrote:
>
> Heh, there is no need to bring up the buzzwords ! :)
>
> I know what you want and I agree, I was just trying to point a specific
> technical detail of the implementation, which I think might corner us in
> the future if we aren't careful.
>
> > However it is my understanding that embedded processors don't (can't) use
> > the general purpose distro's. A embedded Linux SDK can be very specific to
> > the chip and doesn't need the fat rpm mechanism. So IMHO the dl-procinfo
> > mechanism would never be used in practice for the embedded space (they will
> > build the single libc.so they need and ship that with the SDK).
>
> It depends. Been there done that. You can have a line of products using
> different ppc embedded processors and wanting to have your core userland
> identical for maintainance/upgrade reasons. In fact, "embedded" covers
> pretty much any situation which is why I'm keen on making sure our
> solution is scalable enough.
>
I have no problem with supporting AT_PLATFORM. I just think that 2 dozen
unique platform names could be problematic for the purpose of fat rpms and
selecting CPU-tuned libraries, i.e. a simple "G4" is better than "7400" and
"7450" and ...

I also understand that for other purposes a "fine grained" (exactly which
chip) AT_PLATFORM might seem more useful. That is why I suggested
providing the PVR value via a different Aux Vector entry.

> > So the key to resolving this issue is to understand how the embedded folks
> > deliver the SDK to their customers. If I am misinformed, then some one who
> > actually works on Linux for the embedded space, should speak up.
>
> They all do differently. Anyway, there is no need to focus on embedded
> now, it's just one piece of the puzzle.
>
But I just realized that the ppc64 kernel will have a different/separate
AT_PLATFORM list from the ppc32 kernel.

PPC64 would return "power4", "power5", "power5+" and "970" ("power3" and
"630" if you want to) for AT_PLATFORM which is OK.

The list for the ppc32 kernel is not my (direct) concern.

> > Fine. I asked for the AT_HWCAP bits because it was implemented, AT_PLATFORM
> > was not. It was just easier to extend an existing mechanism then define a
> > new one.
>
> Euh...yes, but AT_HWCAP was implemented for feature bits. We just only
> added some microarchitectures to it, afaik, upon your request. It's all
> new enough that I don't see the problem in discussing it and possibly
> coming with a better alternative.
>
OK but consider this:

"power4" maps to PowerPC ISA Version 2.00
"power5" maps to PowerPC ISA Version 2.01
"power5+" maps to PowerPC ISA Version 2.02

These AT_HWCAP bits ARE valid features (each level adds new instructions to
the ISA, as well as microarch discriminators). I think "cell" has other
unique ISA features that justifies its inclusion in the AT_HWCAP.

I don't have this kind of info for the various flavours of PPC32 processor. I
happily defer to those who have this knowledge to decide.

> >  Also in the current dl-procinfo mechanism, AT_HWCAP allows for
> > multiple modifiers (like ALTIVEC combined with POWER4 or G4). AT_PLATFORM
> > does not. This allows for "commoning up" based on ISA features independent
> > of microarch.
>
> Yes, but that isn't incompatible. For example, a 970 could/would be a
> POWER4 microarch with the altivec feature bit set. In general, features
> like altivec _has_ to be independant bits anyway since LPAR environments
> might make them unavailable even if the processor we are running on at a
> given point in time supports them.
>
I did some experimenting with how dl_procinfo behaves and found that the
combination of AT_HWCAP and AT_PLATFORM can get complicated and redundant.

For example, with the current AT_HWCAP (as of 2.6.15), dl-procinfo would
add the following search paths for a G5/JS20:

/lib[64]/altivec/power4:
/lib[64]/altivec:
/lib[64]/power4:

If the kernel also supported AT_PLATFORM and returned "970", dl-procinfo
would combine this with the AT_HWCAP & HWCAP_IMPORTANT bits and add the
following search paths:

/lib[64]/970/altivec/power4:
/lib[64]/970/altivec:
/lib[64]/970/power4:
/lib[64]/970:
/lib[64]/altivec/power4:
/lib[64]/altivec:
/lib[64]/power4:

Which seems a bit redundant to me. But if I knew that AT_PLATFORM was
available the HWCAP_IMPORTANT mask could be adjusted to only include
PPC_FEATURE_ALTIVEC which would yield the following search paths:

/lib[64]/970/altivec:
/lib[64]/970:
/lib[64]/altivec:

For a "power5" it would simply add:

/lib[64]/power5:

So it matters when and how AT_PLATFORM is introduced. The HWCAP_IMPORTANT
mask will have to be adjusted when AT_PLATFORM support is implemented and
the search paths will change. Changing the search path later would be
really bad. So if we are going to add AT_PLATFORM support we should do it
ASAP.
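
The listings above are consistent with a simple rule: take the platform
string (if any) followed by the names of the hwcap bits that survive the
HWCAP_IMPORTANT mask, then emit every non-empty combination, most specific
first. Below is a minimal C sketch of that rule. It is not glibc's actual
code: the helper name build_search_dirs and the DEMO_* bit values are made
up for illustration and are not the kernel's PPC_FEATURE_* values.

```c
#include <string.h>

#define MAX_DIRS 64
#define DIR_LEN  128

/* Illustrative bit values only (assumption); not the real PPC_FEATURE_*. */
#define DEMO_ALTIVEC 0x4u
#define DEMO_POWER4  0x2u
#define DEMO_POWER5  0x1u

struct cap_name { unsigned bit; const char *name; };

/* Table order decides the order of components within a path. */
static const struct cap_name demo_caps[] = {
    { DEMO_ALTIVEC, "altivec" },
    { DEMO_POWER4,  "power4"  },
    { DEMO_POWER5,  "power5"  },
};

/* Fill out[] with every non-empty combination of the platform string
   (may be NULL) and the names selected by hwcap & important.
   Returns the number of directory suffixes written. */
static int build_search_dirs(const char *platform, unsigned hwcap,
                             unsigned important, char out[][DIR_LEN])
{
    const char *parts[8];
    int n = 0, count = 0;

    if (platform != NULL)
        parts[n++] = platform;
    for (size_t i = 0; i < sizeof demo_caps / sizeof demo_caps[0]; i++)
        if (hwcap & important & demo_caps[i].bit)
            parts[n++] = demo_caps[i].name;

    /* Walk subsets from "all components" down to single components;
       bit (n-1-j) of mask selects parts[j]. */
    for (unsigned mask = (1u << n) - 1; mask >= 1 && count < MAX_DIRS; mask--) {
        char *p = out[count];
        p[0] = '\0';
        for (int j = 0; j < n; j++) {
            if (mask & (1u << (n - 1 - j))) {
                if (p[0] != '\0')
                    strcat(p, "/");
                strcat(p, parts[j]);
            }
        }
        count++;
    }
    return count;
}
```

With platform "970" and both altivec and power4 in the mask this yields the
seven entries listed above; shrinking the mask to altivec alone yields
970/altivec, 970 and altivec. Each extra important bit doubles the number of
directories, which is why the choice of mask matters.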

I offer the following proposal for powerpc64: Add AT_PLATFORM support where
the <platform> strings match the <cpu_type> strings supported by gcc. This
harmonises library search path names with --with-cpu= targets.

One detail to work out relates to G5. Is the AT_PLATFORM for a G5 "G5" or
"970"? Both are valid -mcpu= strings, but I should ask David Edelsohn if
there is any difference in the code gen. It would be simpler to use "970"
for both G5 and JS20.

This nets out to: "power2", "power3", "power4", "power5", "power5+", and
"970" (and "cell"?).

The AT_PLATFORM details for the ppc32 kernel are more complicated and can be
resolved independently.

Steven J. Munroe
Linux on Power Toolchain Architect
IBM Corporation, Linux Technology Center

> Ben.
>
>


Re: Rd: [RFC] dl-procinfo and HWCAP_IMPORTANT support for powerpc

Benjamin Herrenschmidt
On Wed, 2005-12-28 at 00:11 -0600, Steve Munroe wrote:

> "power4" maps to PowerPC ISA Version 2.00
> "power5" maps to PowerPC ISA Version 2.01
> "power5+" maps to PowerPC ISA Version 2.02
>
> These AT_HWCAP bits ARE valid features (each level adds new instructions to
> the ISA, as well as microarch discriminators). I think "cell" has other
> unique ISA features that justifies its inclusion in the AT_HWCAP.

Yup, though there are additional implementation-specific feature bits
that get added to that, but yes.

> I don't have this kind of info for the various flavours of PPC32 processor. I
> happily defer to those who have this knowledge to decide.

We should discuss that with the freescale folks.

> If the kernel also supported AT_PLATFORM and returned "970", dl-procinfo
> would combine this with the AT_HWCAP & HWCAP_IMPORTANT bits and add the
> following search paths:

Which raises the question of whether we should return "970" or "power4"
in AT_PLATFORM... I understand your problem. In fact, we are really
dealing with 4 types of information here:

1 - The microarchitecture (or family). In this regard, P4 and P5 are the
same as they basically have the same instruction scheduling
requirements, no ? While Cell is different ...

2 - The actual processor model (P4,P4+,970,P5,P5?+,...)

3 - The specific processor revision (The actual PVR).

4 - Additional feature bits that define user-interesting features that
can exist across microarchitectures, like Altivec

1 is pretty much what we just added to HWCAP, though it can still be
debated whether it should be one bit per microarch, or if we should
reserve, let's say 8 bits, and have a value stuffed there. That would
give us more flexibility in the future... 4 was already there, it's the
old-style HWCAP usage.

2 and 3 are basically the 2 halves of the PVR (2 is the top bits, 3 is the
entire PVR).
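
As a small illustration of that split, here is a C sketch assuming the
usual PowerPC PVR layout: a 32-bit register with the processor version in
the upper 16 bits and the revision in the lower 16 bits. The helper names
are made up for this example.

```c
#include <stdint.h>

/* Upper 16 bits of the PVR: the processor version, i.e. the "model". */
static inline uint16_t pvr_version(uint32_t pvr)
{
    return (uint16_t)(pvr >> 16);
}

/* Lower 16 bits of the PVR: the revision of that model. */
static inline uint16_t pvr_revision(uint32_t pvr)
{
    return (uint16_t)(pvr & 0xffffu);
}
```

Identifying a model needs only pvr_version(); pinning down a specific
revision needs the whole 32-bit value.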

From what I've seen so far, the -mcpu in gcc tends to try to "know"
about every single CPU revision out there, which is way beyond our
realistic needs I suppose...

(BTW. Do we really need to differentiate P5 and P5+ at that level? Do
they have any user-space significant difference?)
 
> I offer the following proposal for powerpc64: Add AT_PLATFORM support where
> the <platform> strings match the <cpu_type> strings supported by gcc. This
> harmonises library search path names with --with-cpu= targets.
>
> One detail to work out relates to G5. Is the AT_PLATFORM for a G5 "G5" or
> "970"? Both are valid -mcpu= strings, but I should ask David Edelsohn if
> there is any difference in the code gen. It would be simpler to use "970"
> for both G5 and JS20.

Agreed.

> This nets out to: "power2", "power3", "power4", "power5", "power5+", and
> "970" (and "cell"?).
>
> The AT_PLATFORM details for the ppc32 kernel are more complicated and can be
> resolved independently.



Re: Rd: [RFC] dl-procinfo and HWCAP_IMPORTANT support for powerpc

Steve Munroe


Benjamin Herrenschmidt <[hidden email]> wrote on 12/28/2005 12:39:40 AM:

> On Wed, 2005-12-28 at 00:11 -0600, Steve Munroe wrote:
>
> > If the kernel also supported AT_PLATFORM and returned "970", dl-procinfo
> > would combine this with the AT_HWCAP & HWCAP_IMPORTANT bits and add the
> > following search paths:
>
> Which raises the question of whether we should return "970" or "power4"
> in AT_PLATFORM... I understand your problem. In fact, we are really
> dealing with 4 types of information here:
>
It depends on planned usage of AT_PLATFORM. If the primary usage is for
dl-procinfo library search path selection, then power4 is a better choice.
For scalar code (libc, libm, ...) the instruction set and scheduling are
the same. Vector code can still use the altivec directory via AT_HWCAP and
PPC_FEATURE_ALTIVEC.

Alternatively there would be separate "power4" and "970" directories and
970 libraries would be duplicates or symlinked back to the power4
libraries.

> 1 - The microarchitecture (or family). In this regard, P4 and P5 are the
> same as they basically have the same instruction scheduling
> requirements, no ? While Cell is different ...
>
Power5 has a deeper storage queue than power4, which impacts scheduling.
Also power5 adds the popcntb instruction (ISA version 2.01).

> 2 - The actual processor model (P4,P4+,970,P5,P5?+,...)
>
> 3 - The specific processor revision (The actual PVR).
>
> 4 - Additional feature bits that define user-interesting features that
> can exist across microarchitectures, like Altivec
>
> 1 is pretty much what we just added to HWCAP, though it can still be
> debated whether it should be one bit per microarch, or if we should
> reserve, let's say 8 bits, and have a value stuffed there. That would
> give us more flexibility in the future... 4 was already there, it's the
> old-style HWCAP usage.
>
Perhaps we stumbled into this by accident and the bits are misnamed, but
PPC_FEATURE_[POWER4|POWER5|POWER5_PLUS] do represent additional ISA
features.

I am not sure we want to mix scalar and bit values in AT_HWCAP.
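
To make the concern concrete, here is a C sketch of what such a mixed
layout might look like. Everything in it is hypothetical: the 8-bit field
position, the DEMO_* names and the helpers are assumptions for
illustration, not anything the kernel defines.

```c
#include <stdint.h>

/* Hypothetical layout: top 8 bits hold a microarch number,
   the remaining 24 bits are ordinary one-bit feature flags. */
#define DEMO_UARCH_SHIFT 24
#define DEMO_UARCH_MASK  (0xffu << DEMO_UARCH_SHIFT)

/* Read the scalar microarch field. */
static inline unsigned demo_uarch_id(uint32_t hwcap)
{
    return (hwcap & DEMO_UARCH_MASK) >> DEMO_UARCH_SHIFT;
}

/* Test a feature flag, ignoring anything inside the scalar field. */
static inline int demo_has_feature(uint32_t hwcap, uint32_t feature_bit)
{
    return (hwcap & ~DEMO_UARCH_MASK & feature_bit) != 0;
}

/* What a consumer does if it treats the whole word as a plain bitmask. */
static inline int naive_bit_set(uint32_t hwcap, uint32_t bit)
{
    return (hwcap & bit) != 0;
}
```

With microarch number 3 stored in the field, a plain bitmask consumer sees
bits 24 and 25 as two independent "features", so every user of the word has
to know where the scalar field sits.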

> 2 and 3 are basically the 2 halves of the PVR (2 is the top bits, 3 is the
> entire PVR).
>
> From what I've seen so far, the -mcpu in gcc tends to try to "know"
> about every single CPU revision out there, which is way beyond our
> realistic needs I suppose...
>
> (BTW. Do we really need to differentiate P5 and P5+ at that level? Do
> they have any user-space significant difference?)
>
The PowerPC ISA Version 2.02 (P5+) adds 5 new user state instructions.
These should provide measurable improvement in SPECfp.

Steven J. Munroe
Linux on Power Toolchain Architect
IBM Corporation, Linux Technology Center


Re: Rd: [RFC] dl-procinfo and HWCAP_IMPORTANT support for powerpc

Brian Grayson-2
In reply to this post by Benjamin Herrenschmidt
On Tue, Dec 27, 2005 at 11:39:40PM -0700, Benjamin Herrenschmidt wrote:
> On Wed, 2005-12-28 at 00:11 -0600, Steve Munroe wrote:
> > I don't have this kind of info for the various flavours of PPC32 processor. I
> > happily defer to those who have this knowledge to decide.
>
> We should discuss that with the freescale folks.

  Two quick points, and some info about APUinfo for those that
don't know about it:

  - FSL usually adds any new instructions as an optional APU,
    which tends to be self-contained (AltiVec, SPE, VLE) or
    mostly for privileged use (perfmon, machine check,
    cache-locking, branch-locking).  isel is one notable
    exception to this.

  - FSL has usually avoided adding or removing from the _base_
    instruction set, for fear of fragmenting the ISA and thus
    causing our customers no end of grief ("my e9999 no-AltiVec
    no-isel binary won't run on my old 603.  What gives?") -- I
    think FSL sells more legacy processors (like 603e-derived
    SOCs) than IBM does, so our customers' and our concerns are
    flavored by that.

  A few years ago, the e500 ABI specified mechanisms for both
toolchains and code to obtain information about what APUs are
expected:

  - the .PPC.EMB.apuinfo section in an object file or
    executable specifies what version of which APUs is required
    by the object.

  - the __get_apu_revision() function provides a mechanism for
    userland to identify what APUs are available on the current
    silicon, and/or emulatable by the current OS.

  Note that this was not meant to be an e500-only feature, but
rather something that could be used on many cores going forward.

  I believe apuinfo section support is in most embedded PPC
toolchains nowadays (it's in binutils, for example), but I am
not sure if all OSs support __get_apu_revision() yet.  We did
not specify an AUX_VECTOR type to specify APUinfo, but one
could be added.

  It sounds like AT_HWCAP and apuinfo are two different
solutions to the same problem, with different strengths etc.
The APUinfo section provides finer-grain control (which
revision of which APU, with support for up to 2^16 APUs each
with up to 2^16 revisions), whereas HWCAP, with its limited
bits, can only support the coarse yes/no control.
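
  To illustrate the difference in granularity, here is a C sketch that
assumes an (APU, revision) pair is packed into one 32-bit word, APU
identifier in the upper 16 bits and revision in the lower 16. That packing
is an assumption chosen to match the 2^16 x 2^16 figure; the authoritative
encoding of .PPC.EMB.apuinfo is whatever the e500 ABI specifies, and the
helper names are made up.

```c
#include <stdint.h>

/* Pack a hypothetical APU identifier and revision into one word. */
static inline uint32_t apu_pack(uint16_t id, uint16_t rev)
{
    return ((uint32_t)id << 16) | rev;
}

/* Which APU the word describes. */
static inline uint16_t apu_id(uint32_t word)
{
    return (uint16_t)(word >> 16);
}

/* Which revision of that APU. */
static inline uint16_t apu_rev(uint32_t word)
{
    return (uint16_t)(word & 0xffffu);
}
```

  An HWCAP bit can only say that a feature is present or absent; a word
like this can also say which revision.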

  Brian

Re: Rd: [RFC] dl-procinfo and HWCAP_IMPORTANT support for powerpc

Steve Munroe
Brian Grayson <[hidden email]> wrote on 12/28/2005 03:34:17 PM:

> On Tue, Dec 27, 2005 at 11:39:40PM -0700, Benjamin Herrenschmidt wrote:
> > On Wed, 2005-12-28 at 00:11 -0600, Steve Munroe wrote:
> > > I don't have this kind of info for the various flavours of PPC32 processor. I
> > > happily defer to those who have this knowledge to decide.
> >
> > We should discuss that with the freescale folks.
> ...

>   A few years ago, the e500 ABI specified mechanisms for both
> toolchains and code to obtain information about what APUs are
> expected:
>
>   - the .PPC.EMB.apuinfo section in an object file or
>     executable specifies what version of which APUs is required
>     by the object.
>
>   - the __get_apu_revision() function provides a mechanism for
>     userland to identify what APUs are available on the current
>     silicon, and/or emulatable by the current OS.
>
>   Note that this was not meant to be an e500-only feature, but
> rather something that could be used on many cores going forward.
>
>   I believe apuinfo section support is in most embedded PPC
> toolchains nowadays (it's in binutils, for example), but I am
> not sure if all OSs support __get_apu_revision() yet.  We did
> not specify an AUX_VECTOR type to specify APUinfo, but one
> could be added.
>
Note that glibc does not support the BOOKe ABI. Also __get_apu_revision()
is processor specific so not likely to be included in mainline glibc.
Embedded toolchains may be handling this as a local mod or in a separate
utilities library.
>
>   It sounds like AT_HWCAP and apuinfo are two different
> solutions to the same problem, with different strengths etc.
> The APUinfo section provides finer-grain control (which
> revision of which APU, with support for up to 2^16 APUs each
> with up to 2^16 revisions), whereas HWCAP, with its limited
> bits, can only support the coarse yes/no control.
>
It sounds similar to the PowerPC PVR. And I agree that it is different from
AT_HWCAP. The AT_PLATFORM would be derived from the PVR but again at lower
detail (a single string matching one of the -mcpu=<cpu_type> values).

Our kernel emulates the mfspr rx,PVR instruction for user state so
applications can get the PVR directly (at least with newer kernels).

Alternatively we could ask for a new Aux Vector entry (e.g. AT_PROCVER)
that contains the PVR/APUinfo. I am not sure what the process is to add
a new AT_* type beyond talking to the kernel folks. Eventually it should be
documented in the ELF ABI.

Steven J. Munroe
Linux on Power Toolchain Architect
IBM Corporation, Linux Technology Center



Re: Rd: [RFC] dl-procinfo and HWCAP_IMPORTANT support for powerpc

Brian Grayson-2
On Thu, Jan 05, 2006 at 01:33:33PM -0600, Steve Munroe wrote:
> Brian Grayson <[hidden email]> wrote on 12/28/2005 03:34:17 PM:
...
> >   A few years ago, the e500 ABI specified mechanisms for both
> > toolchains and code to obtain information about what APUs are
> > expected:
...
> Note that glibc does not support the BOOKe ABI.

  There seems to be some confusion about Book E.  I want to
point out a few things here so that the confusion is not
perpetuated.

  There is no Book E ABI, as Book E is userland-compatible with
classic Book 1.  Nearly all of Book E's changes pertain to
supervisor mode (Book 3): different MMU style that supports
easier programming and variable size pages, recoverable
machine-check, higher-priority critical interrupts,
programmable exception vector locations, etc.  Many of these
features are also in the Book E-compliant IBM 440 parts, which
AFAIK did not require a new ABI.

  It just so happens that the first Book E implementation from
Freescale, the e500 core, also added a DSP-like Signal
Processing Engine (SPE), which overlays on top of the GPRs.
_That_ required a new ABI, the e500 ABI, to handle the changes
w.r.t. passing this new datatype __ev64_opaque__.  The changes
are very similar to what was done for AltiVec in the 90s,
except that Freescale decided to also pull in some of the EABI
changes, so that there wouldn't need to be an e500 ABI _and_ an
e500 EABI.  So, sdata0 is supported, a few new relocations are
supported, and software floating-point emulation is documented.

  But note that, just like the EABI and the AltiVec Supplement,
the e500 ABI is compatible with the classic ABI where possible.
For ordinary integer code it is highly likely that it can be
linked dynamically or statically with classic ABI libraries,
and everything will just work.  Same goes for AltiVec -- most
applications that take advantage of AltiVec and its ABI, with
its different stack alignment, larger jmp_buf, additional
printf/scanf support, etc. will Just Work when linked with
non-AltiVec-aware code.

  So, to summarize, Book E != SPE.  glibc already supports Book
E, because it supports classic.

  Brian
--
Brian Grayson, SysPerf (System Performance, Modeling, and Simulation)
[hidden email]
Somerset Design Center
Freescale Semiconductor
Austin, TX


Re: Rd: [RFC] dl-procinfo and HWCAP_IMPORTANT support for powerpc

Steve Munroe
Brian Grayson <[hidden email]> wrote on 01/06/2006 09:38:53 AM:

> On Thu, Jan 05, 2006 at 01:33:33PM -0600, Steve Munroe wrote:
> > Brian Grayson <[hidden email]> wrote on 12/28/2005 03:34:17 PM:

> ...
> > >   A few years ago, the e500 ABI specified mechanisms for both
> > > toolchains and code to obtain information about what APUs are
> > > expected:
> ...
> > Note that glibc does not support the BOOKe ABI.
>
>   There seems to be some confusion about Book E.  I want to
> point out a few things here so that the confusion is not
> perpetuated.
>
>   There is no Book E ABI, as Book E is userland-compatible with
> classic Book 1.  Nearly all of Book E's changes pertain to
> supervisor mode (Book 3): different MMU style that supports
> easier programming and variable size pages, recoverable
> machine-check, higher-priority critical interrupts,
> programmable exception vector locations, etc.  Many of these
> features are also in the Book E-compliant IBM 440 parts, which
> AFAIK did not require a new ABI.
>
I guess I was confused by the 78,800 hits google returns for "eABI"

>   It just so happens that the first Book E implementation from
> Freescale, the e500 core, also added a DSP-like Signal
> Processing Engine (SPE), which overlays on top of the GPRs.
> _That_ required a new ABI, the e500 ABI, to handle the changes
> w.r.t. passing this new datatype __ev64_opaque__.  The changes
> are very similar to what was done for AltiVec in the 90s,
> except that Freescale decided to also pull in some of the EABI
> changes, so that there wouldn't need to be an e500 ABI _and_ an
> e500 EABI.  So, sdata0 is supported, a few new relocations are
> supported, and software floating-point emulation is documented.
>
Ok, so the e500 core is supported in the subset powerpc32/nofpu form.
SPE requires a separate ABI and a separate port.

AltiVec/VMX was a separate facility with an independent register set. So it
could be added as a versioned extension to the existing ABI. And that was
no walk in the park.

The SPE overlaying the gprs with fprs changes the ABI and breaks so many
internal dependencies that it has to be a separate ABI and port. This means
that it has to have a different configure target and SPE-unique code has to
go into a different part of the tree.

>   But note that, just like the EABI and the AltiVec Supplement,
> the e500 ABI is compatible with the classic ABI where possible.
> For ordinary integer code it is highly likely that it can be
> linked dynamically or statically with classic ABI libraries,
> and everything will just work.  Same goes for AltiVec -- most
> applications that take advantage of AltiVec and its ABI, with
> its different stack alignment, larger jmp_buf, additional
> printf/scanf support, etc. will Just Work when linked with
> non-AltiVec-aware code.
>
What are you proposing here? Running e500 in the powerpc32/nofpu ABI
subset, OK. Support for SPE is another matter.

>   So, to summarize, Book E != SPE.  glibc already supports Book
> E, because it supports classic.
>
My concern is that supporting "classic" is holding back powerpc
performance in general. I started this proposal (and what became
--with-cpu) to enable the full potential of PowerPC Version 2.0+
Architecture where it exists, while continuing support for the "classic"
powerpc.

So my goal is to enable a separation of concerns between the (sometimes)
competing interests of the "enterprise", "desktop", and "embedded" worlds.

I think the AT_PLATFORM+dl_procinfo proposal (and --with-cpu=) does that.


Steven J. Munroe
Linux on Power Toolchain Architect
IBM Corporation, Linux Technology Center



Re: Rd: [RFC] dl-procinfo and HWCAP_IMPORTANT support for powerpc

Roland McGrath
In reply to this post by Steve Munroe
> Fine. I asked for the AT_HWCAP bits because it was implemented, AT_PLATFORM
> was not.

That's inaccurate.  All it takes is defining ELF_PLATFORM in the kernel.
On the glibc side, using important hwcaps or known platform strings are
both supported, whatever your machine wants to do.  I'm sure that what I
told you to do from the very start was "AT_HWCAP and/or AT_PLATFORM,
whatever the kernel community for your architecture decides."


Thanks,
Roland


Re: Rd: [RFC] dl-procinfo and HWCAP_IMPORTANT support for powerpc

Roland McGrath
In reply to this post by Steve Munroe
> I have no problem with supporting AT_PLATFORM. I just think that 2 dozen
> unique platform names could be problematic for the purpose of fat rpms and
> selecting CPU-tuned libraries. ie. a simple "G4" is better than "7400" and
> "7450" and ...

It's certainly the case that makers of general-purpose systems are not
going to want to include umpteen different tuned builds in their fat
library packages or their installation options for different packages.
For example, Fedora has only ever had builds for two or three flavors of
anything at most on x86.  That said, there is no big harm in having a
larger number of known platform strings in glibc.  (This is contrary to
going willy-nilly with HWCAP_IMPORTANT bits, which translate to many more
directories searched for combinations of bits set, rather than just
replacing the single platform name with another in the same one search step.)

The AT_PLATFORM string is compared against the known list in a linear
search, so you want the likely values to be first in the list and not to
have so large a number of likely values that doing several strcmps before
hitting is commonplace.  The implementation makes it unwieldy to have
more than say 16, which already sounds like too many for the strcmp
consideration.
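
A minimal C sketch of that kind of lookup follows. The list contents, their
order and the function name platform_index are assumptions for
illustration; this is not the table or code in ld.so.

```c
#include <string.h>

/* Hypothetical known-platform list, likeliest values first so the
   common case needs the fewest strcmp calls. */
static const char *const known_platforms[] = {
    "power5", "power4", "970", "power5+", "power3",
};

/* Linear search: returns the index of the platform string in the
   known list, or -1 if it is not recognized. */
static int platform_index(const char *platform)
{
    size_t n = sizeof known_platforms / sizeof known_platforms[0];

    for (size_t i = 0; i < n; i++)
        if (strcmp(platform, known_platforms[i]) == 0)
            return (int)i;
    return -1;
}
```

A string the list does not contain simply fails the lookup, which mirrors
the point below about unrecognized platform strings being ignored by the
cache-based search.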

Those are the only constraints on having a lot of known platform strings.
What's more important, and is the same for either hwcap bits or platform
strings, is that it's somewhat of a pain to decide to add more later.
With a platform string, the kernel can change and path-based library
searching will use whatever string it gives.  However, if ldconfig has
seen the library in question, which is normal use, then with either an
hwcap or a platform string, the cache-based library searches (the normal
case) will ignore the new platform string unless it's in the recognized
set built into ld.so and ldconfig.  So, you are well-advised to decide now
what set of platform strings you'll care about for new hardware within the
lifetime of systems built in the coming months (i.e. the next major
versions of everyone's systems, that use 2.4 after it's released).

> I offer the following proposal for powerpc64: Add AT_PLATFORM support where
> the <platform> strings match the <cpu_type> strings supported by gcc. This
> harmonises library search path names with --with-cpu= targets.

These strings are used as directory names in places like /lib, not just in
places like -mcpu= options where it is quite clear that a flavor of
PowerPC CPU is what that string is.  The only existing practice using
these is on x86, where the strings are i?86, which is the same as a CPU
identifier seen alone in places like uname output and config tuples.
To me, it seems unwise to have these strings be something that sounds so
generic out of context as "cell" or "970".  I tend to think all the
strings for powerpc should start with "power" or "ppc".

> The AT_PLATFORM details for the ppc32 kernel are more complicated and can be
> resolved independently.

Indeed, and we on the libc lists would be more than happy if you all took
this discussion of powerpc CPU details elsewhere and resolve it amongst
yourselves what set of hwcap bits and platform strings you want the kernel
to potentially supply.  Aside from the spelling nits of their canonical
names known to glibc, it's not a libc issue.  The libc issue is what
hwcap bits to use in library search, and what set of platform strings to
put in the known list for ldconfig cached search.  For the reasons I
mentioned above, you want to get this right the first time as much as
possible, and get that in place before we release 2.4, which means it
should be decided upon within the next few weeks ideally.


Thanks,
Roland