Improvements to powerpc32 asm code

Kevin Ryde user42@zip.com.au
Tue, 03 Jun 2003 11:33:58 +1000


Mark Rodenkirch <mrodenkirch@wi.rr.com> writes:
>
> Yes, that was -C.  Here are the -CD results if you are interested:
>
> 1           (21.1270)    (#10.0622)
> 2              5.0496       #4.0293
> 3              4.0124       #3.0227
> 4             #4.0166        8.0599

No, you need to apply -D over steps of 16 limbs or similar, especially
if the code is unrolled to about that size and hence has special-case
finish-ups for the various sizes modulo the unroll.  See tune/README,
for example:

	./speed -s 16-64 -t 16 -C -D mpn_add_n
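
To illustrate why the step should match the unrolling, here is a minimal C sketch of an unrolled add with a finish-up loop. It is not the powerpc32 asm or GMP's actual mpn_add_n; the names limb_t, UNROLL, add_limb, tail_limbs and add_n_unrolled are made up for the example, and the unroll factor of 4 is an arbitrary assumption. Sizes that differ by less than the unroll factor exercise different amounts of finish-up code, so differences between them say little about the per-limb cost of the main loop.

```c
#include <stddef.h>

typedef unsigned long limb_t;   /* hypothetical stand-in for a limb type */

#define UNROLL 4                /* assumed unroll factor, for illustration only */

/* Add one limb pair plus an incoming carry; store the sum, return carry out. */
limb_t add_limb(limb_t *r, limb_t a, limb_t b, limb_t carry)
{
    limb_t s = a + b;
    limb_t c1 = (s < a);        /* overflow from a + b */
    limb_t t = s + carry;
    limb_t c2 = (t < s);        /* overflow from adding the carry */
    *r = t;
    return c1 | c2;             /* at most one of c1, c2 can be set */
}

/* Number of limbs handled by the finish-up code for a given size. */
size_t tail_limbs(size_t n)
{
    return n % UNROLL;
}

/* r = a + b over n limbs, returning the carry out of the top limb. */
limb_t add_n_unrolled(limb_t *r, const limb_t *a, const limb_t *b, size_t n)
{
    limb_t carry = 0;
    size_t i = 0;

    /* Main loop: UNROLL limbs per iteration. */
    for (; i + UNROLL <= n; i += UNROLL) {
        carry = add_limb(&r[i],     a[i],     b[i],     carry);
        carry = add_limb(&r[i + 1], a[i + 1], b[i + 1], carry);
        carry = add_limb(&r[i + 2], a[i + 2], b[i + 2], carry);
        carry = add_limb(&r[i + 3], a[i + 3], b[i + 3], carry);
    }

    /* Finish-up: the remaining n % UNROLL limbs, one at a time. */
    for (; i < n; i++)
        carry = add_limb(&r[i], a[i], b[i], carry);

    return carry;
}
```

With this sketch, sizes 1 to 3 never enter the main loop at all, whereas sizes 16, 32, 48 and 64 have tail_limbs(n) == 0 and differ only in the number of main-loop iterations.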