mpz_trial_div

Jason Moxham J.L.Moxham@maths.soton.ac.uk
Fri, 8 Nov 2002 20:05:12 +0000


The name of fn mpz_trial_div implies a specific algorithm , which in futu=
re=20
could restrict the optimization possibilitys. Perhaps it should be called=
=20
mpz_find_small_prime_factors ? or some such .

At present my code uses a prime table and/or a wheel to reduce the number=
 of=20
possible divisors and groups these divisors together to increase the spee=
d.=20
So it really is trial division.
I know of two ways to use gcd to increase speed (possibly ?)

1)
After we group together a small number of our trial divisors , then we ta=
ke=20
the gcd of this product (which is small) with N , and see if we get a=20
non-unit gcd. This would still be trial division.

2)
We multiply together lots of trial divisors mod N  and then take GCD with=
 N.
( this may require sub-quadratic gcd) . This is really nothing like trial=
=20
division. It could take the same time to discover that 23 is a divisor th=
an=20
it take to discover 100093 is a divisor.

I have yet to try either of these methods to see if they offer any=20
improvements.


> I proposed the following user visible function
>=20
> unsigned long mpz_trial_div(mpz_srcptr N , unsigned long start , unsign=
ed=20
long=20
> stop)
>=20
> Precondition is that N has no divisors less than start (excude 1)

I originally didn't have this precondition , but added it later to simpli=
fy=20
things. Some reasons why I changed it are

1) With this precondition it is easy to guarentee that the returned divis=
ors=20
are prime , without this precondition , any divisor found may be prime or=
=20
composite and you would still have to run trial_div again to split the=20
returned divisor up further.

2) Without insisting that the returned divisors are prime you can not red=
uce=20
the number of trial divisors by using a prime table.

eg
let N=3D3.5.7.11.(2^31-1)
d=3Dmpz_trial_div(N,1000,10000)
which implies d=3D3.5.7.11=3D1155
so our set of trial divisors must include non-primes

3)
I couldn't think of any situation where this is in any way a serious prob=
lem.


Uses for mpz_trial_div
-------------------------

1)To prove that N has no factor less than 10000
2)To find easy factors of N , before more expensive tests like probable p=
rime=20
tests or factoring
3) to factorize N

Why would anyone want to call mpz_trial_div (as proposed) with random sta=
rt=20
and end ?

When I say the function is undefined , I mean that it returns garbage in =
a=20
resonable time , it does not crash or loop forever , or format you hard d=
isk=20
:) . Technically the answer returned is well defined , just not useful.

consider mpz_divexact which will return garbage with random parameters.

All the uses to which I have put my mpz_trial_div , do not require the=20
divisors in order or to be prime or even what that divisor is. All I have=
=20
needed to know (so far) is that , does a divisor exist or not .

The problem with these GCD methods is that smaller divisors are found no=20
quicker than larger ones . So prove existence or not they are great (assu=
ming=20
they are faster) . But for eliminating common composites before a probabl=
e=20
prime test , they are much slower on average (because small divisors are =
very=20
common)

Perhaps a seperate fn (internal?) for this type of existence of divisors =
fn.
mpz_exist_div(N,start,stop) ?


> I would argue that it should have a "unsigned long *expt" argument whic=
h=20
> returns the multiplicity of the factor returned.

Possible , although I think it is unneccessairy (as we have the mpz fn=20
mpz_remove_ui) , consider

d=3Dmpz_trial_div(N,1,10000);
m=3Dmpz_remove_ui(N,N,d);

to cover the multiplicity case

=46rom Paul
----------------
> Almost all the time when I want to find small factors (and by definitio=
n,=20
>anything that fits into an unsigned long is small) I want all the factor=
s, I=20
>want them to be prime and I want them in order.  Having to write complic=
ated=20
>control structures to find them all, test for primality and sort them in=
to=20
>order is a real PITA.   If I have to go to all that trouble, I may just =
as=20
>well code the trial division myself and avoid all the complexity.  It's=20
>likely to runs faster as well, just because I don't have to go through a=
ll=20
>the post-processing.  The whole point of a library is to remove the need=
 to=20
>rewrite frequently used code.
>
>

I couldn't agree more , What is PITA ?

I don't what to count the number of times I have writen various trial div=
ision=20
routines , because I never had one that was general enough.

mpz_trial_div is also a very low-level fn , perfectly suited to gmp's man=
y cpu=20
optimizations (far more so than mpz_powm , say ) .

>
>> I'm thinking primarily of primetest applications (that's the only
>> things I've used trial division for, so far), and then it's quite
>> useless to return N. So I'd prefer that any "smallest prime factor of
>> N (in a given set)" is specified as returning either a *non-trivial*
>>
>My application has been almost exclusively integer factorization.  Given=
 an=20
>integer, or class of integers, I first filter out all the small prime=20
>factors with trial division, then run a few thousand iterations of Polla=
rd=20
>rho or squfof.  That flushes out everything up to 8 digits or so and I c=
an=20
>turn to ECM, QS or NFS as appropriate.  Trial division is a good candida=
te=20
>for a library and the other two possibly so.  QS and NFS are very likely=
 not=20
>candidates.  ECM is borderline but I'd put it outside the border.
>
>> Are there any other applications where N is a useful answer?
>
>N, if prime, is useful in the factorization scenario given above.  It gi=
ves a=20
>simple test to see whether factorization is complete.
>
>


I hope this answers all your objections ?
If not let me know=20
It was/is the best way I could think of to do this generally

jason