div_qr_1 interface

Torbjorn Granlund tg at gmplib.org
Mon Oct 21 23:03:50 CEST 2013


nisse at lysator.liu.se (Niels Möller) writes:

  The problem is the final use, where Q2 is added, with carry, to a
  different register. It's tempting to replace
  
  	adc	Q1I, Q2
  
  with
  
  	sbb	Q2, Q1I
  
  and a negated Q2, but I'm afraid that will get the sense of the carry
  wrong. Do you see any trick to get that right without negating Q2
  somewhere along the way?
  
Well, no.
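
To make the carry problem concrete, here is a small C model (not GMP
code) of the values the two instructions compute. The function names
model_adc and model_sbb are made up for illustration, operands follow
the AT&T "source, destination" order used above, and only the result
value is modelled, not the flags the instructions leave behind.

```c
#include <stdint.h>

/* Value computed by "adc src, dst": dst + src + CF (mod 2^64). */
uint64_t model_adc(uint64_t dst, uint64_t src, unsigned cf)
{
    return dst + src + cf;
}

/* Value computed by "sbb src, dst": dst - src - CF (mod 2^64). */
uint64_t model_sbb(uint64_t dst, uint64_t src, unsigned cf)
{
    return dst - src - cf;
}

/* Feeding sbb a negated operand gives dst + x - CF, not dst + x + CF:
   the incoming carry is subtracted instead of added. */
uint64_t sbb_with_negated(uint64_t dst, uint64_t x, unsigned cf)
{
    uint64_t neg_x = (uint64_t)0 - x;   /* two's complement negation */
    return model_sbb(dst, neg_x, cf);
}
```

With CF = 0 both forms give 12 for 5 + 7. With CF = 1, adc gives 13
while sbb on the negated operand gives 11, so the two results differ by
twice the incoming carry.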

  > It might also be possible to replace the early loop "and" stuff by
  > cmov.
  
  Maybe, but the simple way to do conditional addition with lea + cmov
  won't do, since we also need carry out.
  
  Does it matter if we do
  
  	mov	B2, r
  	and	mask, r
  
  or
  
  	mov	$0, r
  	cmovc	B2, r
  
  ?
  
The latter tends to be faster on AMD CPUs.  Not sure about Intel.
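
For reference, the two sequences can be modelled in C as below. This is
only a sketch: the names select_and, select_cmov and mask_from_carry are
hypothetical, and it assumes that mask is all ones when the carry is set
and zero otherwise, which the message above does not state explicitly.

```c
#include <stdint.h>

/* Assumption: mask is 0 when CF = 0 and all ones when CF = 1. */
uint64_t mask_from_carry(unsigned cf)
{
    return (uint64_t)0 - (uint64_t)(cf & 1);
}

/* Models "mov B2, r ; and mask, r". */
uint64_t select_and(uint64_t b2, uint64_t mask)
{
    uint64_t r = b2;
    r &= mask;
    return r;
}

/* Models "mov $0, r ; cmovc B2, r". */
uint64_t select_cmov(uint64_t b2, unsigned cf)
{
    uint64_t r = 0;
    if (cf & 1)
        r = b2;     /* taken only when the carry flag is set */
    return r;
}
```

Under that assumption both variants leave r = B2 when the carry is set
and r = 0 otherwise, so the choice comes down to speed and flag
handling. One thing the model cannot show: mov and cmovc leave the flags
untouched, while and clears CF.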

-- 
Torbjörn

