Week 4 and 5 of Gsoc with Sympy.
Self-Initializing Quadratic Sieve
I spent these 14 days implementing the Quadratic sieve. I took this research paper as a reference.
In a basic level, in quadratic sieve we try to find relations of the form X^2 ≡ Y^2 modN
where N
is the number to be factored and X ≠ Y
. Then we can rewrite the relation
as (X + Y)*(X - Y) ≡ 0 modN
. Then gcd of X - Y
and N
will be a proper factor of N
. To find these X
and Y
such that X^2 ≡ Y^2 mod N
we first find relations of the form
u^2 = v modN
where v
factors into small primes. These are called smooth relations. By collecting enough of these smooth relations we can find a subset of v
for which
π(u^2) = q1*q2....
where qi
is a square. Thus π(v)
will also be a square. Then taking X^2 = π(u^2)
and Y^2 = π(v)
we can find a proper factor of N.
Now lets come to a detail level of Self-Initializing Quadratic sive:
In Self-Initializing Quadratic sive we will use g(a, b) = (ax + b)^2 - N
as a sieving polynomial. Here a
is a product of primes that form a subset of the factorbase. And b
is
taken as b^2 = N moda
. Suppose a = q1*q2.....qs
where qi ∈ factor_base
so there are total 2^s
possible values of b
, but we only need half of those because the residues
generated by g(a, b) and g(a, -b)
are same. So for a value of a we have total of 2^(s - 1)
possible polynomials. We will utilize these polynomials for getting smooth relations.
The Self-Initializing Quadratic sive can be divided into several stages:
(i) Compute Startup Data: Here we initialize our factor base wth some upper bound B
. For each value of factorbase we compute a modular square root of N mod p
and store it as
tmem_p
and also compute log(p)
and store it as lp
.
(ii) Initialization stage: In this stage we find a suitable value of a
and generate a 1st
polynomial. And we start seiving with this, after completion we change the sieving
polynomial to the next one for a total of 2^(s - 1)
polynomaials. After we have seived with with all these polynomials the we choose a different value of a
then again we will
get a different set of 2^(s - 1)
polynomials. We repeat this until we get enough smooth relations.
(iii) Sieve Stage: Here we try to find while values of g(a, b)
are probable to be a smooth number that is it can be factored with the primes in our factorbase. We initialize
a 2*M + 1
array with 0’s. Then for each prime we add lp
to all the places which satisfies -M <= soln1p + ip <= M
.
(iv) Trial division stage: Here we look up in this array and if we find a value whose accumulated value is greater than some predefined defined threshold, then g(a, b)
is highly likely to be a smooth number. Then we cross check if g(a, b)
is a smooth number or not. If g(a, b)
is a smooth number then we have found a smooth relation. If
the total number of smooth relations is more than the length of factorbase then proceed to next stage otherwise go to initialization stage and try next polynomial.
(v) Linear Algebra Stage: Here we transform the smooth relations to vectors. For ex: Let our factorbase be :{2, 3, 5}
then 2**6 * 3**3 * 5**7
has a vector [6, 3, 7]
.
Also note that a square number will have a zero vector under modulo 2
operation. So we will transform the vectors to modulo 2
. So, [6, 3, 7]
becomes [0, 1, 1]
. Then we
find a subset of smooth relations for which the combined vector becomes a zero vector. That is π(g(a, b))
gives a zero vector. That means we have found a square. Further,
we can utilize this square combination of relations to find a proper factor of N. (π(ax + b)^2) modN ≡ π(g(a, b)) modN
. Here take u = π(ax + b)^2
and v = π(g(a, b))
. So,
we found u^2 modN ≡ v modN
, here v is a square as v
is a zero vector under modulo 2. So, we have found a relation of the form X^2 ≡ Y^2 mod N
and a proper factor can be found
by taking gcd(X - Y, N)
.