At the beginning of this series, we argued that the multiplication of two polynomials of degree at most n can be performed in O(n) complexity time if both polynomials are in point-value form.
In practice, the difficulty is that polynomials are usually given in coefficient form, and we want the result of the multiplication to also be in coefficient form.
Fortunately, these problems can be solved by using the fast versions of the NTT and INTT.
To multiply two polynomials represented in coefficient form in O(nlogn) time, the procedure is as follows:
Using the NTT, the polynomials are converted from coefficient form to point-value form.
The polynomials in point-value form are multiplied pointwise. This can be performed in time complexity O(n), and the result is obtained in point-value form.
The INTT is used to transform the resulting polynomial back to coefficient form.
Using n-th roots of unity, where n is a power of 2, steps 1 and 3 can be performed in O(nlogn) time.
The only limitation is that we must impose a maximum degree of n−1 on the resulting polynomial. Therefore, the sum of the degrees of the two polynomials we want to multiply cannot exceed n−1.
In practice, this is generally not a problem, since we can usually use fields containing sufficiently large n-th roots of unity so that the degrees of the polynomials remain within the allowed bound.
Polynomials of degree less than n−1 do not represent a problem, since they can be padded with zero-valued coefficients up to degree n−1.
Without using the fast NTT and INTT, polynomial multiplication would have time complexity O(n2).
The aim of this chapter is to prove our claim. We will show that multiplying two polynomials in coefficient form is equivalent to applying the NTT to each polynomial, performing pointwise multiplication, and then applying the INTT to the resulting polynomial obtained from this operation.
This result is known as the convolution theorem in the context of polynomial multiplication. In a more general context, the convolution theorem states that
“convolution in the original domain is equivalent to pointwise multiplication in the transformed domain.”
To properly understand this theorem, we need to start by defining what convolution is.
Readers less inclined toward mathematics can skip this chapter without loss of continuity. In the rest of the text, we prove the convolution theorem, but if the reader is only interested in its application, it is sufficient to accept that it can be used to multiply polynomials in time complexity O(nlogn) instead of O(n2).
Convolution
Multiplying two polynomials in coefficient form is an example of a convolution operation.
Consider two polynomials of degree 2,
f(x)=a0+a1x+a2x2
and
g(x)=b0+b1x+b2x2.
Multiplication in coefficient form yields the polynomial
These coefficients can be compactly expressed by the formula
ck=i=0∑kaibk−i
for k=0,1,2,3,4, which is the degree of the polynomial h(x). Let us examine this for one of the coefficients, c3. For this coefficient, we have
c3=i=0∑3aib3−i=a0b3+a1b2+a2b1+a3b0.
Since both a3 and b3 are zero (because the polynomials f(x) and g(x) have degree 2), this reduces to
c3=a1b2+a2b1,
as expected.
The operation defined by
ck=i=0∑kaibk−i
is called a convolution and is usually written as
ck=(a∗b)k,
where ∗ denotes the convolution operator.
The Convolution Theorem
Our goal is to prove that converting f(x) and g(x) to point-value form, multiplying the corresponding points, and converting the result back to coefficient form is equivalent to performing the convolution of the coefficients of f(x) and g(x).
Consider the polynomials
f(x)=a0+a1x+a2x2+...+apxp
and
g(x)=b0+b1x+b2x2+...+bqxq
of degree p and q.
Multiplying f(x) and g(x) produces a new polynomial h(x) whose degree is the sum of the degrees p and q of f(x) and g(x), respectively.
Suppose that the degree of h(x) is n−1.
Thus, we want to compute
h(x)=f(x)g(x)=c0+c1x+c2x2+…+cn−1xn−1
If the degree is less than n−1, we can pad the coefficients of the higher-degree terms with zeros.
Let us represent the coefficients of the polynomials f(x) and g(x) as
[a0,a1,a2,...,an−1],[b0,b1,b2,...,bn−1].
Some of the higher-degree coefficients above will be zero, as the degrees of f(x) and g(x) must sum to at most n−1. This is not a problem. Our only restriction is that
deg(f)+deg(g)=deg(h)≤n−1.
Our first step is to convert the polynomials f(x) and g(x) from coefficient form to point-value form.
First step: convert the polynomials to point-value form.
To convert these polynomials to point-value form, we use the NTT. By choosing n to be a power of 2, we can apply the fast transform. However, since the final result is equivalent to multiplication by the Vandermonde matrix, we will present the matrix formulation.
For instance, for the polynomial f(x), its evaluations at the n-th roots of unity are given by
Using the expressions for f(ωi) and g(ωi) obtained in the previous step, we obtain
h(ωi)=(r=0∑n−1arωri)(s=0∑n−1bsωsi).
To recap, what we want to show is that if we perform the INTT on the polynomial h(x) in its point-value form (the above form), the result is the same as performing the convolution of f(x) and g(x).
Last step: apply the INTT to the polynomial h(x) in point-value form
The Inverse Number Theoretic Transform over the n-th roots of unity is performed using the Vandermonde matrix V(ω−1) scaled by a factor of n1.
Thus, by applying the INTT to the set of points h(ωi), we obtain the polynomial h(x) in coefficient form:
cp=n1i=0∑n−1(r=0∑n−1arωri)(s=0∑n−1bsωsi)ω−ip=n1i=0∑n−1r=0∑n−1s=0∑n−1(arbs)ωriωsiω−ip=n1i=0∑n−1r=0∑n−1s=0∑n−1(arbs)ωi(r+s−p)=n1r=0∑n−1s=0∑n−1(arbs)(i=0∑n−1ωi(r+s−p))The expression obtained aboveGroup the sums and termsCombine the exponents of ωiIsolate the sum of roots of unity.
The expression above indicates that we can use the orthogonality of the roots of unity to simplify it.
Recall that the orthogonality property of the roots of unity is given by
i=0∑n−1(ωi)r−s=⎩⎨⎧n,0,if r≡s(modn),otherwise.
Using the above formula for the isolated sum of roots of unity in cp, we note that
i=0∑n−1ωi(r+s−p)
is equal to n if s=p−r(modn) (or equivalently r=p−s(modn)), and zero otherwise.
This can be represented as n multiplied by the Kronecker delta:
As a result, the summation over s collapses, and we can replace the summation in bs with the single element bp−r. We obtain
cp=r=0∑n−1arbp−r
This is exactly the convolution formula!
Let us recall what we did:
We applied the NTT to the polynomials f(x) and g(x), converting their coefficient representations into point-value representations, that is, vectors containing their evaluations at the n-th roots of unity:
(f(1),f(ω),…,f(ωn−1))and(g(1),g(ω),…,g(ωn−1))
We multiplied these evaluation values pointwise, meaning that for each root of unity ωi we computed
h(ωi)=f(ωi)g(ωi)
obtaining the point-value representation of the product polynomial h(x).
We applied the INTT to this vector of values
(h(1),h(ω),…,h(ωn−1))
to recover the coefficient representation of h(x).
We showed that these three steps produce exactly the same result as multiplying the polynomials f(x) and g(x), namely, convolving the coefficients of f(x) and g(x).
However, while direct convolution has time complexity O(n2), the procedure above can be executed in O(nlogn) time, yielding the same final polynomial.
This is exactly what the convolution theorem states. In a more formal way, we can write it as follows:
Let f and g be two polynomials in their coefficient form. Let ∗ denote the convolution operation and ⋅ denote multiplication.
Let F(f) denote the application of the NTT to f — that is, while f is the vector of coefficients, F(f) is the vector of evaluations — and F−1(F(f)) denote the application of the INTT to F(f). Then,
f∗g=F−1(F(f)⋅F(g)).
Another way to express the convolution theorem is
F(f∗g)=F(f)⋅F(g).
This is a linear transformation, and it can be understood as follows: convolution in the coefficient domain is equivalent to pointwise multiplication in the point-value domain.
This article is part of a series on the Number Theoretic Transform in our ZK Book