*Notes by Gene Cooperman, © 2009
(may be freely copied as long as this copyright notice remains)*

We all know how to multiply polynomials by multiplying all the cross
terms. Given a polynomial `P` of degree `d` (the largest power is `d`),
and a polynomial `Q` of degree `e` (the largest power is `e`),
the product, `P Q`, will have degree `d + e`. To compute the product
`P Q`, there will be up to `(d+1)(e+1)` cross terms to multiply.
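As a concrete baseline, here is a minimal Python sketch of this schoolbook method. (The coefficient-list representation, lowest degree first, is an assumed convention, not something fixed by these notes.)

```python
# A sketch of schoolbook polynomial multiplication: every cross term
# p[j]*q[k] is computed, so (d+1)*(e+1) multiplications in total.
# Polynomials are lists of coefficients, lowest degree first.

def poly_mul_naive(p, q):
    """Multiply polynomials p and q given as coefficient lists."""
    result = [0] * (len(p) + len(q) - 1)  # degree d+e => d+e+1 coefficients
    for j, pj in enumerate(p):
        for k, qk in enumerate(q):
            result[j + k] += pj * qk
    return result

# (1 + 2x) * (3 + x) = 3 + 7x + 2x^2
print(poly_mul_naive([1, 2], [3, 1]))  # -> [3, 7, 2]
```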

In general, if two polynomials are both of degree `d`, then there
will be `(d+1)^{2} = O(d^{2})` cross terms.

But using the idea of the fast Fourier transform, one can actually reduce
the number of multiplications from `O(d^{2})` to `O(d log d)`, and so
multiply the polynomials in `O(d log d)` time. Let `A(x)` and `B(x)`
be polynomials of degree `d`. For example, if `d=3`, we
might have `A(x) = 4 + 3x + 5x^{2} + 6x^{3}`.

In outline, the idea for polynomial multiplication for `C(x) = A(x) B(x)`
of degree `d` is:

- Choose `d+1` points `x_{0},…, x_{d/2}` and `x_{-1},…, x_{-d/2}`.
- *Evaluation:* Compute `A(x_{0}),…,A(x_{d/2})` and `A(-x_{1}),…,A(-x_{d/2})`; do the same for `B(x)`.
- *Multiplication:* Compute `C(x)=A(x)B(x)` at each of the points. So `C(x_{i}) = A(x_{i}) B(x_{i})` and `C(x_{-i}) = A(x_{-i}) B(x_{-i})`. Let `y_{i}` be the constants such that `C(x_{i}) = y_{i}` and `C(x_{-i}) = y_{-i}`.
- *Interpolation:* Compute the unknown coefficients of `C(x)` using `C(x_{0}) = y_{0}`, `C(x_{1}) = y_{1},…,C(x_{d/2}) = y_{d/2}` and `C(x_{-1}) = y_{-1},…,C(x_{-d/2}) = y_{-d/2}`.

We choose `d+1` points `x_{0}, x_{1},…,x_{d}` at which we want to
evaluate `A(x)`, where

`x_{d/2+i} = -x_{i}`.

Let's see how to do step 2 the fast way. The polynomial `A(x)` can be split
into even and odd terms (for `d` odd):

`A(x) = (c_{0} + c_{2}x^{2} +…+ c_{d-1}x^{d-1}) + (c_{1}x + c_{3}x^{3} +…+ c_{d}x^{d})`

We can write this as:

`A(x) = A_{even}(x) + A_{odd}(x)`

where

`A_{even}(x) = c_{0} + c_{2}x^{2} +…+ c_{d-1}x^{d-1}`

`A_{odd}(x) = c_{1}x + c_{3}x^{3} +…+ c_{d}x^{d}`

(Note that our `A_{even}(x)` and `A_{odd}(x)` are different
from the textbook's `A_{e}(x)` and `A_{o}(x)`. Specifically,
`A_{e}(x^{2}) = A_{even}(x)` and `xA_{o}(x^{2}) = A_{odd}(x)`.)
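A small Python check of this split, using the example `A(x) = 4 + 3x + 5x^{2} + 6x^{3}` from above. (The coefficient-list representation, lowest degree first, is an assumed convention.)

```python
# A sketch of the even/odd split A(x) = A_even(x) + A_odd(x).

def split_even_odd(coeffs):
    """Return coefficient lists of A_even and A_odd (same length as A,
    with the other residue class zeroed out)."""
    even = [c if k % 2 == 0 else 0 for k, c in enumerate(coeffs)]
    odd = [c if k % 2 == 1 else 0 for k, c in enumerate(coeffs)]
    return even, odd

def poly_eval(coeffs, x):
    """Horner evaluation of a coefficient list at x."""
    acc = 0
    for c in reversed(coeffs):
        acc = acc * x + c
    return acc

# A(x) = 4 + 3x + 5x^2 + 6x^3, as in the example above
a = [4, 3, 5, 6]
even, odd = split_even_odd(a)
x = 2
assert poly_eval(a, x) == poly_eval(even, x) + poly_eval(odd, x)
assert poly_eval(a, -x) == poly_eval(even, x) - poly_eval(odd, x)
```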

Next, we compute `A(x_{i}) = A_{even}(x_{i}) + A_{odd}(x_{i})`

and `A(-x_{i}) = A_{even}(-x_{i}) + A_{odd}(-x_{i}) = A_{even}(x_{i}) - A_{odd}(x_{i})`

Recall that we chose the `d+1` points `x_{0}, x_{1},…,x_{d}` at which
we want to evaluate `A(x)`, where

`x_{d/2+i} = -x_{i}`.

So, we do step 2 by evaluating
`A_{even}(x_{i})` and `A_{odd}(x_{i})` for `x_{0}, x_{1},…, x_{d/2}`.
After this, we achieve evaluations for `x_{d/2+i}` for free:

`A_{even}(x_{d/2+i}) = A_{even}(-x_{i}) = A_{even}(x_{i})`

`A_{odd}(x_{d/2+i}) = A_{odd}(-x_{i}) = -A_{odd}(x_{i})`

Then, we have `d+1` evaluations of `A(x)`,
since `A(x) = A_{even}(x) + A_{odd}(x)`.

Evaluating the `d+1` terms of `A()` at all `d+1` points would have
required evaluating `(d+1)(d+1)` terms. By doing it this alternative way,
we only need to evaluate about `(d+1)(1+d/2)` terms.

Do likewise for another polynomial `B()`.

## Step 3 (Multiplication):

Multiply at the `d+1` points `x_{i}`.

`C(x_{i}) = A(x_{i}) B(x_{i})`

When we are done, we have computed constants `y_{i}` for which:

`C(x_{i}) = y_{i}`

This requires `d+1` multiplications. Our goal is to
find a fast multiplication in time `O(d log d)`. This step is `O(d)`,
and so it's already fast enough.
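In code, this step is just an elementwise product of the two evaluation vectors. (A sketch; the sample values below are made up.)

```python
# Step 3 as code: given the evaluations A(x_i) and B(x_i) at matched
# points, C(x_i) takes one scalar multiplication per point -- O(d) total.

def pointwise_multiply(a_vals, b_vals):
    """y_i = A(x_i) * B(x_i), one multiplication per sample point."""
    return [a * b for a, b in zip(a_vals, b_vals)]

# hypothetical evaluations of A = 1+x and B = 2+x at x = 1 and x = -1
print(pointwise_multiply([2, 0], [3, 1]))  # -> [6, 0]
```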

In general, interpolation requires a lot of steps. Given the unknown
polynomial `C(x)` for which we only know:

`C(x_{0}) = y_{0},…, C(x_{d}) = y_{d}`

we wish to find `c_{0},…, c_{d}` such that

`C(x) = c_{0} + c_{1}x + c_{2}x^{2} +…+ c_{d}x^{d}`

As you would expect, one plugs in and solves.
The `c_{i}` are the unknowns that we have to solve for.
After plugging `x_{0}, x_{1},…, x_{d}` into `C(x) = y`,
we have `d+1` linear equations:

`c_{0} + c_{1}(x_{0}) + c_{2}(x_{0})^{2} +…+ c_{d}(x_{0})^{d} = y_{0}`

`c_{0} + c_{1}(x_{1}) + c_{2}(x_{1})^{2} +…+ c_{d}(x_{1})^{d} = y_{1}`

...

This means solving simultaneous linear equations.

The advantage if we use
`x_{1},…, x_{d/2}` and `x_{-1},…, x_{-d/2}` is that we can replace
the equations to solve, `C(x_{i})=y_{i}` and `C(-x_{i})=y_{-i}`, by:

`C(x_{i}) + C(-x_{i}) = y_{i} + y_{-i}`

`C(x_{i}) - C(-x_{i}) = y_{i} - y_{-i}`

These last two equations are the same as:

`2 C_{even}(x_{i}) = y_{i} + y_{-i}`

`2 C_{odd}(x_{i}) = y_{i} - y_{-i}`

Now, we again plug in, but this time we have only half as many unknowns,
`c_{i}`, in each equation:

`2c_{0} + 2c_{2}(x_{i})^{2} +…+ 2c_{d-1}(x_{i})^{d-1} = y_{i} + y_{-i}`

`2c_{1}x_{i} + 2c_{3}(x_{i})^{3} +…+ 2c_{d}(x_{i})^{d} = y_{i} - y_{-i}`

So, we still have `d+1`

linear equations to solve, but
each linear equation now has only half as many terms.
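A numeric check of these half-size identities, with a made-up example polynomial: pairing the points as `x` and `-x` makes the sums and differences of the `y` values isolate `C_even` and `C_odd`.

```python
# Check that y_i + y_{-i} = 2*C_even(x_i) and y_i - y_{-i} = 2*C_odd(x_i).

def poly_eval(coeffs, x):
    """Horner evaluation of a coefficient list (lowest degree first)."""
    acc = 0
    for c in reversed(coeffs):
        acc = acc * x + c
    return acc

c = [7, -2, 3, 5]          # C(x) = 7 - 2x + 3x^2 + 5x^3 (hypothetical)
c_even = [7, 0, 3, 0]
c_odd = [0, -2, 0, 5]
for x in (1, 2, 3):
    y_pos, y_neg = poly_eval(c, x), poly_eval(c, -x)
    assert y_pos + y_neg == 2 * poly_eval(c_even, x)
    assert y_pos - y_neg == 2 * poly_eval(c_odd, x)
```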

We have seen that the polynomial `A_{even}(x)` has only half
as many terms. So, we win by breaking up `A(x) = A_{even}(x) + A_{odd}(x)`.
How can we extend this further?

The answer is to go to complex numbers. Recall that the imaginary
number `i` satisfies `i^{2} = -1`. Some simple algebra then
shows that `i^{3} = -i` and `i^{4} = 1`.
*(Note that we will also continue to use i for the subscript index.
It will be clear from context whether we mean i the imaginary number,
or i the subscript index.)*

So, instead of using `A(x)` and `A(-x)` for `A()` and replacing them by
`A(x) + A(-x)` and `A(x) - A(-x)`,
let's use `P(x), P(ix), P(i^{2}x), P(i^{3}x)` (or the equivalent
`P(x), P(ix), P(-x), P(-ix)`) for `P()`. Let's solve for
some polynomials below.
From these, we won't construct `2P_{even}(x) = P(x) + P(-x)` and
`2P_{odd}(x) = P(x) - P(-x)`,
but four polynomials based on four linear combinations.
We'll guess the right polynomials below.

| Expression | `P(x)=1` | `P(x)=x` | `P(x)=x^{2}` | `P(x)=x^{3}` | `P(x)=x^{4}` | Short Name for Expression |
|---|---|---|---|---|---|---|
| `P(x) + P(ix) + P(-x) + P(-ix)` | 4 | 0 | 0 | 0 | 4x^{4} | 4P_{0 mod 4}(x) |
| `P(x) - P(ix) + P(-x) - P(-ix)` | 0 | 0 | 4x^{2} | 0 | 0 | 4P_{2 mod 4}(x) |
| `P(x) + P(ix) - P(-x) - P(-ix)` | 0 | (2+2i)x | 0 | (2-2i)x^{3} | 0 |  |
| `P(x) - P(ix) - P(-x) + P(-ix)` | 0 | (2-2i)x | 0 | (2+2i)x^{3} | 0 |  |
| `P(x) + iP(ix) - P(-x) - iP(-ix)` | 0 | 0 | 0 | 4x^{3} | 0 | 4P_{3 mod 4}(x) |
| `P(x) - iP(ix) - P(-x) + iP(-ix)` | 0 | 4x | 0 | 0 | 0 | 4P_{1 mod 4}(x) |

The *good* polynomials are:

`4P_{0 mod 4}(x) = P(x) + P(ix) + P(-x) + P(-ix)`

`4P_{1 mod 4}(x) = P(x) - iP(ix) - P(-x) + iP(-ix)`

`4P_{2 mod 4}(x) = P(x) - P(ix) + P(-x) - P(-ix)`

`4P_{3 mod 4}(x) = P(x) + iP(ix) - P(-x) - iP(-ix)`

The subscript names were chosen to indicate which terms are non-zero.
For example, `P_{1 mod 4}()` has terms identical to `P()`
for those terms of degree `1 mod 4`.
The other terms of `P_{1 mod 4}()` are all zero.
These are the polynomials for which 3/4 of the terms are zero. So, these
polynomials can be evaluated in 1/4 of the time.

Note that

`P(x) = P_{0 mod 4}(x) + P_{1 mod 4}(x) + P_{2 mod 4}(x) + P_{3 mod 4}(x)`
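A Python sketch verifying the table's claim on a made-up polynomial: each *good* linear combination picks out exactly the terms of `P` whose degree is `k mod 4`. Complex arithmetic uses Python's built-in `1j` for the imaginary unit.

```python
# Verify: the four linear combinations of P(x), P(ix), P(-x), P(-ix)
# recover 4*P_{k mod 4}(x) for k = 0, 1, 2, 3.

def poly_eval(coeffs, x):
    """Horner evaluation of a coefficient list (lowest degree first)."""
    acc = 0
    for c in reversed(coeffs):
        acc = acc * x + c
    return acc

def p_mod4(coeffs, k):
    """Coefficient list of P_{k mod 4}: zero out the other residues."""
    return [c if j % 4 == k else 0 for j, c in enumerate(coeffs)]

p = [2, -1, 4, 3, 5]        # P(x) = 2 - x + 4x^2 + 3x^3 + 5x^4 (hypothetical)
x = 2.0
# P(x), P(ix), P(-x), P(-ix)
vals = [poly_eval(p, w * x) for w in (1, 1j, -1, -1j)]
p0 = (vals[0] + vals[1] + vals[2] + vals[3]) / 4
p1 = (vals[0] - 1j * vals[1] - vals[2] + 1j * vals[3]) / 4
p2 = (vals[0] - vals[1] + vals[2] - vals[3]) / 4
p3 = (vals[0] + 1j * vals[1] - vals[2] - 1j * vals[3]) / 4
for k, val in enumerate((p0, p1, p2, p3)):
    assert abs(val - poly_eval(p_mod4(p, k), x)) < 1e-9
```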

If we're going to make this work, we need an efficient way to convert
between `P(x), P(ix), P(-x), P(-ix)` and the four polynomials above.
This is the essence of the *Fast Fourier Transform* in the text.
It allows you to convert from an ordinary polynomial `P()`
into the four transformed polynomials above. (This is analogous
to the previous case, in which we transformed into `A_{even}()`
and `A_{odd}()`.) Using the transformed
polynomials, we can evaluate (Step 2) fast. Step 3 is
always fast. And we can interpolate (Step 4) fast.

We choose `d+1` points `x_{0}, x_{1},…,x_{d}` at which we want to
evaluate `P(x)`, where

`x_{d/4+i} = i x_{i}` for `1≤i<d/2`

`x_{2d/4+i} = -x_{i}` for `1≤i<d/2`

`x_{3d/4+i} = -i x_{i}` for `1≤i<d/2`

Note that the coefficients `i`, `-1`, and `-i` are the powers of the
imaginary number `i`.
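A tiny illustration of that pattern, with hypothetical base points: the full point list is the base points scaled by successive powers of `i`.

```python
# The point pattern for the 4-way split: groups of points are the base
# points x_0, x_1, ... multiplied by 1, i, -1, -i in turn.
base = [1.0, 2.0]   # hypothetical x_0, x_1
points = [w * x for w in (1, 1j, -1, -1j) for x in base]
print(points)
```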

We can write this as:

`P(x) = P_{0 mod 4}(x) + P_{1 mod 4}(x) + P_{2 mod 4}(x) + P_{3 mod 4}(x)`

where

`P_{0 mod 4}(x) = c_{0} + c_{4}x^{4} + …`

`P_{1 mod 4}(x) = c_{1}x + c_{5}x^{5} + …`

...

So, we do step 2 by evaluating
`P_{0 mod 4}(x_{i})`, `P_{1 mod 4}(x_{i})`, `P_{2 mod 4}(x_{i})`,
and `P_{3 mod 4}(x_{i})`, for `x_{0}, x_{1},…, x_{d/4}`.
After this, we achieve evaluations of `P_{0 mod 4}(x_{j})` with
`j>d/4` for free:

`P_{0 mod 4}(x_{d/4+i}) = P_{0 mod 4}(ix_{i}) = P_{0 mod 4}(x_{i})`

`P_{0 mod 4}(x_{2d/4+i}) = P_{0 mod 4}(-x_{i}) = P_{0 mod 4}(x_{i})`

`P_{0 mod 4}(x_{3d/4+i}) = P_{0 mod 4}(-ix_{i}) = P_{0 mod 4}(x_{i})`

and evaluations of `P_{1 mod 4}(x_{j})` with `j>d/4` for free:

`P_{1 mod 4}(x_{d/4+i}) = P_{1 mod 4}(ix_{i}) = i P_{1 mod 4}(x_{i})`

`P_{1 mod 4}(x_{2d/4+i}) = P_{1 mod 4}(-x_{i}) = - P_{1 mod 4}(x_{i})`

`P_{1 mod 4}(x_{3d/4+i}) = P_{1 mod 4}(-ix_{i}) = -i P_{1 mod 4}(x_{i})`

and evaluations of `P_{2 mod 4}(x_{j})` with `j>d/4` for free:

`P_{2 mod 4}(x_{d/4+i}) = P_{2 mod 4}(ix_{i}) = - P_{2 mod 4}(x_{i})`

`P_{2 mod 4}(x_{2d/4+i}) = P_{2 mod 4}(-x_{i}) = P_{2 mod 4}(x_{i})`

`P_{2 mod 4}(x_{3d/4+i}) = P_{2 mod 4}(-ix_{i}) = - P_{2 mod 4}(x_{i})`

and evaluations of `P_{3 mod 4}(x_{j})` with `j>d/4` for free:

`P_{3 mod 4}(x_{d/4+i}) = P_{3 mod 4}(ix_{i}) = -i P_{3 mod 4}(x_{i})`

`P_{3 mod 4}(x_{2d/4+i}) = P_{3 mod 4}(-x_{i}) = - P_{3 mod 4}(x_{i})`

`P_{3 mod 4}(x_{3d/4+i}) = P_{3 mod 4}(-ix_{i}) = i P_{3 mod 4}(x_{i})`

Then `P(x)` is computed as:

`P(x) = P_{0 mod 4}(x) + P_{1 mod 4}(x) + P_{2 mod 4}(x) + P_{3 mod 4}(x)`

`P_{0 mod 4}()`, `P_{1 mod 4}()`, `P_{2 mod 4}()`, and `P_{3 mod 4}()`
can each be evaluated fast, since each has only 1/4 as many terms.
Each of the four functions is applied to each of `x_{0}, x_{1},…, x_{d/4}`.
So, this requires an evaluation of about `(d+1)(1+d/4)` terms.
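A quick numeric check of these "free" evaluations, on a made-up polynomial with only degree-`1 mod 4` terms: scaling the argument by a power of `i` just multiplies the value by a constant.

```python
# Since P_{1 mod 4} has only terms of degree 1 mod 4, we get
# P_{1 mod 4}(ix) = i*P_{1 mod 4}(x), and similarly for -x and -ix.

def poly_eval(coeffs, x):
    """Horner evaluation of a coefficient list (lowest degree first)."""
    acc = 0
    for c in reversed(coeffs):
        acc = acc * x + c
    return acc

p1 = [0, 3, 0, 0, 0, -2]   # hypothetical: 3x - 2x^5, degrees 1 mod 4 only
x = 1.5
assert abs(poly_eval(p1, 1j * x) - 1j * poly_eval(p1, x)) < 1e-9
assert abs(poly_eval(p1, -x) - (-1) * poly_eval(p1, x)) < 1e-9
assert abs(poly_eval(p1, -1j * x) - (-1j) * poly_eval(p1, x)) < 1e-9
```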

Do likewise for another polynomial, `Q(x)`.

Multiply at the `d+1` points `x_{i}`.

`R(x_{i}) = P(x_{i}) Q(x_{i})`

When we are done, we have computed constants `y_{i}` for which:

`R(x_{i}) = y_{i}`

This requires `d+1` multiplications. Our goal is to
find a fast multiplication in time `O(d log d)`. This step is `O(d)`,
and so it's already fast enough.

Recall that with the interpolation phase of
`A_{even}()` and `A_{odd}()`, we needed to determine
the unknown coefficients `c_{0}, c_{1},…,c_{d}` for the polynomial
`C(x)`, given that:

`C(x_{i}) = y_{i}`

We again need to find unknown coefficients
`c_{0}, c_{1},…,c_{d}`, but now for:

`R(x_{i}) = y_{i}`

We use the *good* polynomials found from the table at the
beginning of this subsection:

`4R_{0 mod 4}(x) = R(x) + R(ix) + R(-x) + R(-ix)`

`4R_{1 mod 4}(x) = R(x) - iR(ix) - R(-x) + iR(-ix)`

`4R_{2 mod 4}(x) = R(x) - R(ix) + R(-x) - R(-ix)`

`4R_{3 mod 4}(x) = R(x) + iR(ix) - R(-x) - iR(-ix)`

We expand:

`4R_{0 mod 4}(x_{i}) = R(x_{i}) + R(ix_{i}) + R(-x_{i}) + R(-ix_{i}) = R(x_{i}) + R(x_{d/4+i}) + R(x_{2d/4+i}) + R(x_{3d/4+i}) = y_{i} + y_{d/4+i} + y_{2d/4+i} + y_{3d/4+i}`

`4R_{1 mod 4}(x_{i}) = R(x_{i}) - iR(ix_{i}) - R(-x_{i}) + iR(-ix_{i}) = y_{i} - iy_{d/4+i} - y_{2d/4+i} + iy_{3d/4+i}`

`4R_{2 mod 4}(x_{i}) = R(x_{i}) - R(ix_{i}) + R(-x_{i}) - R(-ix_{i}) = y_{i} - y_{d/4+i} + y_{2d/4+i} - y_{3d/4+i}`

`4R_{3 mod 4}(x_{i}) = R(x_{i}) + iR(ix_{i}) - R(-x_{i}) - iR(-ix_{i}) = y_{i} + iy_{d/4+i} - y_{2d/4+i} - iy_{3d/4+i}`

So, we have to find the unknowns, `c_{i}`:

`4c_{0} + 4c_{4}x_{i}^{4} + … = y_{i} + y_{d/4+i} + y_{2d/4+i} + y_{3d/4+i}`

`4c_{1}x_{i} + 4c_{5}x_{i}^{5} + … = y_{i} - iy_{d/4+i} - y_{2d/4+i} + iy_{3d/4+i}`

`4c_{2}x_{i}^{2} + 4c_{6}x_{i}^{6} + … = y_{i} - y_{d/4+i} + y_{2d/4+i} - y_{3d/4+i}`

`4c_{3}x_{i}^{3} + 4c_{7}x_{i}^{7} + … = y_{i} + iy_{d/4+i} - y_{2d/4+i} - iy_{3d/4+i}`

But when we do this step, each equation already has 3/4 of the
coefficients set to zero, and so each of the `d+1` linear equations will
have only 1/4 as many terms.

Furthermore, there are savings in computing the sums of the `y_{i}`
on the right hand side, since many of the partial sums are repeated.
In the general case, there will be only `O(d log d)` sums of the `y_{i}`.

From the previous two subsections (the case of `A_{even}(), A_{odd}()`,
and the case of `P_{0 mod 4}(), P_{1 mod 4}(), P_{2 mod 4}(), P_{3 mod 4}()`),
we can derive the general pattern.

The textbook does this in a somewhat more abstract (and more compact)
manner, by using recursion. Now read the general case there.
In converting between our notation and theirs, it's important to
recognize that the textbook uses `A_{e}()` and `A_{o}()`, where:

`A_{even}(x) = A_{e}(x^{2})`, and

`A_{odd}(x) = xA_{o}(x^{2})`.

In general, if the polynomial is of degree `d = n-1` for some
`n = 2^{k}`, then we let `ω = e^{2πi/n}` and choose
`x_{i} = ω^{i}`:

`x_{0} = 1`

`x_{1} = ω = e^{2πi/n}`

`x_{2} = ω^{2} = e^{2πi(2/n)}`

`x_{3} = ω^{3} = e^{2πi(3/n)}`

...
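To make the recursion concrete, here is a short pure-Python sketch of the whole pipeline at the points `x_i = ω^i`: a standard radix-2 Cooley-Tukey formulation (not the textbook's exact code), using evaluate / pointwise-multiply / interpolate, where interpolation is the inverse transform.

```python
# FFT-based polynomial multiplication in O(d log d), coefficient lists
# lowest degree first, using Python's built-in complex numbers.

import cmath

def fft(coeffs, invert=False):
    """Evaluate a coefficient list at the n-th roots of unity
    (n = len(coeffs), a power of 2); invert=True uses conjugate roots."""
    n = len(coeffs)
    if n == 1:
        return coeffs[:]
    even = fft(coeffs[0::2], invert)   # the textbook's A_e, on n/2 points
    odd = fft(coeffs[1::2], invert)    # the textbook's A_o, on n/2 points
    sign = -1 if invert else 1
    out = [0] * n
    for i in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * i / n)   # omega^i
        out[i] = even[i] + w * odd[i]            # value at omega^i
        out[i + n // 2] = even[i] - w * odd[i]   # value at -omega^i, for free
    return out

def poly_mul_fft(p, q):
    """Multiply coefficient lists p and q via evaluate/multiply/interpolate."""
    n = 1
    while n < len(p) + len(q) - 1:
        n *= 2                               # pad to a power of 2
    ya = fft(p + [0] * (n - len(p)))         # step 2: evaluation
    yb = fft(q + [0] * (n - len(q)))
    yc = [a * b for a, b in zip(ya, yb)]     # step 3: pointwise multiplication
    c = fft(yc, invert=True)                 # step 4: interpolation
    return [round((v / n).real) for v in c[:len(p) + len(q) - 1]]

print(poly_mul_fft([4, 3, 5, 6], [1, 2]))  # -> [4, 11, 11, 16, 12]
```

Note the two-for-one pattern from earlier reappearing inside `fft`: each level computes the values at `omega^i` and `-omega^i` from one evaluation of the even and odd halves.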

For further reading about fast multiplication (both for integers and polynomials), try http://en.wikipedia.org/wiki/Multiplication_algorithm.