FFTBased Integer Multiplication, Part 1
The fast Fourier transform allows us to convolve in n log n time, which is useful for fast multiplication. But the FFT uses floatingpoint calculations; what problems does this cause when we want exact integer results?
October 19, 2018
The convolution of two discrete, finite, length signals and (basically two length arrays) is denoted . The result is also a length signal, with
Convolutions appear in many applications:
 Multiplication: Let and be the coefficient arrays of two polynomials, so is the coefficient for the term. Then equals the coefficient on the term when we multiply our polynomials (plus a wrapping effect, discussed later). Similarly, since integers can be expressed as polynomials ( at ) we can multiply integers by convolving their digits.
 Probability: the PDF of the sum of two independent random variables is equal to the convolution of the two variables’ individual PDFs.
 Computer graphics: we can also convolve 2D signals, and many effects can be achieved simply by convolving an image with a certain filter. Examples: box blur to blur an image, a Lanczos filter to downsize an image, and the Canny edge detector.
One technicality is that in our definition of , the index can potentially be negative, which we get around by assuming that “wraps around,” so for example . This wrapping behavior is often undesired (for example with polynomial multiplication, it leads to the coefficients of and being added together) but we can get rid of it by padding and with sufficient zeroes.
It’s easy to see that computing naively is an operation–the result has elements, and each element takes operations to compute. This is where the discrete Fourier transform (DFT) comes in. The DFT of returns an array , also of length . The DFT is defined such that the continuous function has the property that for ; it expresses as a weighted sum of sinusoidal functions (for reasons I’m not going to explain here, any can be uniquely expressed like this). Since the values in tell us how much to weight the different frequencies sinusoids, represents our signal in what’s called the frequency domain.
The DFT is useful because by the convolution theorem, convolving in the time domain is equivalent to pointwise multiplication in the frequency domain. Furthermore, the fast Fourier transform algorithm allows us to compute the DFT and inverse DFT in time. So, here’s an asymptotically superior way to compute :
 Use FFT to compute and . Time:
 By the convolution theorem, so we pointwise multiply our results from step 1. Time:
 Use FFT to apply the inverse DFT to our result from the previous step. Time:
Which gives us overall. This article (and the next part of it) will look at applying this technique to large integer multiplication. Computers have builtin instructions to multiply integers, but their inputs are limited by machine word size, so generally inputs must be below . To multiply larger numbers (for example RSA commonly uses 2048bit integers), we need to implement multiplication ourselves in software. FFTbased multiplication has high overhead but the best known asymptotic complexity, so it’s used to multiply very large integers (at least tens of thousands of bits).
Floating point problems
For a length signal ,
where . This means the DFT inherently involves floatingpoint arithmetic, since ; trig implies floating point. This is very bad because we’re trying to implement exact integer arithmetic, but floatingpoint numbers become inexact in hardtoanalyze ways. Integers also have finite precision, but they’re much easier to analyze; an unsigned bit integer can represent anything in . Floatingpoint numbers are stored by computers in binary scientific notation; they’re expressed as where is the sign bit, is a binary decimal between 0 and 1, and is the exponent. The use of scientific notation makes it very hard to reason about which numbers are representable exactly, and which aren’t. For example, and can be stored exactly, but cannot.
Order of operations also matters much more with floating point numbers. Even if the inputs and the output can be stored exactly, certain operation orderings will make intermediate results, and therefore the final result, inexact. Here’s a simple example:


We are computing with doubles in two ways; you can see the output of the code here. Note that even though r1
and r2
are doing the same math, only r2
gets the final answer right. This is because even though is exactly representable, r1
's intermediates , , and are not. Admittedly, integers have a flavor of this problem too; won’t overflow a 32bit integer but will. Still, integers are far more easy to reason about. What’s easier to determine:
 (64bit integers) are all intermediate values in ?
 (Doubles) If we take all intermediate values and view them as binary decimals, are we sure that there are at most 51 bits between mostsignificant bit and least significant bit, exclusive?
This is why analyzing the floatingpoint error from FFT results in awfullooking bounds like this (taken from Modern Computer Algebra, equation 3.2):
where:
 (the true result)
 is the result computed with floating point
 are of length
 for whatever reason and when using doubles
So if we keep the righthand side of the inequality below 0.5 and round to the nearest integer, things should theoretically work out fine. I have no idea how this bound was derived, but there are probably a lot of pitfalls to watch out for:
 The bound probably assumes certain floatingpoint behaviors like those specified in IEEE754. Certain processors, especially lowerpower ones, often aren’t fully IEEEcompliant.
 The naive, DFT formula is often used as a base case for FFT because the naive method tends to be faster for small . However, the naive method has worse numerical precision characteristics, so the inequality must be modified if this optimization is chosen.
 This bound probably assumes a certain order of operations which the implementor must be careful to follow, and the compiler must preserve.
Pitfalls like this probably make it a nightmare to maintain an integer multiplication algorithm based on floatingpoint FFT across different platforms and compilers. This is why GMP, the big integer library used by Mathematica and Maple, doesn’t use floatingpoint FFT. It instead uses the number theoretic transform (NTT), which is essentially an FFT over modular arithmetic instead of complex roots of unity. The second part of this post will go over NTT in detail.
It is worth noting though that in many applications, floatingpoint error is fairly acceptable, because the original input signal wasn’t even exact to begin with (ex. FFT on microphone input). FFT is also generally simpler to implement than NTT, and FFT’s precision problems don’t manifest themselves until inputs get fairly large, so FFT is sometimes still used in competitive programming even when exact integer output is desired.