fftea 1.0.0+3 fftea: ^1.0.0+3 copied to clipboard
A simple Fast Fourier Transform (FFT) library. Includes a very efficient FFT, plus related tools for frequency analysis.
fftea #
A simple and efficient Fast Fourier Transform (FFT) implementation.
FFT converts a time domain signal to the frequency domain, and back again. This is useful for all sorts of applications:
- Filtering or synthesizing audio
- Compression algorithms such as JPEG and MP3
- Computing a spectrogram (most AI applications that analyze audio use spectrograms)
- Convolutions, such as reverb filters for audio, or blurring filters for images
This library supports FFT of power-of-two sized arrays of real or complex numbers, using the Cooley-Tukey algorithm. It also includes some related utilities, such as windowing functions, STFT, and inverse FFT.
Usage #
Running a basic real-valued FFT:
// myData.length must be a power of two. Eg 2, 4, 8, 16, ...
List<double> myData = ...;
final fft = FFT(myData.length);
final freq = fft.realFft(myData);
freq
is a Float64x2List
representing a list of complex numbers. See
ComplexArray
for helpful extension methods on Float64x2List
.
For efficiency, avoid recreating the FFT
object each time. Instead, create one
FFT
object of whatever size you need, and reuse it.
Running an STFT to calculate a spectrogram:
// audio.length *doesn't* need to be a power of two. It can be any legnth.
List<double> audio = ...;
final chunkSize = 1024; // Must be a power of two.
final stft = STFT(chunkSize, Window.hanning(chunkSize));
final spectrogram = <Float64List>[];
stft.run(audio, (Float64x2List freq) {
spectrogram.add(freq.discardConjugates().magnitudes());
});
The result of a real valued FFT is about half redundant data, so
discardConjugates
removes that data from the result (a common practice for
spectrograms):
[sum term, ...terms..., nyquist term, ...conjugate terms...]
^----- These terms are kept ------^ ^- Discarded -^
Then magnitudes
discards the phase data of the complex numbers and just keeps
the amplitudes, which is usually what you want for a spectrogram.
If you want to know the frequency of one of the elements of the spectrogram, use
stft.frequency(index, samplingFrequency)
, where samplingFrequency
is the
frequency that audio
was recorded at, eg 44100. The maximum frequency (aka
nyquist frequency) of the spectrogram will be
stft.frequency(chunkSize / 2, samplingFrequency)
.
See the example for more detailed usage.
Benchmarks #
This package was built because package:fft is not actively maintained anymore, and wasn't a particularly efficient implementation. There are a few improvements that make this implementation more efficient:
- The main FFT class is constructed with a given size, so that the twiddle factors only have to be calculated once. This is particularly handy for STFT.
- Doesn't use a wrapper class for complex numbers, just uses a Float64x2List. Every little wrapper class is a seperate allocation and dereference. For inner loop code, like FFT's complex number math, this makes a big difference. Float64x2 can also take advantage of SIMD optimisations.
- FFT algorithm is in-place, so no additional arrays are allocated.
- Loop based FFT, rather than recursive.
- Using trigonometric tricks to only calculate a quarter of the twiddle factors.
I found some other promising FFT implementations, so I decided to benchmark them too: scidart, and smart_signal_processing.
To run the benchmarks:
- Go to the bench directory and pub get,
cd bench && dart pub get
- Run
dart run bench.dart
fftea gains some of its speed by caching the twiddle factors between runs, and by doing the FFT in-place. So the benchmarks are broken out into cached and in-place variants. Caching and running in-place is also applied to some of the other libraries, where appropriate.
In the table below, "cached" means the construction of the FFT object is not included in the benchmark time. And "in-place" means using the in-place FFT, and the conversion and copying of the input data into whatever format the FFT wants is not included in the benchmark. Run in Ubuntu on a Dell XPS 13.
Size | package:fft | smart | smart, in-place | scidart | scidart, in-place* | fftea | fftea, cached | fftea, in-place, cached |
---|---|---|---|---|---|---|---|---|
16 | 410.6 us | 28.3 us | 8.3 us | 347.2 us | 164.0 us | 42.2 us | 11.6 us | 10.9 us |
64 | 416.4 us | 8.2 us | 2.7 us | 169.0 us | 194.0 us | 73.5 us | 41.6 us | 41.8 us |
256 | 470.6 us | 14.0 us | 11.0 us | 904.9 us | 812.8 us | 34.1 us | 8.4 us | 6.1 us |
2^10 | 1.54 ms | 47.2 us | 44.1 us | 8.28 ms | 8.26 ms | 36.3 us | 30.5 us | 27.3 us |
2^12 | 7.65 ms | 206.1 us | 197.3 us | 132.99 ms | 121.27 ms | 159.2 us | 143.9 us | 129.3 us |
2^14 | 39.83 ms | 940.1 us | 899.4 us | 1.96 s | 1.96 s | 750.1 us | 695.5 us | 590.2 us |
2^16 | 261.38 ms | 6.04 ms | 5.62 ms | 41.44 s | 41.60 s | 4.71 ms | 5.89 ms | 3.56 ms |
2^18 | 1.31 s | 27.65 ms | 27.84 ms | Skipped | Skipped | 20.80 ms | 24.92 ms | 15.66 ms |
2^20 | 7.29 s | 168.84 ms | 151.12 ms | Skipped | Skipped | 119.93 ms | 106.00 ms | 88.26 ms |
In practice, you usually know how big your FFT is ahead of time, so it's pretty easy to construct your FFT object once, to take advantage of the caching. It's sometimes possible to take advantage of the in-place speed up too, for example if you have to copy your data from another source anyway you may as well construct the flat complex array yourself. Since this isn't always possible, the "fftea, cached" times are probably the most representative. In that case, fftea is about 60-80x faster than package:fft, and about 30% faster than smart_signal_processing. Not sure what's going on with scidart, but it seems to be O(n^2).
* Scidart's FFT doesn't have an in-place mode, but they do use a custom format, so in-place means that the time to convert to that format is not included in the benchmark.