
Arduino FFT Library


About the Arduino FFT Library

The Arduino FFT library is a fast implementation of a standard FFT algorithm that operates only on real data. It can give you up to 256 frequency bins at 16b depth, with an update period as short as ~7ms. It is adjustable from 16 to 256 bins, and has several output methods to suit various needs. It can be set to 16b linear, 8b linear, 8b logarithmic, or 8b octave output. All of these modes are detailed in the read_me file (inside the FFT library folder). Since it takes in 16b fixed point numbers, it has a noise floor of around -72dB in the low frequencies, and -76dB in the high frequencies. When using the onboard ADC, the ADC's noise floor is on the same order as the FFT's noise floor, giving somewhere between a 9b and 10b SNR (-55dB).

Speed characteristics

Function | run  | reorder | window | lin  | lin8  | log
N        | (ms) | (us)    | (us)   | (us) | (us)* | (us)
256      | 6.32 | 412     | 608    | 588  | 470   | 608
128      | 2.59 | 193     | 304    | 286  | 234   | 290
64       | 1.02 | 97      | 152    | 145  | 114   | 144
32       | 0.37 | 41      | 76     | 80   | 59    | 74
16       | 0.12 | 21      | 37     | 46   | 30    | 39

* Note: the lin8 values are approximate, as they vary a small amount depending on the SCALE factor. See the #define section of the read_me for more details.

Memory characteristics

Function | run     | reorder | window | lin     | lin8    | log
N        | S/F(B)  | F(B)    | F(B)   | S/F(B)  | S/F(B)  | S/F(B)
256      | 1k/952  | 120     | 512    | 256/768 | 128/640 | 128/256
128      | 512/448 | 56      | 256    | 128/768 | 64/640  | 64/256
64       | 256/200 | 28      | 128    | 64/768  | 32/640  | 32/256
32       | 128/80  | 12      | 64     | 32/768  | 16/640  | 16/256
16       | 64/24   | 6       | 32     | 16/768  | 8/640   | 8/256

S = SRAM, F = Flash

How to Use the FFT Library

Files

Libraries: Arduino1.6.12 and higher (fixed PROGMEM for new avr-gcc, fixed clobber list)

Libraries: Arduino1.05 (fixed PROGMEM for new avr-gcc (thanks forum user kirill9617), added serial example)

Libraries: Arduino1.0 or Arduino-0022 (should work with both)

Installing Libraries

The above files need to be placed in the libraries folder inside of your Arduino sketch directory. After you unzip ArduinoFFT.zip, take the FFT folder and place it in your libraries folder, restart Arduino and load one of the example programs to test out the library.

If you are not certain where the libraries folder is located on your computer, try the following:

PC

Open up the Arduino software, and go to Sketch -> Add File..., and a window will pop up that shows your sketch folder. This is usually C:\Documents and Settings\<your user name>\My Documents\Arduino. If you see a libraries folder, put the FFT folder in there. If you don't already have one, create the libraries folder in that directory.

Mac

Open up your Arduino sketchbook folder. This is typically /Users/<your user name>/Documents/Arduino, or /Users/<your user name>/Documents/Maple if you are using Maple. If there is not a folder already named libraries, you should create one and place the unzipped FFT folder within it.

Implementation Details

For those of you who want to look under the hood, let me give you a guided tour. The speed improvements in this particular implementation are due to two things. First, in any FFT, you must multiply the input variables by fixed cosine and sine constants. This is what consumes the most time on the ATmega, as 16b x 16b multiplies take around 18 clock cycles. On the other hand, 16b + 16b adds only take 2 clock cycles. So, it's better to add than it is to multiply. As it turns out, a lot of those sine and cosine constants used in the FFT are just 0 or 1, so you don't have to multiply, and can just add. For example, in a 256 point FFT, there are 1024 complex multiplies to be done, of which 382 are either 0 or 1. That's more than a third of them!

The ArduinoFFT checks for those 0 or 1 conditions, and simply does adds instead. As it turns out, those constants occur at regular intervals, and can be easily checked for. The benefits of this sort of approach are limited for larger FFTs. The total savings is (1.5*N - 2) multiplies for an N sized FFT, whereas the total number of multiplies is (N/2)*log2(N). This gives a savings ratio of roughly 3/log2(N), which drops as N increases.
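To see where the 382 figure comes from, here is a small counting sketch in C. It is not taken from the library. It assumes a plain radix-2 FFT in which the stage with group size m uses twiddle angles 2*pi*k/m for k = 0 .. m/2 - 1, and it counts a multiply as avoidable whenever that angle is a multiple of 90 degrees (so the sine and cosine are 0 or +/-1). The function names are made up for illustration.

```c
#include <stdio.h>

/* Total complex multiplies in a radix-2 FFT of size n: (n/2) * log2(n). */
unsigned total_multiplies(unsigned n) {
    unsigned stages = 0;
    for (unsigned m = n; m > 1; m >>= 1) {
        stages++;
    }
    return (n / 2) * stages;
}

/* Multiplies whose twiddle factor has only 0 or +/-1 as sine and cosine. */
unsigned trivial_multiplies(unsigned n) {
    unsigned count = 0;
    for (unsigned m = 2; m <= n; m <<= 1) {      /* m = group size of this stage */
        unsigned groups = n / m;                 /* each twiddle is reused once per group */
        for (unsigned k = 0; k < m / 2; k++) {
            /* angle 2*pi*k/m is a multiple of pi/2 exactly when 4k is a multiple of m */
            if ((4 * k) % m == 0) {
                count += groups;
            }
        }
    }
    return count;
}

/* Print the counts for one FFT size. */
void print_savings(unsigned n) {
    printf("N=%u: %u of %u multiplies are trivial\n",
           n, trivial_multiplies(n), total_multiplies(n));
}
```

Calling print_savings(256) reports 382 of 1024, which agrees with 1.5*N - 2 and (N/2)*log2(N) above.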

The second set of time savings in this implementation comes from using lookup tables to calculate the square roots of the magnitudes. The difficulty in this method is that the input mapping to the lookup table is much, much larger than the actual contents of the lookup table itself. So, to not waste memory space, a compression of the input values must be done. For example, taking the square root of a 16b value has 64k input values which must map down to 256 output values. To have an answer hard coded into memory space for all of those inputs is impossible on the Arduino (and a waste in general). So instead, a linear interpolation of the input space is used, with different slopes for different sections. For 8b linear output, this can be done with no loss of precision with either 3 or 4 linear sections. This means that the input value can be evaluated for which section it lies in, and then the square root fetched, in around 12 clock cycles. This is much less than the usual 150 clock cycles that a standard square root library would require.
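The idea of compressing the input before the table lookup can be sketched in C as follows. This is only an illustration and not the library's actual table: the four section boundaries (64, 1024, 16384), the step sizes, and all names are assumptions chosen so the arithmetic is easy to check. With these choices the result is either the exact integer square root or one count below it, so it does not reproduce the "no loss of precision" property described above. On a real Arduino the table would be precomputed and stored in flash rather than filled at run time.

```c
#include <stdint.h>

#define SQRT_TABLE_SIZE 556              /* 64 + 60 + 240 + 192 entries */
static uint8_t sqrt_table[SQRT_TABLE_SIZE];

/* Exact floor(sqrt(x)) for a 16b input, built one result bit at a time. */
uint8_t isqrt16(uint16_t x) {
    uint8_t root = 0;
    for (uint8_t bit = 0x80; bit != 0; bit >>= 1) {
        uint8_t trial = root | bit;
        if ((uint16_t)trial * trial <= x) {
            root = trial;
        }
    }
    return root;
}

/* Compress a 16b input to a table index: four linear sections,
   with coarser steps as x grows (the square root changes more slowly there). */
uint16_t sqrt_index(uint16_t x) {
    if (x < 64u)    return x;                                       /* step 1   */
    if (x < 1024u)  return (uint16_t)(64u  + ((x - 64u) >> 4));     /* step 16  */
    if (x < 16384u) return (uint16_t)(124u + ((x - 1024u) >> 6));   /* step 64  */
    return (uint16_t)(364u + ((x - 16384u) >> 8));                  /* step 256 */
}

/* Fill each entry with the exact root of the first input that maps to it. */
void sqrt_table_init(void) {
    uint16_t prev = 0xFFFFu;             /* no index has been written yet */
    for (uint32_t x = 0; x <= 0xFFFFu; x++) {
        uint16_t idx = sqrt_index((uint16_t)x);
        if (idx != prev) {
            sqrt_table[idx] = isqrt16((uint16_t)x);
            prev = idx;
        }
    }
}

/* The fast path: a few compares, one shift, one table read. */
uint8_t sqrt16_lookup(uint16_t x) {
    return sqrt_table[sqrt_index(x)];
}
```

After calling sqrt_table_init() once, sqrt16_lookup(x) replaces the square root call. The 556 byte table stands in for the 64k entries a direct lookup would need.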

The 32b input version is slightly more difficult, as the output mapping space is now 16b (64k), and the linear mapping technique cannot compress it any more than that. In this case, a hybrid approach is implemented where the input value is converted to a floating point value with 16b of precision plus 8b of exponent. This can be done very quickly in base 2, and then the above 16b square root lookup table method can be used. If the input compression is done in right shifts of 2, the output value can be reconstructed with left shifts of 1. Basically, the exponent is forced to be an even value upon creation, so the square root can return an integer value.
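The shift-by-2 trick can be sketched in C like this. It is a minimal illustration under stated assumptions, not the library's code: the 16b stage here is an exact bit-by-bit integer square root standing in for the lookup table, and the names isqrt16 and sqrt32_approx are hypothetical.

```c
#include <stdint.h>

/* Exact floor(sqrt(x)) for a 16b input; a stand-in for the 16b lookup table. */
uint8_t isqrt16(uint16_t x) {
    uint8_t root = 0;
    for (uint8_t bit = 0x80; bit != 0; bit >>= 1) {
        uint8_t trial = root | bit;
        if ((uint16_t)trial * trial <= x) {
            root = trial;
        }
    }
    return root;
}

/* Approximate square root of a 32b value.
   Each right shift by 2 divides the input by 4, which divides its root by 2,
   so every input shift is undone by a single left shift of the result. */
uint16_t sqrt32_approx(uint32_t x) {
    uint8_t shifts = 0;
    while (x > 0xFFFFu) {                /* reduce until the value fits in 16b */
        x >>= 2;
        shifts++;
    }
    return (uint16_t)((uint16_t)isqrt16((uint16_t)x) << shifts);
}
```

For example, sqrt32_approx(1000000) shifts the input twice to get 62500, takes the root 250, and shifts back to return 1000. Only the top 8 bits of the result carry information, which is why larger results lose accuracy in their low bits.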

This 32b version is not as precise as a true square root library, but it only takes around 40 clock cycles, compared to 500 for a true square root. This lookup table version only gives an accurate first 8b on the return value, but for the purposes of this FFT, that is good enough. The total bit depth of the FFT is not much past 12b since it is implemented in fixed point (each value must be divided by 2 before adding to prevent overflow, which gives an eventual divide by 256 for a 256 point FFT). The relative accuracy is a function of output value size. For a return value of 8b, it is as close as you can get. For a 9b value, its lsb might be wrong. For a 10b value, 2 lsbs might be wrong, and so on. So the worst case scenario is a 16b return value where you get +/-0.5% accuracy.
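The divide-by-2-before-adding rule can be illustrated with a short C sketch. This is an assumption about the general shape of a scaled fixed point add, not the library's routine, and the twiddle multiply of a real butterfly is left out. The names are hypothetical.

```c
#include <stdint.h>

/* Sum of two 16b values, each halved first so the result always fits in 16b. */
int16_t scaled_add(int16_t a, int16_t b) {
    return (int16_t)(a / 2 + b / 2);
}

/* Difference of two 16b values, with the same halving. */
int16_t scaled_sub(int16_t a, int16_t b) {
    return (int16_t)(a / 2 - b / 2);
}

/* Overall divisor after every stage of an n point FFT: one halving per stage,
   so log2(n) halvings in total (n is assumed to be a power of 2). */
uint16_t total_divisor(uint16_t n) {
    uint16_t divisor = 1;
    while (n > 1) {
        n >>= 1;
        divisor <<= 1;
    }
    return divisor;
}
```

Even with both inputs at full scale the sum stays within 16b, and total_divisor(256) returns 256, the overall divide mentioned above. Each halving discards one low bit, which is where the usable bit depth goes.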

References

If you are interested in learning more about the FFT, here are some good resources that I used in writing my code.