made threadsafe

Mark Borgerding
2010-05-27 22:54:01 -04:00
parent 583019e074
commit 57925fd126
9 changed files with 70 additions and 68 deletions

22
README

@@ -36,8 +36,8 @@ Code definitions for 1d complex FFTs are in kiss_fft.c.
You can do other cool stuff with the extras you'll find in tools/
* multi-dimensional FFTs
-* real-optimized FFTs
-* fast convolution FIR filtering
+* real-optimized FFTs (returns the positive half-spectrum: (nfft/2+1) complex frequency bins)
+* fast convolution FIR filtering (not available for fixed point)
* spectrum image creation
The core fft and most tools/ code can be compiled to use float, double
@@ -59,7 +59,7 @@ During this process, I learned:
1. FFT_BRANDX has more than 100K lines of code. The core of kiss_fft is about 500 lines (cpx 1-d).
2. It took me an embarrassingly long time to get FFT_BRANDX working.
-3. A simple program using FFT_BRANDX is 522KB. A similar program using kiss_fft is 18KB.
+3. A simple program using FFT_BRANDX is 522KB. A similar program using kiss_fft is 18KB (without optimizing for size).
4. FFT_BRANDX is roughly twice as fast as KISS FFT in default mode.
It is wonderful that free, highly optimized libraries like FFT_BRANDX exist.
@@ -78,6 +78,11 @@ FREQUENTLY ASKED QUESTIONS:
2) mixed build environment -- all code must be compiled with same preprocessor
definitions for FIXED_POINT and kiss_fft_scalar
+Q: Will you write/debug my code for me?
+A: Probably not unless you pay me. I am happy to answer pointed and topical questions, but
+I may refer you to a book, a forum, or some other resource.
PERFORMANCE:
(on Athlon XP 2100+, with gcc 2.96, float data type)
@@ -92,7 +97,10 @@ DO NOT:
UNDER THE HOOD:
-Kiss FFT uses a time decimation, mixed-radix, out-of-place FFT.
+Kiss FFT uses a time decimation, mixed-radix, out-of-place FFT. If you give it an input buffer
+and output buffer that are the same, a temporary buffer will be created to hold the data.
+No static data is used. The core routines of kiss_fft are thread-safe (but not all of the tools directory).
No scaling is done for the floating point version (for speed).
Scaling is done both ways for the fixed-point version (for overflow prevention).
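
Note on the in-place case described in this hunk: the README says only that a temporary buffer is created when the input and output buffers are the same. The Python sketch below illustrates that idea and is not KISS FFT's C code. The names `dft_into` and `transform` are hypothetical, and a naive DFT stands in for the real mixed-radix FFT.

```python
import cmath


def dft_into(fin, fout):
    """Naive out-of-place forward DFT: every output bin reads all of fin."""
    n = len(fin)
    for k in range(n):
        fout[k] = sum(fin[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                      for t in range(n))


def transform(fin, fout):
    """Out-of-place transform that tolerates fin and fout being the same list."""
    if fin is fout:
        # Writing bin k into the input would corrupt the data still needed
        # for bin k+1, so compute into a scratch list and copy back.
        tmp = [0j] * len(fin)
        dft_into(fin, tmp)
        fout[:] = tmp
    else:
        dft_into(fin, fout)
```

In this sketch the scratch list is local to each call, so no state is shared between calls. How the library itself allocates and releases its temporary buffer is not stated in the README text above.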
@@ -100,7 +108,8 @@ UNDER THE HOOD:
Optimized butterflies are used for factors 2,3,4, and 5.
The real (i.e. not complex) optimization code only works for even length ffts. It does two half-length
-FFTs in parallel (packed into real&imag), and then combines them via twiddling.
+FFTs in parallel (packed into real&imag), and then combines them via twiddling. The result is
+nfft/2+1 complex frequency bins from DC to Nyquist. If you don't know what this means, search the web.
The fast convolution filtering uses the overlap-scrap method, slightly
modified to put the scrap at the tail.
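
Note on the real-input optimization described in this hunk: the packing trick can be sketched in a few lines of Python. This is the textbook construction written under stated assumptions, not the code in tools/. The function names are hypothetical, and a naive DFT stands in for the half-length complex FFT. Even-indexed samples go into the real part and odd-indexed samples into the imaginary part, one half-length complex transform is taken, and the two real spectra are separated and combined with twiddle factors.

```python
import cmath


def dft(seq):
    """Naive O(n^2) forward DFT, a stand-in for a complex FFT."""
    n = len(seq)
    return [sum(seq[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def real_fft_half_spectrum(x):
    """Return the nfft/2+1 bins (DC to Nyquist) of a real, even-length input."""
    nfft = len(x)
    if nfft < 2 or nfft % 2:
        raise ValueError("input length must be even and at least 2")
    m = nfft // 2
    # Pack two real half-length sequences into one complex sequence.
    z = [complex(x[2 * n], x[2 * n + 1]) for n in range(m)]
    zf = dft(z)
    out = []
    for k in range(m + 1):
        zk = zf[k % m]
        zmk = zf[(m - k) % m].conjugate()
        even = (zk + zmk) / 2    # spectrum of the even-indexed samples
        odd = (zk - zmk) / 2j    # spectrum of the odd-indexed samples
        twiddle = cmath.exp(-2j * cmath.pi * k / nfft)
        out.append(even + twiddle * odd)
    return out
```

For an 8-sample real input this returns 5 bins, which agree with the first 5 bins of a full 8-point complex DFT of the same data. The remaining bins of the full transform are complex conjugates of these, which is why they are not returned.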
@@ -111,6 +120,9 @@ LICENSE:
Note this license is compatible with GPL at one end of the spectrum and closed, commercial software at
the other end. See http://www.fsf.org/licensing/licenses
+A commercial license is available which removes the requirement for attribution. Contact me for details.
TODO:
*) Add real optimization for odd length FFTs
*) Document/revisit the input/output fft scaling