*** empty log message ***

Mark Borgerding 2004-01-31 16:26:42 +00:00
parent 2921d93dac
commit 75393dc4fa
2 changed files with 9 additions and 7 deletions

12
README

@@ -51,12 +51,12 @@ theory straight before working on fixed point issues. In the end, I had a
 little bit of code that could be recompiled easily to do ffts with short, float
 or double (other types should be easy too).
-Once I got my FFT working, I wanted to get some performance numbers against
+Once I got my FFT working, I was curious about the speed compared to
 a well respected and highly optimized fft library. I don't want to criticize
 this great library, so let's call it FFT_BRANDX.
 During this process, I learned:
-1. FFT_BRANDX has more than 100K lines of code. KISS has less than 1k.
+1. FFT_BRANDX has more than 100K lines of code. The core of kiss_fft is about 500 lines (cpx 1-d).
 2. It took me an embarrassingly long time to get FFT_BRANDX working.
 3. A simple program using FFT_BRANDX is 522KB. A similar program using kiss_fft is 18KB.
 4. FFT_BRANDX is roughly twice as fast as KISS FFT.
@@ -70,9 +70,11 @@ last bit of performance.
 PERFORMANCE:
 (on Athlon XP 2100+, with gcc 2.96, optimization O3, float data type)
-Kiss performed 1000 1024-pt ffts in 100 ms of cpu time.
+Kiss performed 1000 1024-pt cpx ffts in 100 ms of cpu time.
 For comparison, it took md5sum 160ms cputime to process the same amount of data
+Transforming 5 minutes of CD quality audio takes about 1 second (nfft=1024).
 DO NOT:
 ... use Kiss if you need the Fastest Fourier Transform in the World
 ... ask me to add features that will bloat the code
@@ -90,10 +92,10 @@ No scaling is done. Optimized butterflies are used for factors 2,3,4, and 5.
 LICENSE:
-BSD, see COPYING for details. Basically, "free to use, give credit where due, no guarantees"
+BSD, see COPYING for details. Basically, "free to use&change, give credit where due, no guarantees"
 TODO:
-*) Add real optimization for odd length FFTs (DST)
+*) Add real optimization for odd length FFTs (DST?)
 *) Add real optimization to the n-dimensional FFT
 *) Add simple windowing function, e.g. Hamming : w(i)=.54-.46*cos(2pi*i/(n-1))
 *) Make the fixed point scaling and bit shifts more easily configurable.


@@ -15,9 +15,9 @@ endif
 all: $(FFTUTIL) $(FASTFILT) $(FASTFILTREAL)
-CFLAGS=-Wall -O3 -pedantic -march=pentiumpro -ffast-math -fomit-frame-pointer
+#CFLAGS=-Wall -O3 -pedantic -march=pentiumpro -ffast-math -fomit-frame-pointer
 # If the above flags do not work, try the following
-#CFLAGS=-Wall -O3
+CFLAGS=-Wall -O3
 $(FASTFILTREAL): ../kiss_fft.c kiss_fastfir.c kiss_fftr.c
 	$(CC) -o $@ $(CFLAGS) -I.. $(TYPEFLAGS) -DREAL_FASTFIR -lm $+ -DFAST_FILT_UTIL