377 Commits

Author SHA1 Message Date
Mark Borgerding
bd23fe8d23 the path I was taking would only work for prime numbers (Galois fields) 2003-11-08 01:42:15 +00:00
Mark Borgerding
e98f9ff29a going to bed 2003-11-07 03:42:14 +00:00
Mark Borgerding
ae305ca400 slightly faster 2003-11-07 03:31:59 +00:00
Mark Borgerding
3a5791f203 slightly faster 2003-11-07 03:23:19 +00:00
Mark Borgerding
1486d89291 faster 2003-11-07 03:04:52 +00:00
Mark Borgerding
e9095a161c generic butterfly slightly slower -- hmmm 2003-11-07 02:39:49 +00:00
Mark Borgerding
a296b09dbf wrong alloc size 2003-11-07 01:06:44 +00:00
Mark Borgerding
cb5312efdc 2d fft seems to work 2003-11-06 03:59:31 +00:00
Mark Borgerding
4c458be5e9 checkpoint -- I don't think I've broken anything (yet) adding 2d fft. 2003-11-04 23:25:49 +00:00
Mark Borgerding
ee3094a0e4 benchmark utilities 2003-11-04 02:11:00 +00:00
Mark Borgerding
4ebf0b5aca added a CHANGELOG 2003-11-04 02:09:53 +00:00
Mark Borgerding
2788fba0bd added a CHANGELOG 2003-11-04 02:09:48 +00:00
Mark Borgerding
8b4e3bacca minor comments and added some primes 2003-11-04 02:00:01 +00:00
Mark Borgerding
6c8049cc75 slight changes to Makefile 2003-11-04 01:01:37 +00:00
Mark Borgerding
7b4de0aa11 a little faster 2003-11-03 04:30:50 +00:00
Mark Borgerding
ad4ee571aa faster radix5 2003-11-03 04:04:01 +00:00
Mark Borgerding
0403fb3e4a radix 5 a little optimized 2003-11-03 03:48:34 +00:00
Mark Borgerding
3c0c0431e2 radix 5 works, but is 6x slower than fftw 2003-11-03 03:03:16 +00:00
Mark Borgerding
85764e6437 radix 5 doesn't work, but I think it should.
just a checkpoint commit
2003-11-01 16:48:33 +00:00
Mark Borgerding
8ac63adc77 modified time benchmark to repeat same buffer over and over to avoid IO bottlenecks and get more consistent numbers. 2003-11-01 04:44:50 +00:00
Mark Borgerding
471803ca08 removed unused macro 2003-11-01 04:26:02 +00:00
Mark Borgerding
7b7aefe7c4 moved scratch buffer to stack variable 2003-11-01 03:59:43 +00:00
Mark Borgerding
28551899e2 radix 4 faster 2003-11-01 03:49:53 +00:00
Mark Borgerding
d1df249536 radix3 fixed point now works 2003-10-31 04:01:09 +00:00
Mark Borgerding
b1969544a6 radix 3 still doesn't work for fixed 2003-10-30 03:00:49 +00:00
Mark Borgerding
d4f87befda re-added radix 3 butterfly 2003-10-30 02:02:29 +00:00
Mark Borgerding
ca4c74e07c Woops, one should not test with input of all zeros 2003-10-29 04:29:01 +00:00
Mark Borgerding
97b18f3fef comments 2003-10-27 04:02:11 +00:00
Mark Borgerding
d9fcda04b6 version 0.2 upload to sf 2003-10-26 19:29:36 +00:00
Mark Borgerding
ecb1a76974 added zip creation to tarball make target 2003-10-26 04:25:18 +00:00
Mark Borgerding
1db3d91ee5 getting ready for next release 2003-10-26 04:07:32 +00:00
Mark Borgerding
52b4b9ab5c *** empty log message *** 2003-10-18 01:45:26 +00:00
Mark Borgerding
c239ba2c1c slight code cleanup, comments 2003-10-18 01:39:36 +00:00
Mark Borgerding
bca7fd5151 compiles with -ansi -pedantic 2003-10-18 01:23:34 +00:00
Mark Borgerding
e2470b3a03 *** empty log message *** 2003-10-18 00:33:38 +00:00
Mark Borgerding
a3d3217ae6 *** empty log message *** 2003-10-18 00:32:54 +00:00
Mark Borgerding
6f8bcedc24 radix 3 fixed point still broken 2003-10-17 02:59:32 +00:00
Mark Borgerding
31d4214f44 radix 3 seems to be pretty fast
fixed point broken for some reason
2003-10-17 02:34:22 +00:00
Mark Borgerding
73744b908c check point
fixed does not currently work for radix 3
2003-10-17 01:26:14 +00:00
Mark Borgerding
317f11e66e starting point for radix 3
'make test' output
### testing SNR for  2187 point FFTs
#### DOUBLE
snr_t2f = 292.51
snr_f2t = 304.97
#### FLOAT
snr_t2f = 143.46
snr_f2t = 138.03
#### SHORT
snr_t2f = 49.257
snr_f2t = 16.294

#### timing 10000 x 2187 point FFTs
#### DOUBLE
Elapsed:0:05.05 user:3.60 sys:0.54
#### FLOAT
Elapsed:0:02.41 user:1.85 sys:0.23
#### SHORT
Elapsed:0:04.02 user:3.13 sys:0.08
2003-10-17 00:11:19 +00:00
Mark Borgerding
d6ae498630 took the bitwise and out of the switch case -- may have prevented optimization 2003-10-15 03:45:24 +00:00
Mark Borgerding
5f0efe8f17 pretty happy with radix 2 and radix 4
next up is radix 3, or maybe 5
2003-10-15 03:38:05 +00:00
Mark Borgerding
9504aa79c1 Fixed generic mixed radix butterfly
'make test' output:
### testing SNR for  1024 point FFTs
#### DOUBLE
snr_t2f = 296.95
snr_f2t = 317.25
#### FLOAT
snr_t2f = 147.96
snr_f2t = 145.14
#### SHORT
snr_t2f = 52.414
snr_f2t = 22.438

#### timing 10000 x 1024 point FFTs
#### DOUBLE
Elapsed:0:03.56 user:2.63 sys:0.19
#### FLOAT
Elapsed:0:01.35 user:1.07 sys:0.10
#### SHORT
Elapsed:0:01.70 user:1.37 sys:0.06
2003-10-15 02:52:34 +00:00
Mark Borgerding
0424734e9d radix 4 now about as fast as original version
'make test' output:
### testing SNR for  1024 point FFTs
#### DOUBLE
snr_t2f = 296.78
snr_f2t = 317.11
#### FLOAT
snr_t2f = 145.28
snr_f2t = 143.51
#### SHORT
snr_t2f = 52.409
snr_f2t = 22.174

#### timing 10000 x 1024 point FFTs
#### DOUBLE
Elapsed:0:03.43 user:2.68 sys:0.25
#### FLOAT
Elapsed:0:01.39 user:1.08 sys:0.11
#### SHORT
Elapsed:0:02.01 user:1.39 sys:0.09
2003-10-15 01:52:13 +00:00
Mark Borgerding
f609401471 about to make some changes -- just wanted a checkpoint 2003-10-15 00:05:50 +00:00
Mark Borgerding
2ae7e0f1f2 radix 4 works but slow 2003-10-14 02:47:25 +00:00
Mark Borgerding
6b76490456 Fixed point works
'make test' output:

### testing SNR for  1024 point FFTs
#### DOUBLE
snr_t2f = 296.51
snr_f2t = 315.25
#### FLOAT
snr_t2f = 146.39
snr_f2t = 142.86
#### SHORT
snr_t2f = 58.077
snr_f2t = 27.897

#### timing 10000 x 1024 point FFTs
#### DOUBLE
Elapsed:0:01.53 user:1.06 sys:0.26
#### FLOAT
Elapsed:0:01.29 user:0.98 sys:0.12
#### SHORT
Elapsed:0:02.08 user:1.65 sys:0.03
2003-10-14 01:09:33 +00:00
Mark Borgerding
8460f1f8f5 added optimization for radix 2
### testing SNR for  1024 point FFTs
#### DOUBLE
snr_t2f = 296.29
snr_f2t = 314.48
#### FLOAT
snr_t2f = 146.48
snr_f2t = 143.03
#### SHORT
snr_t2f = -30.269
snr_f2t = -60.442

#### timing 10000 x 1024 point FFTs
#### DOUBLE
Elapsed:0:02.77 user:2.22 sys:0.13
#### FLOAT
Elapsed:0:01.65 user:1.35 sys:0.07
#### SHORT
Elapsed:0:02.44 user:2.00 sys:0.06
2003-10-14 00:38:58 +00:00
Mark Borgerding
0d6d61cfce reduced calling parameters
negligible performance impact
2003-10-11 23:07:16 +00:00
Mark Borgerding
0d44569b3b made one single malloc for all buffers
no noticeable performance gain
2003-10-11 23:00:12 +00:00