This gets brought up quite often here but something people don't talk about is why Fourier needed to do this. Historical context is really fun! In the late 1800s, Fourier wanted to mathematically describe how heat diffuses through solids, aiming to predict how temperature changes over time, such as in a heated metal rod. Observing that temperature variations evolve smoothly, he drew inspiration from the vibrating string problem studied by Euler and D'Alembert, where any complex motion could be expressed as a sum of simple sine waves. Fourier hypothesized that heat distribution might follow a similar principle: that any initial temperature pattern could be decomposed into basic sinusoidal modes, each evolving independently as heat diffused.
It seems the history is even more interesting. Supposedly, Fourier wanted to figure out how to keep wine cool, as a Frenchman is wont to do. See section 3 of https://www.tandfonline.com/doi/epdf/10.1080/0020739X.2017.1...
Minor correction, Fourier made his breakthroughs in the early 1800s. He worked under the reign of Napoleon and continued in the decade thereafter.
And it is an important correction because Fourier was immediately able to use his technique to solve partial differential equations, but it took many decades before the rigorous foundations of measure theory and functional analysis showed why it all works.
I generally agree with the point of the article ("Fourier transform is not magical").
However saying it is "just" curve fitting with sinusoids fails to mention that, among an infinite number of basis functions, there are some with useful properties, and sinusoids are one such: they are eigenvectors of shift-invariant linear systems (and hence are also eigenvectors of derivative operators).
I really don't get the point the article is making. I think the whole discussion of curve fitting is really a distraction; they could have simply stated that the FFT has periodic boundary conditions, so if you take the FFT of something that only lasts for part of your sampling window, you will see delta functions in the frequency domain spaced by the inverse of the length of your sampling window, i.e. the FFT "sees" your finite window as a pulse train. That's well known and a fundamental aspect of Fourier transforms.
But then there are the statements about the discontinuous "vibrations". E.g. in the case of the 1 Hz cycle over half the window, the author states:
> Yet the FFT of this data is also very complex. Again, there are many harmonics with energy. They indicate that the signal contains vibrations at 0.5Hz, 1.0Hz, 1.5Hz, etc. But the time signal clearly shows that the 'vibration' was only at 1Hz, and only for the first second.
The implication that there is a vibration only at 1 Hz is plain wrong. To have a vibration abruptly stop, you need many frequencies (in general, the shorter a feature is in the time domain, the more frequency components you need in the frequency domain). If we compare, for example, a sine wave with a square wave at the same frequency, the square wave will have many more frequency components in the Fourier domain (in fact, its spectrum is a sinc envelope over delta functions spaced at the frequency of the wave). That's essentially what is done in the example: the sine wave is multiplied by a square wave at half its frequency (similar things apply to the other examples). Saying only the fundamental frequency matters is just wrong.
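To see this numerically, here is a rough sketch of the scenario as I understand it (the sample rate and record length are my own choices): a 1 Hz sine that lasts only for the first second of a 2 s record. The FFT bins sit at multiples of 1/T = 0.5 Hz, and many of them carry energy, precisely because the "vibration" stops abruptly.

```python
import numpy as np

fs, T = 100.0, 2.0                       # sample rate (Hz) and record length (s) -- assumed values
t = np.arange(0, T, 1 / fs)
x = np.where(t < 1.0, np.sin(2 * np.pi * 1.0 * t), 0.0)   # 1 Hz sine, first second only

X = np.fft.rfft(x) / len(x)
freqs = np.fft.rfftfreq(len(x), 1 / fs)  # bin spacing is 1/T = 0.5 Hz

# Energy shows up at 0.5, 1.0, 1.5 Hz, ... not only at 1 Hz
for f, mag in zip(freqs[:8], np.abs(X[:8])):
    print(f"{f:4.1f} Hz  {mag:.4f}")
```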
This is also not just a "feature of the fitting to sines"; it's fundamental and has real-world implications. The reason we see ringing on an oscilloscope trace of a square-wave input, for example, is that the underlying analog system has finite bandwidth: the higher-frequency components (which are irrelevant according to the author) get cut off or attenuated, and without enough of them the full square wave can no longer be represented.
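And a sketch of the ringing argument (again a toy of my own, not the article's data): take a square wave, zero out everything above some cutoff as a crude stand-in for a finite-bandwidth front end, and the reconstructed trace overshoots near the transitions (Gibbs phenomenon), because the "irrelevant" higher harmonics were exactly what kept the top flat.

```python
import numpy as np

fs = 1000
t = np.arange(0, 1, 1 / fs)
square = np.sign(np.sin(2 * np.pi * 5 * t))   # 5 Hz square wave

X = np.fft.rfft(square)
freqs = np.fft.rfftfreq(len(square), 1 / fs)
X[freqs > 50] = 0                             # crude "finite bandwidth": drop everything above 50 Hz
band_limited = np.fft.irfft(X, n=len(square))

# The band-limited version overshoots near the edges instead of staying flat
print(square.max(), band_limited.max())       # 1.0 vs roughly 1.09
```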
> FFT has periodic boundary conditions
The FFT is simply an algorithm to efficiently compute the DFT. The fact that the article makes no mention of the DFT vs the Fourier series vs the DTFT is going to end up creating more confusion than it solves. For some reason introductory tutorials always start with the DFT (usually mistakenly using FFT and DFT interchangeably), even though to me the continuous Fourier transform is far easier to understand conceptually. Going from the continuous Fourier transform to the DTFT is just applying the FT to a Dirac-combed (sampled) function. Then from the DTFT to the DFT you introduce a periodic boundary condition. The Fourier series is just applying the FT to a function that happens to already be periodic, resulting in a discrete set of frequencies. (A quick numerical check of the first point is sketched below.)
There is a connection between the Fourier series and the DFT: if the Fourier series is computed for the periodic summation of a signal, and the DFT is then computed for the original signal (which implicitly applies a periodic boundary condition), the DFT is just the periodic summation of the Fourier series coefficients.
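On the "FFT is simply an algorithm to compute the DFT" point, a quick numerical check (the helper name is mine): a direct O(N^2) evaluation of the DFT definition gives the same numbers as np.fft.fft, just much more slowly.

```python
import numpy as np

def dft_naive(x):
    """Direct O(N^2) evaluation of the DFT definition."""
    N = len(x)
    n = np.arange(N)
    k = n.reshape(-1, 1)
    return np.exp(-2j * np.pi * k * n / N) @ x

x = np.random.randn(256)
print(np.allclose(dft_naive(x), np.fft.fft(x)))   # True: same transform, different algorithm
```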
I spent ages meditating on this image https://en.wikipedia.org/wiki/Discrete_Fourier_transform?#/m... before everything finally clicked. It's a shame that introductions never once mention the DTFT.
Completely agree; the transition between continuous and discrete domains is often glossed over, and people use the DFT, FT, DTFT and FFT almost interchangeably (I have certainly been guilty of that myself, and you are correct that the FFT and DFT are equivalent for this discussion).
An interesting fact (somewhat related to your mention of the DTFT) is that one can consider the DFT as a filter with a sinc transfer function. That's essentially how you can understand the spectrum of an OFDM signal. You perform a block-based FFT on your input bit/symbol stream, so you have waves at different carriers. However, because the stream is time-varying you essentially get sinc-shaped spectra spaced at the symbol rate (excluding cyclic prefixes etc.). So your OFDM spectrum is composed of many sincs spaced at fb, which gives a very square-ish overall spectrum, which is one of the reasons why OFDM is so advantageous.
Had a professor go through this and distinguish DTFT vs DFT, etc.
Sadly that wasn’t my linear systems class, which omitted this in both the lectures and textbook.
I made a video about a cool application of the Discrete Fourier Transform regarding color eink Kaleido 3 and manga:
https://youtu.be/Dw2HTJCGMhw?si=Qhgtz5i75v8LwTyi
Learning about Fourier is really interesting in image processing, I'm glad I found a good textbook explaining it.
The closest peripheral aspect is transforming integrals according to Fourier, which extracts one cycle from f(x) = sin(x). A synchronization may take place in the transient sine coefficients, where the "Mag" and L2(R) columns specify the polar coordinates of each component.
For example, the Fourier transform of Ly is F[L(2)y](k) = L(2)(k) ỹ(k), where ỹ(k) is the Fourier transform of y and L(2)(k) is the multiplier picked up by the operator.
This feels like a very indirect way of saying "yes, the Fourier transform of a signal is a breakdown of its component frequencies, but depending on the kind of signal you are trying to characterize, it might not be what you actually need."
It's not that unintuitive to imagine that if all of your signals are pulses, something like the wavelet transform might do a better job of giving you meaningful insights into a signal than the Fourier transform would.
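As a toy illustration (plain NumPy; a single-level Haar-style difference stands in for a proper wavelet transform, and the "pulse" is just a one-sample spike): the Fourier view smears the pulse's energy across essentially every frequency bin, while the wavelet-style coefficients pin down where in time it happened.

```python
import numpy as np

N = 256
x = np.zeros(N)
x[101] = 1.0                                   # a short, isolated transient

# Fourier view: energy is spread over every bin
X = np.abs(np.fft.rfft(x))
print(int((X > 0.5 * X.max()).sum()), "of", len(X), "bins carry energy")

# One level of Haar-style detail coefficients: pairwise differences localize the pulse
detail = (x[0::2] - x[1::2]) / np.sqrt(2)
print(np.nonzero(np.abs(detail) > 1e-12)[0])   # only the pair of samples containing the spike
```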
The thinking that sines are the basic building blocks and "own" frequencies is part of the problem. Fourier is a breakdown into frequencies of sine waves. Sines are fundamental in the physics of certain idealized conditions, but using sines is just a choice; mathematically you could just as well use other bases. A triangle wave has mathematically the same right to "own" a frequency as a sine.
Reality is often different from the ideal and not that linear, so basic waveforms often aren't really sines. But people usually only know sines, so they'll use this hammer on every nail. Some people in electrical engineering may know about rectangular waves, but there isn't, yet, enough deeper understanding out there for playing with the mathematical tools correctly.
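For what it's worth, here is a sketch of that point using the Walsh-Hadamard basis (square-wave-like, ±1-valued functions, via scipy.linalg.hadamard): it is just as orthogonal as the sine/cosine basis and reconstructs any signal exactly, with no sines involved.

```python
import numpy as np
from scipy.linalg import hadamard

N = 64
H = hadamard(N) / np.sqrt(N)        # orthonormal basis of +/-1, square-wave-like functions

x = np.random.randn(N)              # any signal
coeffs = H @ x                      # analysis: project onto the Walsh-Hadamard basis
x_rec = H.T @ coeffs                # synthesis: exact reconstruction

print(np.allclose(H @ H.T, np.eye(N)))   # True: orthonormal, just like the Fourier basis
print(np.allclose(x, x_rec))             # True: nothing about the signal is lost
```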
It's not just curve fitting, because the basis functions have characteristics which make them desirable for the kind of decomposition one is trying to find. We typically assume in factor analysis that factors are Gaussian random variables without clear and repeating patterns. Fourier transforms force us to think in similar terms, but accounting for the specific dynamics the factors (i.e. basis functions) should capture.
Also, how to construct those orthogonal basis functions for a given downstream task is an interesting research question!
All quite good examples, but I would say that these are quite well known. It's also missing that there are mitigation strategies for some of them - e.g. in vibration analysis it's typical to look at Hann-windowed data to remove the effect of partial cycles, and it's common to overlap samples too. Similarly there are other tools like the cepstrum which help you identify periodic peaks in the spectral data.
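A quick sketch of the windowing mitigation (the numbers are made up): a tone that doesn't complete an integer number of cycles in the record leaks into many bins; multiplying by a Hann window before the FFT confines most of that energy to bins near the true frequency.

```python
import numpy as np

fs, N = 100.0, 200
t = np.arange(N) / fs
x = np.sin(2 * np.pi * 3.3 * t)       # 3.3 Hz: a non-integer number of cycles in the record

def spread(sig):
    """Count bins holding more than 1% of the peak magnitude."""
    mag = np.abs(np.fft.rfft(sig))
    return int((mag > 0.01 * mag.max()).sum())

print("rectangular window:", spread(x))                   # leakage into many bins
print("Hann window:       ", spread(x * np.hanning(N)))   # far fewer bins
```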
Fourier tutorials are a dime a dozen, so it would likely have been a better idea to link to his excellent wavelet tutorial at https://www.continuummechanics.org/wavelets.html . Good explanations of that concept are a lot harder to come by.
> Because an FFT (short for "Fast Fourier Transform") is nothing more than a curve-fit of sines and cosines to some given data
That is not even wrong. A Fourier transform is a basis expansion. In particular, the full expansion is exact (not just an approximation). Of course, truncated expansions are approximations.
The actually interesting part: why is this basis expansion so much more useful than expanding into, e.g., some other set of eigenfunctions, Hermite polynomials, etc.? The decomposition into (complex) exponentials converts between addition and multiplication, i.e. you get sin(x+y) and cos(x+y) from products of sin(x), cos(x), sin(y) and cos(y). This in turn has important implications, such as turning derivatives into multipliers (sketched below). More generally, you can consider nonlinear Fourier transforms with different groups and generators other than exponentials.
TLDR: It is a transform. What you are transforming between is what makes it so useful.
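A small sketch of the "derivatives become multipliers" point (the periodic grid and test function are my own choices): differentiate sin(3x) by multiplying its DFT by ik, and you recover 3cos(3x) to machine precision.

```python
import numpy as np

N = 256
x = np.linspace(0, 2 * np.pi, N, endpoint=False)
f = np.sin(3 * x)                                    # periodic test function

k = 2 * np.pi * np.fft.fftfreq(N, d=x[1] - x[0])     # angular wavenumbers
df = np.fft.ifft(1j * k * np.fft.fft(f)).real        # differentiation = multiplication by ik

print(np.allclose(df, 3 * np.cos(3 * x)))            # True: matches the analytic derivative
```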
seems to be missing some stuff. first, the notion that most real-valued functions can be decomposed into an infinite sum of orthogonal basis functions, of which the fourier basis is one choice. this is the key intuition that builds up the notion of linear decomposition, and from which the practical realities of computing finite dfts on sampled data arise. second, the talk of transients absent the use of stfts and spectrograms seems really weird to me. if you want to look at transients in nonstationary data, the stft and spectrogram visualization are critical. computing one big dft and looking at energy at dc to detect drift seems weird to me.
maybe this is the way mechanical engineers look at it, but leaving out stfts and spectrograms seems super weird to me.
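for what it's worth, a rough sketch of the stft point using scipy.signal.spectrogram (signal and parameters are my own): a 50 hz tone that stops halfway through the record shows up as energy confined to the first half of the spectrogram's time axis, which a single full-record dft cannot tell you.

```python
import numpy as np
from scipy.signal import spectrogram

fs = 1000
t = np.arange(0, 4, 1 / fs)
x = np.where(t < 2, np.sin(2 * np.pi * 50 * t), 0.0)   # 50 Hz tone, first 2 s only

f, seg_t, Sxx = spectrogram(x, fs=fs, nperseg=256)
row = np.argmin(np.abs(f - 50))                        # spectrogram row closest to 50 Hz
active = seg_t[Sxx[row] > 0.5 * Sxx[row].max()]
print(active.min(), active.max())                      # roughly 0 ... 2 s: the tone is localized in time
```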
"Few people appreciate statistics. But at least they seem OK with this and don't go off starting religious wars over the subject."
Frequentist vs Bayesian get debated constantly. I liked this video about the difference:
https://youtu.be/9TDjifpGj-k?si=BpjlTCWIFMu506VL
> This is done by choosing A and B such that the following integral is minimized
Which is an absolutely subjective choice in and of itself, and immediately breaks the notion that curve-fitting done that way is going to tell you some absolute truth about the function.
For example, you might want, at each point of the non-linear curve being fitted, to drop a line perpendicular to its tangent, compute the distance along it to the linear fit, and sum those distances over all points of the non-linear curve.
That is about as intuitively correct (if not more so) as the "fit" proposed, yet it yields a very different result (a quick numerical sketch follows below).
Statistics are by definition subjective unless you rely on a specifically demonstrated property of the particular way you decide to project your data onto the simple-minded underlying statistical model.
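A sketch of the two fits for contrast (synthetic data, my own numbers): ordinary least squares minimizes vertical residuals (the integral the article uses), while a perpendicular-distance fit (total least squares, taken here as the first principal direction of the centered data) gives a noticeably different slope on the very same points.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 2.0 * x + rng.normal(scale=1.5, size=200)   # noisy linear relation

# Ordinary least squares: minimizes vertical distances to the line
slope_ols = np.polyfit(x, y, 1)[0]

# Perpendicular-distance fit: direction of the first principal component
data = np.column_stack([x - x.mean(), y - y.mean()])
_, _, vt = np.linalg.svd(data, full_matrices=False)
slope_tls = vt[0, 1] / vt[0, 0]

print(slope_ols, slope_tls)   # two "reasonable" fits, two different answers
```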