What does "frequency" mean in an image?
I don't understand how frequencies are defined in images/photographs. As far as I understand it now, high frequencies correspond to sharp things in images, like edges, and low frequencies are kind of the opposite?
I would also like to understand the output of a Discrete Fourier Transform (DFT), i.e. how to read it properly.
It would be cool if somebody could explain to me the following:
What are frequencies in pictures and how are they defined?
How do you read the output of a Discrete Fourier Transform?
I will only answer the first question: What are frequencies in images?
The Fourier Transform is a mathematical technique that represents the same image information per frequency rather than per pixel. Think about it this way. The sea has waves: some are very slow (like tides), others are medium in size, and others still are tiny, like the ripples raised by a gust of wind. You can think of them as three separate waves, but at each point on the surface of the sea and at each moment in time, you get just one height of water.
The same applies to images. You can think of an image as being made up of various waves, or frequencies. To create your image, start with the average colour (thinking of grayscale images is actually easier). Then add waves of different wavelengths and strengths to slowly build up the details in the picture.
First Frequency (Average):
The second frequency along the vertical dimension is a wave starting at zero at the bottom of the image, rising, becoming zero again along the centred horizon and falling below zero to finally become zero at the top of the image. (I described a Fourier Series without phase shift, but the analogy still holds.)
Here you can see the second frequency along the horizontal and vertical. Notice that you can make out where the mountain will be (dark) and where the sky and lake will be (lighter).
Each additional wave or frequency brings along more ripples and, as such, more detail. To get different images, you change the height (amplitude) of each wave as well as its starting point, called the phase.
Interestingly, the amount of information is the same in both representations, and one can go back and forth between normal images (spatial domain) and Fourier-transformed images (frequency domain). In the frequency domain we need to keep, for every frequency, both its amplitude and its phase.
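To make the "sum of waves" idea concrete, here is a minimal NumPy sketch. The image size, amplitudes and phases are made-up values chosen for illustration, and `wave` is a hypothetical helper, not part of any library. It builds a tiny grayscale image from an average plus two waves, then checks that the Fourier Transform finds exactly those ingredients.

```python
import numpy as np

def wave(height, width, cycles_y, cycles_x, amplitude, phase):
    """One 2D cosine wave with a whole number of cycles along each axis."""
    y = np.arange(height).reshape(-1, 1) / height
    x = np.arange(width).reshape(1, -1) / width
    return amplitude * np.cos(2 * np.pi * (cycles_y * y + cycles_x * x) + phase)

height, width = 64, 64
average = 128.0  # the "first frequency": a flat gray level

image = np.full((height, width), average)
# One cycle along the vertical axis. A phase of -pi/2 turns the cosine into a
# sine, so the wave starts at zero, rises, crosses zero in the middle and falls.
image += wave(height, width, 1, 0, amplitude=40.0, phase=-np.pi / 2)
# Three cycles along the horizontal axis add finer ripples.
image += wave(height, width, 0, 3, amplitude=15.0, phase=0.0)

# Dividing by the pixel count makes the coefficients directly comparable
# to the gray levels used above.
spectrum = np.fft.fft2(image) / image.size

print(abs(spectrum[0, 0]))       # the average gray level
print(abs(spectrum[1, 0]))       # half the amplitude of the vertical wave
print(np.angle(spectrum[1, 0]))  # its phase
```

Each real wave shows up as a pair of coefficients (one at the positive and one at the negative frequency), which is why each coefficient carries half of the amplitude.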
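The round trip can be checked numerically. The following is a small sketch that assumes a random 8-bit array as a stand-in for a real photograph: it transforms the array, keeps only amplitude and phase, and rebuilds the pixels from them.

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for a grayscale photo: random 8-bit pixel values (an assumption).
image = rng.integers(0, 256, size=(48, 64), dtype=np.uint8)

spectrum = np.fft.fft2(image)
amplitude = np.abs(spectrum)   # strength of each wave
phase = np.angle(spectrum)     # starting point of each wave

# Rebuild the pixels from amplitude and phase alone.
rebuilt = np.fft.ifft2(amplitude * np.exp(1j * phase)).real

max_error = np.max(np.abs(rebuilt - image))
restored = np.clip(np.rint(rebuilt), 0, 255).astype(np.uint8)

print(max_error)                        # tiny floating-point round-off
print(np.array_equal(restored, image))  # True once rounded back to 8 bits
```

The only difference is floating-point round-off, which disappears when the result is rounded back to 8-bit values.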
Here it is using 50% of the frequencies:
There are variants of all this, with distinctions to be made among the Fourier Series, the Fourier Transform, the Discrete Fourier Transform and the Discrete Cosine Transform (DCT).
One interesting application is image compression, as in JPEG. Here the DCT is used to keep more of the important parts of the image (the low frequencies) and less of the high frequencies.
I wrote this in the hope that novice readers can get a basic understanding of the idea behind Fourier Transforms. To that end I made some simplifications that I hope the more advanced readers will forgive.
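As a rough illustration of that idea, the sketch below applies a DCT to a single 8x8 block and throws away the high-frequency coefficients. It is not the JPEG algorithm itself (JPEG quantizes coefficients rather than simply cutting them off), and `dct_matrix`, `compress_block` and the smooth test block are assumptions made for the example.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis: row k is a cosine with k half-cycles."""
    k = np.arange(n).reshape(-1, 1)
    i = np.arange(n).reshape(1, -1)
    basis = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    basis[0, :] = np.sqrt(1.0 / n)  # the constant (average) row
    return basis

def compress_block(block, keep):
    """Keep only the keep x keep lowest-frequency coefficients of a square block."""
    d = dct_matrix(block.shape[0])
    coeffs = d @ block @ d.T            # forward 2D DCT
    mask = np.zeros_like(coeffs)
    mask[:keep, :keep] = 1.0            # low frequencies sit in the top-left corner
    return d.T @ (coeffs * mask) @ d    # inverse 2D DCT

def rms_error(a, b):
    return float(np.sqrt(np.mean((a - b) ** 2)))

# A smooth 8x8 block: a gentle brightness ramp, as in a patch of sky.
y, x = np.mgrid[0:8, 0:8]
block = 100.0 + 10.0 * x + 5.0 * y

full = compress_block(block, keep=8)     # all 64 coefficients
low = compress_block(block, keep=4)      # 16 of 64 coefficients
dc_only = compress_block(block, keep=1)  # only the average

print(rms_error(block, full), rms_error(block, low), rms_error(block, dc_only))
```

For a smooth block like this one, a quarter of the coefficients already reproduces the block far better than the average alone, which is the property such compression schemes exploit.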
Video generated by Thomas Devoogdt can be viewed at Vimeo.
Frequencies in Post-Processing
Numerous post-processing methods rely on frequencies, mostly because we never look at single pixels individually. Many algorithms work on frequencies because it is more natural to think about them that way. And because the Fourier Transform contains the same information as the image, we can express any mathematical operation (or post-processing step) in either the frequency domain or the spatial domain. Sometimes the pixel-wise description is better, but often the frequency description is better. (Better primarily means faster in this context.)
One technique I would like to point out, for no particular reason other than that it has artists working directly with frequencies, is *frequency separation*. I am not going to describe it in detail, but you can see how it works on YouTube for both Photoshop and GIMP.
You create two layers, one with the low frequencies and one with the high frequencies. For portraits, you can then smooth the skin on the high-frequency layer without affecting the skin tones in the low-frequency layer.
This is the code used to generate the examples above. It can be run as a simple Python program.
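A standard example of one operation expressed in both domains is blurring: a convolution in the spatial domain corresponds to a per-frequency multiplication in the frequency domain. The sketch below assumes wrap-around image edges (the natural setting for the DFT), and `blur_spatial` and `blur_frequency` are hypothetical helper names.

```python
import numpy as np

def blur_spatial(image, kernel):
    """Circular convolution as a weighted sum of shifted copies of the image."""
    out = np.zeros(image.shape, dtype=float)
    for dy in range(kernel.shape[0]):
        for dx in range(kernel.shape[1]):
            out += kernel[dy, dx] * np.roll(image, shift=(dy, dx), axis=(0, 1))
    return out

def blur_frequency(image, kernel):
    """The same operation as one multiplication per frequency."""
    padded = np.zeros(image.shape, dtype=float)
    padded[:kernel.shape[0], :kernel.shape[1]] = kernel
    return np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(padded)).real

rng = np.random.default_rng(1)
image = rng.random((32, 32))
kernel = np.full((5, 5), 1.0 / 25.0)  # simple 5x5 box blur

print(np.allclose(blur_spatial(image, kernel), blur_frequency(image, kernel)))
```

For large kernels the frequency version needs far fewer arithmetic operations, which is the sense in which it can be "better".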
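The two-layer split can be sketched in a few lines of NumPy. This assumes a Gaussian blur as the low-pass step and wrap-around edges; `split_frequencies`, `sigma` and the random test image are illustrative choices, not what Photoshop or GIMP do internally.

```python
import numpy as np

def split_frequencies(image, sigma):
    """Split an image into a blurred low layer and a detail-only high layer."""
    fy = np.fft.fftfreq(image.shape[0]).reshape(-1, 1)
    fx = np.fft.fftfreq(image.shape[1]).reshape(1, -1)
    # Frequency response of a Gaussian blur with standard deviation sigma (pixels).
    lowpass = np.exp(-2.0 * (np.pi * sigma) ** 2 * (fx ** 2 + fy ** 2))
    low = np.fft.ifft2(np.fft.fft2(image) * lowpass).real
    high = image - low  # whatever the blur removed
    return low, high

rng = np.random.default_rng(2)
image = rng.random((64, 64)) * 255.0  # stand-in for a grayscale portrait

low, high = split_frequencies(image, sigma=4.0)

# Editing one layer and recombining: here the fine detail is halved.
smoothed = low + 0.5 * high

print(np.allclose(low + high, image))   # the two layers add back up to the image
print(abs(smoothed.mean() - image.mean()) < 1e-9)  # overall tone is unchanged
```

Because the high layer averages to zero, changing it alters texture but leaves the overall brightness carried by the low layer untouched.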
```python
import os

import numpy as np
from numpy.fft import rfft2, irfft2
from PIL import Image


def save_dims(ft, low, high, name):
    ft2 = np.zeros_like(ft)
    # Copy the frequencies from low to high; all others stay zero.
    # Note: only rows low..high-1 are copied. rfft2 stores the negative
    # vertical frequencies at the end of axis 0, and those are left out.
    ft2[low:high, low:high] = ft[low:high, low:high]
    save(ft2, name)


def save(ft, name):
    # Inverse transform back to pixels, then store as 8-bit grayscale.
    rft = irfft2(ft)
    img = Image.fromarray(rft)
    img = img.convert('L')
    img.save(name)


def main():
    # Convert input into grayscale and save.
    img = Image.open("input.jpg")
    img = img.convert('L')
    img.save('input_gray.png')

    # Do Fourier Transform on image.
    ft = rfft2(np.asarray(img))

    # Take only zeroth frequency and do Inverse FT and save.
    save_dims(ft, 0, 1, 'output_0.png')
    # Take first two frequencies in both directions.
    save_dims(ft, 0, 2, 'output_1.png')
    save_dims(ft, 0, 3, 'output_2.png')
    # Take first 50% of frequencies (slice indices must be integers).
    x = min(ft.shape)
    save_dims(ft, 0, x // 2, 'output_50p.png')


def generateGif():
    ''' Generates images to be later converted to a gif. This requires ImageMagick:
    convert -delay 100 -loop 0 output_*.png animation.gif
    '''
    # Requires images2gif from code.google.com/p/visvis/source/browse/vvmovie/images2gif.py
    # from images2gif import writeGif

    img = Image.open('input.jpg')
    img = img.convert('L')
    # Resize image before any calculation.
    size = (640, 480)
    # Image.ANTIALIAS was removed in Pillow 10; LANCZOS is the same filter.
    img.thumbnail(size, Image.LANCZOS)
    ft = rfft2(np.asarray(img))

    os.makedirs('animation', exist_ok=True)  # output folder must exist
    images = []  # only needed for the commented-out writeGif call below
    for x in range(0, max(ft.shape)):
        ft2 = np.zeros_like(ft)
        ft2[0:x, 0:x] = ft[0:x, 0:x]
        rft = irfft2(ft2)
        img_out = Image.fromarray(rft).convert('L')
        fname = 'animation/output_%05d.jpg' % (x, )
        img_out.save(fname, quality=60, optimize=True)

    # writeGif('animation.gif', images, duration=0.2)


if __name__ == '__main__':
    main()
    # generateGif()
```
It should be clarified that, while theoretically we could, assuming we had infinite knowledge of the image at hand, decompose it into component frequencies and recompose it with no loss... in the real world we can't. Convolution of a real-world image, which occurs at each and every "interface" along the optical pipeline, is effectively an irreversible process. We can't ever know all convolution factors, and therefore reconstruction of an FFT back into an image is difficult, and extreme modifications usually result in artifacts and data loss.
@jrista I think the point Unapiedra was making about reversibility was that *once you're working with a digital image* (an array of pixels on a computer), you can go to frequency space and back, and get the same image you started with. You're looking at a bigger picture of the physical imaging system (lenses and such), where real-world limitations intrude.
@coneslayer: I am thinking about the purpose of analyzing and working on images in a spatial frequency space at all. One doesn't just convert to spatial frequencies and back... you convert an image to spatial frequencies in order to perform some kind of processing on it that may be either inherently easier on frequencies, or more accurate/capable on frequencies. Due to the nature of convolution, however, one cannot freely make changes to spatial frequencies and convert back without some kind of loss. I guess I feel a complete explanation of images as wavelets involves "WHY?"
jrista's comment is misleading in that the FT is blamed for information loss. Of course, photography is a lossy process and so is post-processing. If I convert a discrete image to Fourier space, do some lossy processing there, and then convert back, of course I lose information. But that happens in the processing step and not in the conversion step. True, because of machine precision every mathematical operation loses information, but if we are talking about images with 8 bits per channel, we won't notice machine precision errors.
@Turkeyphant, I don't remember why I mentioned the diagonal in that context. You can see that the principal direction of the second frequency seems to be that particular diagonal. Maybe that's why. To answer your question, you only ever need two axes to represent a 2D image. It's important that the two axes are orthogonal. The horizontal and vertical axes fulfill that criterion. (They are also practical.) With discrete images (i.e. composed of pixels), aliasing will make all other angles worse.
@Unapiedra Thanks. That makes a lot more sense with my limited memory of frequency space. One bit that's still a bit confusing is your description of the first frequency being a single sinusoidal wave cycle from 0, -1, 0, +1, 0 while the image seems completely uniform from top to bottom.
@Turkeyphant correction, Unapiedra described the _second frequency_, not the first, as 0, -1, 0, +1, 0. The paragraphs describing the 2nd frequency are immediately after the 1st frequency image (the uniform gray image), and I can see how it might be tempting to read that paragraph as a description of the preceding image (articles often show an image, then describe it in text following the image), but not in this case. =)