Principal components analysis (PCA) is one of the oldest and most important transformations of multivariate data analysis. The central idea is to generate linear combinations of the input data variables that are uncorrelated and have maximum variance. This reduces the dimensionality of the data while enhancing the features of interest.
In remote sensing this technique can be advantageously used to reduce the number of bands that are necessary for a certain analysis (i.e. classification) and so reduce computing costs while keeping as much as possible of the variability present in the data. Most GIS and remote sensing software packages in use today have implemented this function in some or another way. In practice, it is enough for an analyst to just press a virtual button to calculate the principal components of an image. This is comfortable but boring. It robs us of the fun of understanding the basic principles and see how this transformation works behind the scenes. Let’s have a look!
We here follow the explanation given by Canty (2007), although the method is well explained in many other textbooks (Schowengerdt, 2006, has a nice explanation too). The n bands of our image are the n dimensions of data. We project these bands into n new orthogonal bands, such that each of them is uncorrelated and has maximum variance. We then recast the problem as an eigenvalue problem and find the eigenvalues and eigenvectors. We can then create new bands by applying the linear transformations to our data.
All procedures are done combining different open source tools: written in Perl, using the Perl Data Language, Generic Mapping Tools, ImageMagick, GDAL and R. The code used for computing PCA can be found in the fighsare repository. Note that this is a very eclectic approach using tools I know. I make no claim to write neat code nor pretend my code to be best practice. I am sure others can do better!
We will use as an example image a subset of a Landsat 7 ETM+ scene path/row 193/018, acquired 2002-08-04, and depicting the city of Uppsala, Sweden, and its surroundings:
The six non-thermal bands in this image look like this:
We see that there is a significant correlation between these bands, particularly those that are spectrally close:
Using PCA we will find a new set of bands where this correlation is eliminated. We begin by finding the covariance matrix for each band pair (i,j), considering each band as a random vector with n elements:
Note that we use normalized bands, where the band’s mean value has been subtracted from each value. In our case we obtain:
[ 78.506782 88.10313 67.883564 11.007673 122.02949 114.89956]
[ 88.10313 113.44888 84.660135 48.694444 182.94694 149.97175]
[ 67.883564 84.660135 73.656183 4.140623 139.43477 121.92883]
[ 11.007673 48.694444 4.140623 318.58023 222.51396 71.33638]
[ 122.02949 182.94694 139.43477 222.51396 514.12654 331.0965]
[ 114.89956 149.97175 121.92883 71.33638 331.0965 263.77025]
We then recast the problem as an eigenvalue problem of the form:
Which leads us to the principal components, Y for a new, transformed image:
Where is the image vectors, and the fraction of the variance explained by each component is expressed by the relative size of each eigenvalue. Solving this in our case, we get:
[ 2.6671763 60.253522 5.0769476 293.87224 989.43693 10.782041]
[ 0.35415481 0.52674478 -0.69935703 -0.24793912 0.2031875 0.072492029]
[ -0.71512896 0.52983818 0.16791355 -0.20184223 0.28867851 -0.23577858]
[ 0.59003962 0.2254447 0.58621081 -0.26529723 0.21495833 -0.37522676]
[ 0.11698679 0.34437938 0.12185166 0.85551845 0.31144854 0.15478078]
[ -0.03577943 -0.50894229 -0.26075236 0.063341061 0.70744293 -0.40892322]
[-0.0072190021 -0.11561447 0.23711192 -0.30245558 0.48134893 0.77921945]
and after sorting:
Order of the eigenvalues:
[0 2 5 1 3 4]
[ 2.6671763 5.0769476 10.782041 60.253522 293.87224 989.43693]
Once we have figured out the relative order of the n eigenvalues, we calculate the ith component by adding up the product of each of the n bands with each of the n rows of the corresponding eigenvector:
And our result looks like this:
And we can see how these principal components correlate with each other:
As a rule the first principal components contain the largest part of the variability. The last principal component, for example, is mostly (but not only) noise. If we combine the first three principal components in one image, we get:
This technique is useful for reducing the number of bands needed to some processes, as it keeps the variability mostly untouched, which is what we actually need. This is handy when using multispectral data, and crucial when using hyperspectral data.
Canty, M.J. (2007). Image analysis, classification and change detection in remote sensing: with algorithms for ENVI/IDL. CRC Press.
Schowengerdt, R.A. (2006). Remote sensing: models and methods for image processing. Academic press.
The scatterplots were done with an R snippet embedded in a Perl script:
$list is a Perl variable containing the image list.