San José State University

applet-magic.com
Thayer Watkins
Silicon Valley
& Tornado Alley
USA

The Geometric Transformation of Images

This material concerns what might be called the spatial transformation of images. The transformation of the colors of an image is a different matter.

An image is given by the color at each picture element (pixel) over some range of horizontal and vertical coordinates. In other words, an image is an array C(x,y) in which C is the color value and x and y are the integer horizontal and vertical coordinates, respectively. A geometric transformation of an image is one in which the color value at pixel (x,y) is transferred to new coordinates (xx,yy). There is a rule or formula for determining (xx,yy) from (x,y); i.e., (xx,yy) = T(x,y). The transformation T( , ) has to be such that only one point (x,y) is mapped into any point (xx,yy). Thus the transformation T has an inverse, denoted T⁻¹, such that if (xx,yy) = T(x,y) then (x,y) = T⁻¹(xx,yy). For example, one simple transformation is to stretch an image so that its width and height are double the original values. In this case,

T(x,y) = (2x, 2y)
and hence
T⁻¹(xx,yy) = (xx/2, yy/2)

Consider now how the transformation of an image, such as doubling its width and height, should be carried out. One possible method (which turns out to be wrong) is to take each pixel (x,y) of the original image, compute its transformed coordinates (xx,yy) = T(x,y), and copy its color to the transformed pixel (xx,yy). With this procedure, however, not all of the pixels of the transformed image would get color values. There generally will be gaps in the array, as in the case of doubling the scale of an image, which must be filled in by some procedure; one possibility would be to use the average of the color values of the adjacent pixels. For a simple transformation such as scale doubling this could be made to work, but for a more complicated transformation the coding problems would become horrendous. Fortunately there is an easier method.
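
To see the gap problem concretely, here is a minimal Python sketch (not code from the original applet) of the forward-mapping approach for scale doubling; the function name, the use of NumPy, and the sentinel value marking unwritten pixels are assumptions made for illustration.

    import numpy as np

    def forward_map_double(src):
        # Copy each source pixel's value to (2x, 2y) in a doubled-size array.
        # Pixels that are never written keep the sentinel value -1 -- the "gaps".
        h, w = src.shape
        dst = np.full((2 * h, 2 * w), -1, dtype=src.dtype)
        for y in range(h):
            for x in range(w):
                dst[2 * y, 2 * x] = src[y, x]   # (xx, yy) = T(x, y) = (2x, 2y)
        return dst

    src = np.arange(4, dtype=np.int32).reshape(2, 2)
    print(forward_map_double(src))
    # Three quarters of the output pixels remain -1 and would have to be
    # filled in by some procedure such as averaging adjacent pixels.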

The appropriate procedure is to take the coordinates of the transformed image and work backwards. That is to say, for each (xx,yy) in the transformed image, find (x,y) = T⁻¹(xx,yy) and base the color at (xx,yy) on the (x,y) so found.

The values of T⁻¹(xx,yy) are not necessarily integers, so some rule must be used for computing the color at (xx,yy) from the colors of the pixels near (x,y) = T⁻¹(xx,yy). While sophisticated rules can be constructed, in practice it is perfectly adequate to truncate or round the inverse values to integer coordinates. For the scale-doubling transformation the color of each pixel in the original image gets spread over four pixels in the transformed image. In effect, each 1×1 pixel of the original image is converted into a 2×2 block of pixels in the transformed image.
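
A corresponding sketch of the inverse-mapping procedure for scale doubling, again assuming a NumPy grayscale array and using simple truncation (integer division) of the inverse coordinates:

    import numpy as np

    def inverse_map_double(src):
        # For every destination pixel (xx, yy), look up the source pixel at
        # (x, y) = T^-1(xx, yy) = (xx/2, yy/2), truncated to integers.
        h, w = src.shape
        dst = np.empty((2 * h, 2 * w), dtype=src.dtype)
        for yy in range(2 * h):
            for xx in range(2 * w):
                dst[yy, xx] = src[yy // 2, xx // 2]
        return dst

    src = np.arange(4, dtype=np.int32).reshape(2, 2)
    print(inverse_map_double(src))
    # Every destination pixel receives a color; each original pixel
    # appears as a 2x2 block in the doubled image.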

For a more sophisticated example than simple scale doubling consider the case of skewing (more properly shearing) the image horizontally by an angle θ, as shown below. The transformation and its inverse are:

(xx,yy) = T(x,y) = (x + y sin(θ), y)
and
(x,y) = T⁻¹(xx,yy) = (xx - yy sin(θ), yy)
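
The same inverse-mapping recipe applied to the horizontal shear might look like the following sketch; the function name, the widening of the output array, and the background value for pixels that map outside the original image are assumptions for illustration, not part of the original applet.

    import math
    import numpy as np

    def shear_horizontal(src, theta, background=0):
        # Inverse mapping: for each destination pixel (xx, yy), sample the
        # source at (x, y) = T^-1(xx, yy) = (xx - yy*sin(theta), yy).
        h, w = src.shape
        s = math.sin(theta)
        new_w = w + int(abs(s) * h) + 1   # widen so the sheared image fits
        dst = np.full((h, new_w), background, dtype=src.dtype)
        for yy in range(h):
            for xx in range(new_w):
                x = int(round(xx - yy * s))
                if 0 <= x < w:
                    dst[yy, xx] = src[yy, x]
        return dst

    src = np.arange(12, dtype=np.int32).reshape(3, 4)
    print(shear_horizontal(src, math.pi / 4))   # shear by 45 degrees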

Below is an example showing the original image along with the transformed image sheared 45 degrees to the left, which gives the effect of a reflection in water.

