Wednesday, July 30, 2008

A11 - Camera Calibration

In this activity we model the geometric aspects of image formation to recover information that is lost when brightness values from surfaces in 3D world space are projected onto the 2D sensor plane.

To do this, we first take a picture of a checkerboard pattern known as a Tsai grid.



Using SciLab's locate function, we selected 25 points on this image to get their pixel coordinates. We also took note of the real-world coordinates of these points, letting the left side of the board be the x-axis, the right side the y-axis, and the vertical the z-axis. Each square has a side length of one inch. The origin is shown in pink below.



The green dots are the 25 points we've selected for our calibration. Next, we set up the matrix Q shown by the equation below for values of i from 1 to 25 (the subscript i stands for image coordinates, o for real-world object coordinates).



We then solve for a using the equation below.




The resulting values of the matrix a can then be used in the following equations to compute the 2D image coordinates of a point from its real-world coordinates (a_34 is set to 1).
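Assuming the setup above, the whole least-squares step can be sketched in Python with NumPy (my own translation of the SciLab workflow; the function names are illustrative):

```python
import numpy as np

def calibrate(world, image):
    """Build the matrix Q from the 3D-2D point pairs and solve
    Q a = d for the eleven camera parameters (a_34 fixed to 1)."""
    Q, d = [], []
    for (xo, yo, zo), (yi, zi) in zip(world, image):
        Q.append([xo, yo, zo, 1, 0, 0, 0, 0, -yi * xo, -yi * yo, -yi * zo])
        Q.append([0, 0, 0, 0, xo, yo, zo, 1, -zi * xo, -zi * yo, -zi * zo])
        d.extend([yi, zi])
    a, *_ = np.linalg.lstsq(np.array(Q, float), np.array(d, float), rcond=None)
    return a

def project(a, point):
    """Predict the 2D image coordinates of a 3D world point from a."""
    xo, yo, zo = point
    w = a[8] * xo + a[9] * yo + a[10] * zo + 1.0
    yi = (a[0] * xo + a[1] * yo + a[2] * zo + a[3]) / w
    zi = (a[4] * xo + a[5] * yo + a[6] * zo + a[7]) / w
    return yi, zi
```

Feeding the 25 measured point pairs to `calibrate` gives the eleven entries of a, and `project` then yields the predicted image coordinates for any world point.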



Implementing this method:

Real world coordinates of the green dots:
1. (8,0,12)
2. (6,0,10)
3. (2,0,10)
4. (4,0,9)
5. (6,0,8)
6. (6,0,3)
7. (4,0,2)
8. (2,0,3)
9. (6,0,5)
10. (4,0,3)
11. (0,8,12)
12. (0,5,10)
13. (0,2,10)
14. (0,5,8)
15. (0,3,7)
16. (0,5,4)
17. (0,7,2)
18. (0,2,1)
19. (0,3,3)
20. (0,5,1)
21. (0,0,1)
22. (0,0,3)
23. (0,0,5)
24. (0,0,6)
25. (0,0,11)

Corresponding Image Coordinates (Pixel Value)

point y_image z_image
1 23.214286 235.11905
2 53.571429 200
3 108.33333 205.35714
4 82.738095 185.11905
5 54.761905 163.09524
6 57.738095 73.214286
7 85.714286 63.095238
8 110.11905 86.904762
9 55.952381 108.92857
10 85.119048 80.357143
11 246.42857 236.90476
12 201.19048 202.38095
13 159.52381 205.95238
14 201.19048 166.66667
15 172.61905 152.97619
16 199.40476 80.357143
17 227.97619 55.952381
18 159.52381 55.357143
19 172.02381 85.119048
20 198.80952 45.238095
21 133.92857 60.714286
22 134.52381 92.261905
23 133.92857 125
24 133.33333 141.07143
25 132.7381 224.40476

Resulting matrix a
-13.2561
9.736856
-0.88994
134.7117
-4.07642
-4.2728
15.07773
45.71427
-0.01371
-0.01504
-0.00559

New image coordinates after using the matrix a
point y_new z_new
1 21.842763 235.67576
2 53.69017 199.59722
3 108.32033 205.44708
4 82.330771 184.49906
5 55.041758 162.50365
6 58.274085 73.794279
7 85.553674 63.772888
8 110.40641 86.620738
9 57.005504 108.60976
10 85.109881 80.396867
11 248.48566 236.84126
12 200.81792 201.5441
13 158.94669 205.61779
14 200.29104 164.72204
15 172.19655 151.17308
16 199.27642 93.813046
17 227.597 52.018523
18 158.96577 54.178965
19 171.88968 83.28296
20 198.54783 42.893636
21 134.57351 61.133497
22 134.29241 92.497568
23 134.00484 124.58259
24 133.85857 140.90327
25 133.10108 225.42079

Taking the difference between the computed and actual values gives the following mean deviations for each coordinate:
y: 0.474705 pixels
z: 1.365954 pixels

-o0o-
Collaborator: Cole Fabros

-o0o-
Grade: 10/10 since I implemented the camera calibration well, and the mean difference for each axis is on the order of a pixel. :)

Tuesday, July 22, 2008

A10 – Preprocessing Handwritten Text

In this activity we tried to extract handwritten text from a scanned document with ruled lines. The image I used is shown below.



To remove the lines, I first obtained the FFT of this image.



The ruled lines form a periodic pattern, so to remove them we can suppress the frequencies in the Fourier transform that correspond to them.
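The suppression step could be sketched in NumPy like this (a rough analogue of what was done in Fourier space, not the exact mask I used; the strip widths are illustrative choices):

```python
import numpy as np

def suppress_line_frequencies(img, half_width=2, keep_dc=3):
    """Zero the vertical strip of the centered spectrum that carries the
    horizontal ruled-line peaks, keeping a small region around DC so the
    overall brightness survives. half_width and keep_dc are illustrative."""
    F = np.fft.fftshift(np.fft.fft2(img))
    r, c = F.shape
    cy, cx = r // 2, c // 2
    mask = np.ones((r, c))
    mask[:, cx - half_width:cx + half_width + 1] = 0.0   # line peaks live here
    mask[cy - keep_dc:cy + keep_dc + 1, cx - keep_dc:cx + keep_dc + 1] = 1.0  # keep DC
    return np.real(np.fft.ifft2(np.fft.ifftshift(F * mask)))
```

A pattern of horizontal lines varies only along y, so its energy sits in the central column of the shifted spectrum; zeroing that column removes the lines while leaving most of the handwriting's frequencies untouched.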



The enhanced image after working on the Fourier space is shown below.



We then binarized this image for further cleaning.



The image below is a result of performing opening operation on the binary image.



I repeated the same opening operation on the image, and this results in the following image.



Neither opening operation was successful: the text is not readable, and parts of the lines were even enhanced.

-o0o-
Thanks Ed for pointers and help in GIMP.

-o0o-
Rating: 5/10 since I was only able to remove the lines from the original image and that I wasn't successful in extracting the text from the image. The resulting image was also very dirty.

Monday, July 21, 2008

A9 - Binary Operations

The goal for this activity is to integrate everything we have learned so far to determine the best estimate of area (in pixel count) of simulated “cells”. We will also learn to use more morphological operators for enhancing and analyzing binarized images.



CLOSING and OPENING Morphological Operations
Opening is defined as an erosion followed by a dilation using the same structuring element for both operations. The basic effect of opening is somewhat like erosion in that it tends to remove some of the foreground (bright) pixels from the edges of regions of foreground pixels, but is less destructive. ***

Closing is defined as a dilation followed by an erosion using the same structuring element for both operations. Closing is similar in some ways to dilation in that it tends to enlarge the boundaries of foreground (bright) regions in an image (and shrink background color holes in such regions), but it is less destructive of the original boundary shape. ***
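As a small illustration of these two definitions (my own toy example, not part of the activity): opening removes an isolated speck, while closing fills a one-pixel hole.

```python
import numpy as np
from scipy import ndimage

square = np.zeros((9, 9), dtype=bool)
square[2:7, 2:7] = True        # a 5x5 solid square...
square[4, 4] = False           # ...with a one-pixel hole
square[0, 0] = True            # plus an isolated speck

se = np.ones((2, 2), dtype=bool)   # structuring element

# Opening (erosion then dilation) removes the speck but, being
# anti-extensive, cannot fill the hole.
opened = ndimage.binary_opening(square, structure=se)
# Closing (dilation then erosion) fills the one-pixel hole.
closed = ndimage.binary_closing(square, structure=se)
```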



Getting Cell Area
The above image was first sampled by cropping 256x256 subimages. This was done for two reasons: to save memory, since larger images require more memory to process, and to obtain an accurate value for the cell area by sampling different regions of the image.

Below is a 256x256 cropped image. We first binarized this image to simplify the separation of background from region of interest (ROI). The optimum threshold can be found by examining the histogram of this image.


The binarized image is shown below.



Performing OPENING on the image results in the following image. The white pixels that were part of the background were removed.



Performing the CLOSING operation results in the image below. Separation of touching blobs was observed.



We then used SciLab's bwlabel function, which returns a matrix of the same size as the original image but containing labels for the connected objects in it. From this matrix we can get the area of each cell, since each blob is a set of connected pixels.
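In Python the same labeling-and-area step could look like this (my own rough analogue of the bwlabel workflow, using scipy.ndimage):

```python
import numpy as np
from scipy import ndimage

def blob_areas(binary_img):
    """Label the connected components of a binary image and return the
    pixel area of each blob (number of pixels carrying each label)."""
    labels, n = ndimage.label(binary_img)
    return [int((labels == k).sum()) for k in range(1, n + 1)]
```

Histogramming the returned areas over several cropped samples gives the distribution used below.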

We repeated the above procedures for different sample images and obtained a histogram of the cell area.


From this histogram, we can deduce that the pixel area of a cell lies between 500 and 550 pixels. Restricting the x-axis to 450-550 pixels gives the following histogram. This was done to limit the calculated areas to those that best represent a single cell, since bins with smaller values can be attributed to small connected components that are unlikely to be whole cells.


We then computed the mean and standard deviation of the cell areas obtained. The values were
area = 521.84444 pixels
std dev = 20.685915

We then checked whether these values are correct by measuring the cell areas in an image with well-separated cells.



The values obtained are as follows
544.
527.
518.
535.
530.

Clearly, these values lie within one standard deviation of the calculated mean, so the mean value obtained is reasonable. :)

-o0o-
Thanks Jeric for the tips on getting the histogram of the areas obtained.

-o0o-
I give myself a 10 for I implemented the necessary techniques well.

Tuesday, July 15, 2008

A8 - Morphological Operations

Morphology refers to shape or structure. In image processing, classical morphological operations are treatments done on binary images, particularly aggregates of 1's that form a particular shape, to improve the image for further processing or to extract information from it. All morphological operations affect the shape of the image in some way, for example, the shapes may be expanded, thinned, internal holes could be closed, disconnected blobs can be joined.

In this activity, we learned two basic morphological operations done on images --- DILATION and EROSION.

From Dr. Soriano's lecture notes, dilation is defined as follows.



On the other hand, erosion operator is defined as



From these definitions, we predicted the outcome of an image after dilating and eroding it with certain structuring elements.

We used the following binary images as test objects:

square (50×50)
triangle (base = 50 , height = 30)
hollow square (60×60, edges are 4 pixels thick)
plus sign (8 pixels thick and 50 pixels long for each line)
circle (radius 25)


The structuring elements used are shown below.
1. 4×4 ones
2. 4×2 ones
3. 2×4 ones
4. cross, 5 pixels long, one pixel thick.



We then checked our predictions using SciLab's dilate and erode functions.
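A rough Python stand-in for this check (my own, using scipy.ndimage in place of SciLab's dilate and erode) on the 50×50 square with the first structuring element:

```python
import numpy as np
from scipy import ndimage

# A 50x50 solid test square on an 80x80 canvas.
square = np.zeros((80, 80), dtype=bool)
square[15:65, 15:65] = True

se = np.ones((4, 4), dtype=bool)                          # 4x4 ones
dilated = ndimage.binary_dilation(square, structure=se)   # grows to 53x53
eroded = ndimage.binary_erosion(square, structure=se)     # shrinks to 47x47
```

This matches the hand-prediction rule: dilating an isolated W×W square with a K×K block gives a (W+K-1)×(W+K-1) square, while eroding gives (W-K+1)×(W-K+1).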

DILATION
The dilation results for each structuring element are shown below.







EROSION
The erosion results for each structuring element are shown below.







THIN
According to SciLab:
thin - thinning by border deletion
Function thin performs thinning of binary objects. It uses the Zhang-Suen, a de facto standard and simple technique. The resulting image, the skeleton, is not always connected and is very sensible to noise. For thin shapes, it should work faster and provide better quality. You will need some pruning criterium to eliminate spurs.



SKEL
According to SciLab:
skel - skeletonization, thinning, Medial Axis Transform
Function skel performs skeletonization (thinning) of a binary object. The resulting medial axis is multi-scale, meaning that it can be progressively pruned to eliminate detail. This pruning is done by thresholding the output skeleton image.
The algorithm computes skeletons that are guaranteed to be connected over all scales of simplification. The skeletons are computed using the euclidean metric. This has the advantage to produce high-quality, isotropic and well-centered skeletons in the shape. However the exact algorithm is computationally intensive.




-o0o-
Notes:
The predictions were sketched on a piece of paper; I'll try to scan them for posting and comparison. For now, I can only say that some of my predictions of what the images would look like after dilation and erosion with the given structuring elements did not match the SciLab simulations. I would say about 70% of my predictions were correct. I had difficulty especially with predicting the results of erosion, and I wasn't so sure about my predictions for the cross structuring element.
With all these reasons and knowing that I'd only predict roughly 70% of the outcome of the images, I give myself a 7/10 for this activity.

-o0o-
Collaborators:
Thank you JC Nadora, Jeric Tugaff, Toni Lei Uy, Benj Palmares, Cole Fabros, Ed David, Angel Lim, Mark Bejemino for the invaluable discussions regarding erosion and dilation, especially with prediction. More thanks to Ed for pointers in drawing the test objects in GIMP. (I was doing it in Paint with so much difficulty so he showed me how to do it easily.)

Wednesday, July 9, 2008

A7 - Enhancement in the Frequency Domain

Anamorphic Property of the Fourier Transform

Varying the frequency of the sinusoid changes the location of the FFT peaks relative to (0,0): higher frequencies result in peaks farther from the origin.
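A quick numerical check of this anamorphic property (my own, in NumPy): the FFT peak of a 1D sinusoid sits at a bin index equal to its frequency, so doubling the frequency pushes the peak proportionally farther from the origin.

```python
import numpy as np

x = np.linspace(0, 1, 128, endpoint=False)
peaks = []
for f in (4, 16):
    spectrum = np.abs(np.fft.fft(np.sin(2 * np.pi * f * x)))
    peaks.append(int(spectrum[:64].argmax()))   # positive-frequency half
```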



Rotating the image, on the other hand, corresponds to a rotation of its FFT, but in the opposite direction.


Multiplying two sinusoids results in peaks that form the corners of a square. Adding them, however, results in a cross-like pattern, where the vertical dots correspond to the FFT peaks of the horizontal sinusoid and the horizontal dots to those of the vertical sinusoid. The effect in Fourier space is simply the sum of the individual FFTs.



Filtering in Fourier Space

a) Fingerprint Enhancement

Shown below is a fingerprint image. We want to enhance the ridges present on it by filtering in Fourier space.


We know that the ridges correspond to high frequencies since they form repetitive patterns in the image; we can also think of them as sinusoids oriented in different directions. We therefore take the FFT of the fingerprint image.



We then designed a filter mask using mkfftfilter function in SciLab. Since we want to enhance the higher frequencies corresponding to the ridges, we design a high pass filter. Shown below is a high-pass exponential filter.
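A NumPy sketch of such a high-pass exponential mask (my own analogue of mkfftfilter, not its actual implementation; the cutoff value is illustrative):

```python
import numpy as np

def highpass_exp(shape, cutoff=0.1):
    """An exponential high-pass mask: 0 at DC, approaching 1 at high
    spatial frequencies. cutoff (normalized frequency) is illustrative."""
    fy = np.fft.fftfreq(shape[0])[:, None]
    fx = np.fft.fftfreq(shape[1])[None, :]
    rho = np.sqrt(fx ** 2 + fy ** 2)            # radial frequency
    return 1.0 - np.exp(-(rho / cutoff) ** 2)

def filter_image(img, mask):
    """Multiply the image's FFT by the mask and invert the transform."""
    return np.real(np.fft.ifft2(np.fft.fft2(img) * mask))
```

Because the mask is exactly zero at DC, a flat image filters to (near) zero, while ridge-like high-frequency structure passes through almost unattenuated.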



We multiplied this filter with the FFT of the fingerprint image. Taking the FFT of this product gives the following enhanced image.



The FFT of the enhanced image is shown below. The lower frequencies were removed, thereby enhancing the higher frequencies.



We then tried using a high-pass binary filter shown below.



The resulting image is shown below.



b) Lunar Landing Scanned Picture: Line Removal

The two groups of irregularly shaped craters north and west of the landing site are secondaries from Sabine Crater. This view was obtained by the unmanned Lunar Orbiter V spacecraft in 1967 prior to the Apollo missions to the Moon. The black and white film was automatically developed onboard the spacecraft and subsequently digitized for transmission to Earth. The regularly spaced vertical lines are the result of combining individually digitized 'framelets' to make a composite photograph and the irregularly-shaped bright and dark spots are due to nonuniform film development. [NASA Lunar Orbiter photograph]


By filtering in the Fourier domain, we can remove the vertical lines present in this image. We first get its FFT.



Using the hints from http://www.roborealm.com/help/FFT.php, we designed a filter mask that suppresses the peaks lying horizontally along the center of the FFT image. These peaks correspond to the strong vertical lines in the original image. The filter mask we designed is shown below.
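A mask of this kind could be built in NumPy as follows (my own sketch, not the exact mask used; the strip dimensions are illustrative):

```python
import numpy as np

def line_notch_mask(shape, half_height=2, keep_dc=4):
    """Mask for a centered (fftshift-ed) spectrum: zero a thin horizontal
    strip through the middle, which carries the vertical-line peaks, but
    keep a small region around DC. Both widths are illustrative."""
    r, c = shape
    cy, cx = r // 2, c // 2
    mask = np.ones(shape)
    mask[cy - half_height:cy + half_height + 1, :] = 0.0
    mask[cy - half_height:cy + half_height + 1, cx - keep_dc:cx + keep_dc + 1] = 1.0
    return mask
```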



This mask is then multiplied to the FFT of the original image to get the following enhanced FFT.



From this enhanced FFT, we obtain the following image, in which the vertical lines are no longer present.



-o0o-
Collaborators:
Jeric Tugaff for the FFT discussions and hints on filtering in the Fourier space.
Cole Fabros, Benj Palmares and Ed David for the FFT discussions.

-o0o-
Rating:
I give myself an 8/10 because
1. I'm not satisfied with my fingerprint enhancement algorithm; SciLab's mkfftfilter seems unsuitable for this job.
2. I'm happy with the results of the lunar image, though we had a hard time figuring out how to do it properly. I'm thankful for Jeric's resourcefulness, for he found hints on the internet.