Tuesday, July 22, 2008

A10 – Preprocessing Handwritten Text

In this activity we tried to extract handwritten text from an image of a lined document. The image I used is shown below.



To remove the lines, I first obtained the FFT of this image.



The lines in the image repeat at regular intervals, so they show up as distinct peaks in the Fourier spectrum. To remove the lines, we therefore need to suppress the frequencies that correspond to them.
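As a rough illustration of this idea, here is a minimal Python sketch using NumPy. It is not the procedure I actually followed (the mask there was prepared by hand); it assumes the lines are perfectly horizontal, and the function name `suppress_horizontal_lines` and its parameters `keep_radius` and `half_width` are made up for this example.

```python
import numpy as np

def suppress_horizontal_lines(image, keep_radius=2, half_width=1):
    """Zero the vertical-frequency axis of the spectrum, except near DC.

    Assumes the unwanted lines are horizontal, so that their energy lies
    on the column through the center of the shifted spectrum.
    """
    rows, cols = image.shape
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    cy, cx = rows // 2, cols // 2  # position of the DC term after fftshift

    mask = np.ones((rows, cols))
    # Block a thin vertical strip through the center of the spectrum.
    mask[:, cx - half_width:cx + half_width + 1] = 0
    # Keep a small window around DC so the average brightness survives.
    mask[cy - keep_radius:cy + keep_radius + 1,
         cx - half_width:cx + half_width + 1] = 1

    filtered = np.fft.ifft2(np.fft.ifftshift(spectrum * mask))
    return np.real(filtered)

# Synthetic stand-in for the page: white, with a dark line every 8 rows.
page = np.ones((64, 64))
page[::8, :] = 0.0
cleaned = suppress_horizontal_lines(page)
```

On this synthetic page the lines vanish completely because they contain nothing but the blocked frequencies. On a real scan the strip also removes part of the handwriting, which is one reason the strokes come out broken where they cross a line.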



The enhanced image after filtering in Fourier space is shown below.



We then binarized this image for further cleaning.
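Binarization here simply means thresholding the grayscale values. A minimal sketch follows; the threshold I used is not recorded in this post, so the default below (the image mean) is only an assumption, and the helper name `binarize` is hypothetical.

```python
import numpy as np

def binarize(image, threshold=None):
    """Return True where a pixel is brighter than the threshold."""
    if threshold is None:
        # Assumed default: the mean gray level of the image.
        threshold = image.mean()
    return image > threshold

# Tiny example: bright paper (near 1.0) and dark ink (near 0.0).
gray = np.array([[0.9, 0.8, 0.1],
                 [0.9, 0.2, 0.9]])
binary = binarize(gray, threshold=0.5)
```

Note that with this convention the paper comes out as True (white) and the ink as False (black).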



The image below is the result of performing an opening operation on the binary image.



I repeated the same opening operation on the image, and this results in the following image.
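For reference, opening is an erosion followed by a dilation with the same structuring element. The sketch below implements it with plain NumPy on a small synthetic image. The square 3x3 element and the function names are assumptions for illustration, not the settings used to produce the images in this post.

```python
import numpy as np

def erode(binary, size=3):
    """Keep a pixel only if its whole size x size neighborhood is True."""
    binary = np.asarray(binary, dtype=bool)
    h, w = binary.shape
    padded = np.pad(binary, size // 2, constant_values=False)
    out = np.ones((h, w), dtype=bool)
    for dy in range(size):
        for dx in range(size):
            out &= padded[dy:dy + h, dx:dx + w]
    return out

def dilate(binary, size=3):
    """Set a pixel if any pixel in its size x size neighborhood is True."""
    binary = np.asarray(binary, dtype=bool)
    h, w = binary.shape
    padded = np.pad(binary, size // 2, constant_values=False)
    out = np.zeros((h, w), dtype=bool)
    for dy in range(size):
        for dx in range(size):
            out |= padded[dy:dy + h, dx:dx + w]
    return out

def opening(binary, size=3):
    """Erosion followed by dilation with the same square element."""
    return dilate(erode(binary, size), size)

# A 3x3 blob plus one isolated speck.
img = np.zeros((7, 7), dtype=bool)
img[1:4, 1:4] = True
img[5, 5] = True

opened = opening(img)
opened_twice = opening(opened)
```

The opening removes the speck and keeps the blob. Opening is also idempotent: with the same structuring element, `opened_twice` equals `opened`, so a second pass can only change the picture if a different element is used.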



Neither opening operation was successful. The text is not readable, and parts of the lines were even enhanced.

-o0o-
Thanks to Ed for the pointers and help with GIMP.

-o0o-
Rating: 5/10, since I was only able to remove the lines from the original image and was not successful in extracting the text. The resulting image was also very dirty.

1 comment:

Jing said...

Julie, the problem here was that your blobs of interest, the letters, were in black and the background was in white. Thus the opening and closing operations acted on the background. You would have to invert the binary image first. Also, a judicious choice of structuring element would be able to stitch together a letter divided by a line.
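The suggestion in this comment can be sketched as follows, in Python with NumPy. The example is synthetic: one vertical pen stroke that was cut where a ruled line was removed. The 3x1 vertical structuring element and all function names are assumptions chosen for illustration.

```python
import numpy as np

def _shifted_views(binary, footprint):
    """Yield one shifted copy of `binary` per True cell of `footprint`.

    Assumes the footprint has odd height and width and is symmetric.
    """
    fh, fw = footprint.shape
    h, w = binary.shape
    padded = np.pad(binary, ((fh // 2, fh // 2), (fw // 2, fw // 2)),
                    constant_values=False)
    for dy in range(fh):
        for dx in range(fw):
            if footprint[dy, dx]:
                yield padded[dy:dy + h, dx:dx + w]

def dilate(binary, footprint):
    return np.logical_or.reduce(list(_shifted_views(binary, footprint)))

def erode(binary, footprint):
    return np.logical_and.reduce(list(_shifted_views(binary, footprint)))

def closing(binary, footprint):
    """Dilation followed by erosion: bridges gaps smaller than the element."""
    return erode(dilate(binary, footprint), footprint)

# Binarized page: True = white paper, False = black ink.
page = np.ones((9, 7), dtype=bool)
page[1:8, 3] = False   # a vertical pen stroke
page[4, 3] = True      # gap left where a ruled line was removed

# Invert first, so that the ink is the foreground (True).
text = ~page

# A vertical element reaches across the horizontal gap.
vertical = np.ones((3, 1), dtype=bool)
closed = closing(text, vertical)
```

After the inversion the morphological operators act on the strokes rather than on the paper, and the closing rejoins the two halves of the stroke. The element has to be taller than the gap but small enough not to merge neighboring letters.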