“The term texture generally refers to a repetition of basic texture elements called texels. The texel contains several pixels, whose placement could be periodic, quasi-periodic or random. Natural textures are generally random, whereas artificial textures are often deterministic or periodic. Texture may be coarse, fine, smooth, granulated, rippled, regular, irregular, or linear.” (A. K. Jain: Fundamentals of Digital Image Processing)

My independent study on the texture analysis project is a continuation of a series of experiments that I started last semester. In the fall, I investigated black and white photographs depicting mostly natural scenes. My goal was to understand whether the analysis of texture, or the roughness of surfaces, could allow me to interpret an array of gray-scale intensities and separate the pictures into different regions. For my analysis, I mainly used statistical methods and the Fast Fourier Transform to describe the input.

This semester, I concentrated my attention on five new images that I acquired from a medical laboratory. [Figure 1] The black and white images depict rat abdomen segments with serious lesions on the lower portion of the (medically prepared) animal organ. Besides continuing my previous studies in describing and interpreting the notion and characteristics of “texture”, my plan was to give an interpretation of these clinical images. One example of the latter goal is to estimate the area of the damaged tissue. To distinguish the heavily damaged portions of the tissue from the rest of the image, the following sequence of steps has to be completed: separation of the object from the background; separation of the upper and lower parts of the organ (the lesion is visible only on the lower portion); and recognition and characterization of the small dark lesions. These procedures all involve several interesting problems related to computer vision.
Examples include line searching, image preprocessing, and intensity histogram evaluation.

It was towards the end of the fall semester that I started analyzing the output of the Fast Fourier Transform (FFT). As the experiments seemed promising, and I had not been able to become familiar with all aspects of that mathematical tool, I spent almost two months this spring creating test cases for my FFT implementation. I wanted to know how powerful this method could be as a first step of an image processing routine.

An image is a spatially varying function. The Fourier transform examines the spatial variations by decomposing the image function into sinusoidal (Fourier) components; the FFT is an efficient algorithm for computing its discrete form. It converts an intensity image (an image expressed by the gray-scale values of its pixels) into the domain of spatial frequencies. To assure myself of the correctness of the algorithm, I created some black and white images containing sine waves and tested the FFT code on them [Figure 2]. For example, in the case of vertical sine stripes, the image of the FFT output is a horizontal line. This line always points in the direction orthogonal to the main direction of the original image. This approach allowed me to understand the transform deeply and to see what effect a change of period or of the function coefficients has on the displayed images. Increasing the number of periods in the image (that is, shortening the period) increases the frequency, whereas changing the coefficients merely modifies the range of the function values. These tests also helped me form justifiable estimates of the outcomes I could expect for arbitrary photographs.

In my project, the FFT pictures mainly consist of the values 255 and 0. The explanation is the following: the output values of a Fourier transform vary over a huge range. For my images, the lowest value is always approximately -9000 and the highest value is in the 20000s.
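The sine-wave test described above can be reproduced with a short sketch. This is only an illustration and not the code used in the project: it swaps the FFT for a naive discrete Fourier transform (which produces the same coefficients, only more slowly) so that it needs nothing beyond the Python standard library. The function names, the 16 x 16 image size and the choice of two cycles are assumptions made for this example.

```python
import cmath
import math


def dft_1d(values):
    """Naive discrete Fourier transform of a sequence (O(n^2))."""
    n = len(values)
    return [
        sum(values[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def dft_2d(image):
    """2-D DFT: transform every row, then every column."""
    rows = [dft_1d(row) for row in image]
    cols = [dft_1d(list(col)) for col in zip(*rows)]
    # cols[u][v] is indexed by horizontal frequency u first;
    # transpose so the result is indexed as spectrum[v][u].
    return [list(r) for r in zip(*cols)]


def sine_stripes(size, cycles):
    """Vertical stripes: intensity follows a sine along x, constant along y."""
    row = [127.5 + 127.5 * math.sin(2 * math.pi * cycles * x / size)
           for x in range(size)]
    return [row[:] for _ in range(size)]


def significant_frequencies(spectrum, tol=1e-6):
    """Return (v, u) pairs whose magnitude exceeds tol times the largest one."""
    peak = max(abs(c) for row in spectrum for c in row)
    return [(v, u)
            for v, row in enumerate(spectrum)
            for u, c in enumerate(row)
            if abs(c) > tol * peak]


if __name__ == "__main__":
    spectrum = dft_2d(sine_stripes(16, 2))
    print(significant_frequencies(spectrum))
```

For vertical stripes, every significant coefficient has vertical frequency index 0, so all of the energy lies on a single horizontal line of the spectrum. Changing `cycles` moves the two side peaks along that line without leaving it.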
With such a wide output range, it is impossible to simply discard the extreme values and calculate with the rest of the data; too much information would be lost. As a consequence, after normalizing the values to the gray-scale range and typecasting the results into integers, the majority of the data becomes 0 or 255. That is why usually only two values are visible on the graphs.

I applied the Fast Fourier Transform in a block-wise manner. Instead of processing the input photograph as a whole, I divided it up into 300 sub-blocks, each of size 32 x 32 pixels. I was curious whether comparing the transforms of these sub-images would let me describe similarities and differences between different objects and surfaces. The results of this multiple FFT method clearly separated the background from the object and also indicated variations inside the observed tissue. [Figure 3] The background, with its low frequency content, is represented merely by a white dot in the middle of the x-y coordinate system. All other regions, with higher variance, produce a different FFT output.

The next problem to attack was to find a way to compare these results and describe their relationship to each other. At that time, I had just read about Pyramid Processing, and I thought that approach might be useful in my attempts. A resolution pyramid is a generalized image data structure consisting of the same image at several successively increasing levels of resolution. The value of each pixel at level j can be regarded as a weighted sum of the values of a small number of pixels at the next finer level. The higher the resolution is, the more samples are required to represent the increased amount of information. Hence, the higher the processing level is, the larger its size becomes. The finest level of the pyramid is the input image itself. The dimensions of the representation arrays typically increase by a factor of 2 between adjacent levels of the pyramid.
(In the traditional version of the pyramid, the relationship between a pixel and the pixels it summarizes is strictly one-to-four.) The idea behind this procedure is to complete complex calculations on the lower, coarser level and then refine the results while approaching the higher levels. There are two ways of implementing this approach. One can reduce the image to a lower level by sub-sampling the filtered image (e.g. selecting one random representative pixel out of every four) and then carry out the complex computations on the smaller image. Alternatively, the order of these two steps can be reversed. The choice of implementation should depend on the weight of the computations at each level.

I experimented with the first method. I took the original 480 x 640 black and white image and arrived at a 120 x 160 representative by twice applying the following averaging process. For each 2 x 2 block of four pixels in the original image, I calculated the average intensity, and this value represented that 2 x 2 region on the “first step” of the pyramid. I repeated the same method to obtain the final, second-level image. [Figure 4]

Before trying to analyze the low-level pyramid picture, I started implementing my algorithms on the higher level, where even the fine changes were visible. I submitted the regular-sized image, divided into 300 image blocks, to the FFT, and once the results were displayed on the screen, I applied one of my three comparison algorithms to describe the characteristics of the output. I tried to find all the image blocks that demonstrated similar properties and color them with the same gray-scale value. The different matching procedures are described in the following.

1. My first method is essentially the same as region growing: it tries to find all the image blocks that belong together. I began the operation in the upper left corner of the image. This image block is referred to as the “comparison base”. It is assigned the highest possible gray-scale value, 255.
After calculating the “distance” between the base and another image block, a color value is assigned to the second block. If the calculated distance exceeds a predetermined threshold, the color attribute of the second block becomes the base color minus a constant. Otherwise it remains unchanged. This procedure is repeated until the base block has been compared to all the other image blocks. Then I take the first image block that has a lower intensity value than the base, name it the new “base”, and call its color the “base color”. The process is applied recursively until no block with a lower intensity value (compared to the current base) can be found. [Figure 5]

Although the result of this procedure is satisfactory, and it does not depend on the location of the first base block, it demands a lot of CPU time. This inefficiency, however, can be corrected. Instead of decreasing the value of the compared block by a constant, it should be decreased in proportion to the computed distance. In this manner, it is enough to parse the input image only once (and not 255/constant times, as in the worst case of the original version). This enhancement has its drawbacks. If the first base block is not selected properly, the output of the comparison is less meaningful. As the title of “base block” is assigned only once, the starting base block has to reside in an area of interest to the experiment. Therefore, one should attempt to pick regions that are potentially rich in information (e.g. highly varying surfaces or borderlines of different objects) and try to avoid areas of low frequency (such as backgrounds or plain fields). It is extremely difficult to determine a completely general method for finding an adequate starting block on a yet unseen image.

2. My second algorithm, similarly to the one described above, assigns a specific color value to a whole image block. But instead of arbitrarily selecting this value, the color attribute is computed.
The algorithm focuses on intensity distribution. It calculates the average intensity of the block and estimates whether the FFT outputs are more dominant along the x or the y coordinate axis. This code provides a crude approximation for calculating similarities between images. Its output is meaningful, but it does not contain much information for further analysis of the image regions. [Figure 6]

3. The third method also calculates the distance between two image blocks. Here, however, each block is compared to only two other blocks: the neighbor to the right and the neighbor below. If their distance exceeds 15% of the sub-image content, a borderline is drawn between the blocks. (The limit can be calculated as 15% of the total number of pixels in the image block, as the graph essentially consists of two gray-scale values: 0 and 255.) This method again clearly separates the object of primary interest from the background, but a finer comparison step is still missing. [Figure 7]

At this stage of the experiments, it seemed difficult to make progress by merely applying the Fast Fourier Transform. Therefore, I started looking for other methods. The simplest procedure that could quickly assist me in drawing a borderline around the examined object was to determine a threshold value ad hoc and paint every pixel below that value with “x” and all the other pixels with “y”. (The values of x and y are non-negative integers between 0 and 255.) This technique can provide bits of useful information, but a lot of detail remains hidden. The fixed threshold value is also a serious limiting factor. For a more general algorithm, the threshold value has to be calculated from the characteristics of the current image.

In order to gain more information about the original image and its intensity distribution, one can construct a gray-scale histogram.
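As a rough illustration, and not the code used in the project, the threshold painting described above and the histogram that it motivates could be sketched in Python as follows. The function names, the default paint values 0 and 255, and the use of the mean intensity as an image-dependent threshold are assumptions made for this example; the image is assumed to be a list of rows of integers between 0 and 255.

```python
def gray_histogram(image, levels=256):
    """Count how many pixels hold each intensity value 0..levels-1."""
    counts = [0] * levels
    for row in image:
        for value in row:
            counts[value] += 1
    return counts


def paint_by_threshold(image, threshold, below=0, other=255):
    """Paint pixels below the threshold with `below`, all others with `other`."""
    return [[below if value < threshold else other for value in row]
            for row in image]


def mean_threshold(counts):
    """One possible image-dependent threshold: the mean intensity.

    This is an illustrative choice, not the rule used in the project.
    """
    total = sum(counts)
    return sum(level * n for level, n in enumerate(counts)) / total


if __name__ == "__main__":
    image = [[10, 10, 200], [200, 250, 10]]
    counts = gray_histogram(image)
    print(paint_by_threshold(image, mean_threshold(counts)))
```

Computing the threshold from the histogram, rather than fixing it in advance, is what lets the same painting step adapt to images with different intensity distributions.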
A gray-level histogram is a function that gives the frequency of occurrence of each intensity value in the input image. The intensity histogram is a 256-element integer array (one entry for each intensity from 0 to 255) that contains, at each element, the number of pixels holding that value in the image. In the case of the rat images, one can observe that the majority of the picture elements form clusters around a small percentage of the intensity values. (Chart 1) This means that the images could, in fact, be described using only a limited set of gray-scale values. Examining the intensity histograms of the rat images, it is clearly visible that three values would suffice for a general characterization of the input photograph. The two local minima that separate the most “favored” intensity values are also clearly visible (around 162 and 225 on Chart 1), but it is a great challenge to find them computationally.

To ease the calculation task and emphasize the high frequency (more “interesting”) portions of the input array, I applied an image enhancement technique. Image enhancement techniques are used to focus on and sharpen image features for display and analysis. They can be applied during pre- or post-processing. In computer vision specifically, they are used as a pre-processing tool, for example to strengthen the edges of an object. By nature, these operations are application specific and require knowledge of the problem domain. Therefore, they have to be developed empirically. Enhancement methods operate in the spatial domain by manipulating the pixel data and/or in the frequency domain by modifying spectral components.
The operations can be grouped into three major groups:

point operations: each pixel is modified according to an algorithm that does not depend on any other picture element’s intensity value
mask operations: the pixels are modified according to the intensity values of the neighboring pixels
global operations: the pixel values are changed by taking all the pixel values into consideration

Gray-level scaling procedures belong to the category of point operations and change the pixel values by means of a mapping equation. A mapping equation is generally linear and maps the original gray-scale values to other, specified intensities. Some of the most frequent operations are shrinking, stretching and intensity-scale slicing. They are most frequently used for feature and contrast enhancement.

An alternative procedure to gray-level scaling is histogram modification. Stretching and shrinking operations also exist in this case, and they are often referred to as histogram scaling. In general, an image with a wide spread of intensities has high contrast, whereas an image whose histogram is clustered at the low end of the range is dark, and a histogram with its values gathered at the high end of the range corresponds to a bright image. The rat pictures belong to the latter category. Because of the high percentage of white background, a great number of pixels gather at the high end of the range. Some histogram modification algorithms are histogram slide (a modification that retains the original relationship between the picture elements), histogram equalization (a technique for making the intensity distribution as uniform as possible) and histogram specification (an interactive way of histogram manipulation). These histogram manipulation operations usually improve the detectability of picture features by expanding the contrast.

In my project, I defined a histogram modification algorithm myself.
It is called "color spreading", and it combines a clipping and a stretching operation. Stretching a histogram increases the contrast of a low contrast image. (The shape of the original histogram, that is, the relative distribution of the gray-scale intensities, remains the same.) The clipping step is also necessary, as there are some so-called “outliers”. This expression refers to a small set of values that forces the histogram to span the entire range. Clipping a small percentage of the picture element values at the low and high ends of the range can, in this case, be extremely effective.

My mapping equation stretches the values of a subinterval to the whole interval. To make the distribution more even over the whole interval of intensity values, though, I apply the clipping algorithm first. I trim off a certain percentage of the gray-scale values from the two ends. I set the min and max values for the algorithm at the gray-scale values where the number of pixels, summed from the corresponding end of the interval, exceeds the given percentage of the total number of pixels. Then the gray-scale intensities are recalculated to cover the 0-255 range. [Figure 8 & Chart 2] The resulting histogram should theoretically be continuous; however, because of truncation, every third value evaluates to zero. This makes the already difficult local min-max search even more convoluted.

[Figure 10] shows the output of the color spreading operation followed by the 3_pixel_comparison to draw the borderlines. Comparing this image with [Figure 7], one can see that the drastic differences between some of the sub-image blocks decreased due to the histogram manipulation, and the borderlines between those blocks are no longer drawn.

After a lot of trials, I found a way to approximate two local minima on the histogram that gives sufficient results for all 5 rat images.
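The clipping and stretching steps of the color spreading operation described above might be sketched as follows. This is a minimal reconstruction from the description, not the original implementation: the function names, the default clipping fraction of 1% and the integer rounding rule are assumptions.

```python
def clip_limits(counts, fraction):
    """Find the gray levels at which the cumulative pixel count, taken from
    each end of the histogram, first exceeds `fraction` of all pixels."""
    cutoff = fraction * sum(counts)
    low, running = 0, 0
    for level, n in enumerate(counts):
        running += n
        if running > cutoff:
            low = level
            break
    high, running = len(counts) - 1, 0
    for level in range(len(counts) - 1, -1, -1):
        running += counts[level]
        if running > cutoff:
            high = level
            break
    return low, high


def spread_colors(image, fraction=0.01, top=255):
    """Clip the outliers at both ends, then stretch the rest to 0..top."""
    counts = [0] * (top + 1)
    for row in image:
        for value in row:
            counts[value] += 1
    low, high = clip_limits(counts, fraction)
    if high <= low:
        # Degenerate histogram: nothing to stretch.
        return [row[:] for row in image]
    return [
        [(min(max(value, low), high) - low) * top // (high - low)
         for value in row]
        for row in image
    ]
```

With integer arithmetic, stretching fewer than 256 input levels over the 0-255 range leaves some output levels unused, so the stretched histogram contains empty entries. This may be related to the zero values mentioned above, although the exact pattern depends on the width of the clipped interval.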
Using these two minima, I manage to paint my images with three colors: black, gray and white. [Figure 9] Then I apply my earlier 3_pixel_comparison algorithm to paint a dividing borderline between all neighboring image pixels that have different values. The last step is to color all the borderlines black and the rest of the picture white. One of the resulting images can be seen in [Figure 11]. With this method, I can not only distinguish the rat abdomen from the background but also find an approximately continuous borderline around it. From this stage, the next step is to separate the lower portion from the upper part with a line and eliminate everything else in the picture. Then one should analyze the original pixel array for that restricted data set to locate the lesions and calculate their area.

The most recent problem that I was working on is noise removal. The need for this operation occurred to me after finally obtaining the rough outline of the examined objects. [Figure 11] The operation is necessary because individual pixels or pixel groups can often satisfy the coloring criteria without holding information relevant to my analysis. Hence, my intention is to clear them from my output display. It is a complicated problem to describe and locate these noise values, because it is impossible to predict their shape, size and relationship to the “true” values in advance.

I have two functions written already: clearImage and clearSpots. In both cases, I use a window to parse the whole image under analysis. The “clearImage” function handles a 3 x 3 window. The examined pixel is located in the top left corner. If that particular pixel is black (indicating a borderline between different objects/regions) but all the other pixels in the window belong to the background, the examined pixel is deleted, that is, it is assigned the background intensity value. The “clearSpots” algorithm applies a bigger window that is 7 x 7 in size.
If there are any pixels in the 3 x 3 center of this block that differ from the background color while all the surrounding picture elements belong to the background, the middle portion is cleared. Both of these methods attack only small-sized noise. They are well suited to final output refinement, but they are not sufficient for deleting bigger, isolated blocks of picture elements. In my texture analysis, these two methods are called one after another. First the bigger window is applied, and then the “clearImage” function deletes all the leftovers. [Figure 12] I am satisfied with their performance. However, I am still looking for an operation to separate the borderline of the area covered with the lesions.

As I have already hinted in the sections above, I encountered a lot of difficult problems throughout this semester. In the following, I will describe two of these in particular: the threshold and the min-max search.

I always had concerns about using threshold values in my computations. Applying a threshold value seems to weaken an algorithm, as it implies some background knowledge about the image being processed. If I intend to write general code (a program that can be used with photographs different from my 5 clinical images), counting on image-specific features would be extremely limiting. For example, in the case of my rat images I could break down the histogram into three different sections that represented the regions fairly well. However, applying the same operation to another image, for example a natural landscape, does not work properly. The latter scene could have (and probably does have) a lot more variation overall, and the majority of the picture is of higher frequency. Hence, it is not correct for me to assume that there exist three main gray-scale values that could represent the image at a base level. Therefore, I intended to create some algorithms that could be applied to a wide range of images.
These functions compute the threshold and limit values from features of the original input photograph. One example is the comparison method called compareFFTs. Here, I capture the similarities between image blocks implicitly. I calculate three characteristic features of an image (average value, row distribution and column distribution of white pixels), and then assign a gray-scale value computed from these results to each image block. [Figure 6] On the other hand, I have to admit that if I am working with only a small set of images, it is unreasonable to give up any extra information that could improve my analysis.

The other problem that occupied my mind for a long time was the local min-max search on a histogram. How can I determine the critical values on a given interval after I have found one local minimum or maximum? How can I be assured that a “high” value does not belong to the same sub-curve where the other maximum was found? My search became even more difficult because, after the image pre-processing technique (which modifies the histogram distribution), the histogram became discrete. Hence, I had to disregard certain values in my comparison and could no longer assume the properties of a continuous function. I tried to characterize certain intervals instead of just one array element in the search, but this method did not turn out to be much more useful either. Because of truncation and typecasting, the values belonging to an array element might change drastically (not smoothly), so the characteristics of an interval are more complicated to predict as well. Finally, my solution to the problem was to use some heuristics. I implemented some test cases that examine the neighborhood of the potential critical points and then decide whether a candidate is likely to qualify (at least theoretically) as a local minimum or maximum on the given interval.
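One way to picture a neighborhood test of this kind is the sketch below. It is an assumption-laden illustration and not the heuristic used in the project: it first smooths the histogram with a moving average to bridge isolated zero entries, then accepts a gray level as a local minimum only if it is the lowest value within a window on both sides. The function names, the smoothing radius and the window size are all hypothetical.

```python
def smooth(counts, radius=2):
    """Moving average that bridges isolated gaps in a discrete histogram."""
    n = len(counts)
    out = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        out.append(sum(counts[lo:hi]) / (hi - lo))
    return out


def find_valleys(counts, radius=2, window=10):
    """Return gray levels that qualify as local minima of the smoothed curve.

    A level qualifies if no value within `window` levels to its left is
    lower, every value within `window` levels to its right is higher, and
    the left neighborhood is not completely flat (which rejects the empty
    stretches at the ends of the histogram).
    """
    s = smooth(counts, radius)
    valleys = []
    for i in range(window, len(s) - window):
        left = s[i - window:i]
        right = s[i + 1:i + window + 1]
        if s[i] <= min(left) and s[i] < min(right) and s[i] < max(left):
            valleys.append(i)
    return valleys
```

A larger `window` makes the test stricter, so that a small dip on the flank of a peak is not mistaken for a separating minimum; the price is that two genuine minima closer together than the window cannot both be found. The sketch only illustrates the idea of judging a candidate by its neighborhood; the heuristic described above remains the method that was actually applied to the rat images.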
My heuristic performs well on the set of images that I possess right now; however, I have not had a chance to test it thoroughly on other types of photographs.

In summary, although I have not been able to fulfill all my plans this semester, I became aware of many new image-processing methods and managed to obtain some essential information by processing the input photographs. The final steps of my image interpretation process are still missing; nevertheless, I have created a collection of techniques (image preprocessing, the Pyramid Process, histogram manipulation, etc.) that could prove useful in my further studies and experiments, too.

If I were to continue the texture analysis project, I know exactly what I would do next. As the three-color histogram analysis and the noise clearing functions already provide a good approximation of the lower abdomen, I would carry on the project from their output images. I would try to use the FFT results to get rid of the bigger unwanted groups of pixels. (This is because the FFT proved to be a good estimator of the boundaries.) Also, applying the Fast Fourier Transform to smaller sub-images (e.g. 8 x 8) might reveal even finer details. After obtaining the continuous boundary line of the lower abdomen, I would restrict my image processing window to a size that just covers the indicated region. (In this way I could save a lot of unnecessary CPU calculations on the background.) Then I would start analyzing the inner region using the original pixel values. Throughout that process I might be able to incorporate the second-step pyramid image, too. Carrying out calculations on that level, especially with the reduced region, would be extremely fast. On the higher resolution image representatives, I would only have to refine the results. After locating the lesions in the examined area, I would have to count the pixels belonging to these damaged areas.
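The two-step averaging pyramid on which these coarse-level calculations would run can be sketched as follows. This is a minimal illustration under stated assumptions rather than the project code: the function names are hypothetical, the image is a list of rows of integer gray values, and the average is rounded down to an integer.

```python
def reduce_level(image):
    """Halve both dimensions by averaging each 2 x 2 block of pixels."""
    height, width = len(image), len(image[0])
    return [
        [
            (image[y][x] + image[y][x + 1]
             + image[y + 1][x] + image[y + 1][x + 1]) // 4
            for x in range(0, width - 1, 2)
        ]
        for y in range(0, height - 1, 2)
    ]


def build_pyramid(image, steps=2):
    """Return [finest, ..., coarsest]: the input followed by `steps` reductions."""
    levels = [image]
    for _ in range(steps):
        levels.append(reduce_level(levels[-1]))
    return levels
```

Applied to an image of 480 rows and 640 columns, two reductions give 240 x 320 and then 120 x 160, so the coarsest level holds one sixteenth of the original pixels. That reduction is why calculations on the second-step image would be so much cheaper than on the full-resolution one.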