The sections of this chapter are arranged as follows:
Section 5.1 introduces the main contents of the study and application of texture, as well as the basic principles and methods of texture analysis and classification.
Section 5.2 describes the statistical methods for texture description. In addition to the commonly used gray-level co-occurrence matrix and the texture descriptors based on co-occurrence matrix, energy-based texture descriptors are introduced.
Section 5.3 discusses the structural approach to texture description. The basic method includes two components: the texture primitives and the arrangement rules. Texture mosaic is a typical method, and in recent years, the local binary pattern (LBP) has also been used.
Section 5.4 focuses on methods of describing texture using spectra. The Fourier spectrum, Bessel–Fourier spectrum, and Gabor spectrum are discussed.
Section 5.5 introduces the ideas and methods of texture segmentation. Both supervised and unsupervised texture segmentation are discussed. A method that uses wavelets for texture classification is also presented.
Texture is commonly found in natural scenes, particularly in outdoor scenes containing both natural and man-made objects. Sand, stones, grass, leaves, and bricks all create a textured appearance in an image. Textures must be described by more than just their object classifications (Shapiro, 2001).
The word “texture” has a rich range of meanings. Webster’s dictionary defines texture as “the visual or tactile surface characteristics and appearance of something.” As an illustration, consider the texture of a reptile skin. Apart from being smooth (the tactile dimension), it also has cellular and specular markings on it that form a well-defined pattern (the visual dimension).
Texture has different meanings in different application areas. In the image community, texture is a somewhat loosely defined term (Shapiro, 2001) and is considered a phenomenon that is widespread, easy to recognize, and hard to define (Forsyth, 2003). In fact, a number of explanations/definitions exist, such as “texture is characterized by tonal primitive properties as well as spatial relationships between them” (Haralick, 1992) and “texture is a detailed structure in an image that is too fine to be resolved, yet coarse enough to produce a noticeable fluctuation in the gray levels of the neighboring cells” (Horn, 1986). A more general view (Rao, 1990) considers texture to be “the surface markings or a 2-D appearance of a surface.”
Although no formal definition of texture exists, intuitively it provides measurements of properties such as smoothness, coarseness, and regularity (Gonzalez, 2002). Texture gives us information about the spatial arrangement of the color or intensities in an image (Shapiro, 2001). Many images contain regions characterized not so much by a unique value of brightness, but by a variation in brightness that is often called texture. It refers to the local variation in brightness from one pixel to the next or within a small region (Russ, 2002).
In discussing texture, one important factor is scale. Typically, whether an effect is referred to as texture or not depends on the scale at which it is viewed. A leaf that occupies most of an image is an object, but the foliage of a tree is a texture (Forsyth, 2003). For any textural surface, there exists a scale at which, when the surface is examined, it appears smooth and texture-less. Then, as resolution increases, the surface appears as a fine texture and then a coarse one. For multiple-scale textural surfaces, the cycle of smooth, fine, and coarse may repeat.
Texture can be divided into microtexture and macrotexture based on the scale (Haralick, 1992). When the gray-level primitives are small in size and the spatial interaction between gray-level primitives is constrained to be local, the resulting texture is a microtexture. The simplest example of a microtexture occurs when independent Gaussian noise is added to each pixel’s value in a smooth gray-level area. As the Gaussian noise becomes more correlated, the texture becomes more of a microtexture. Finally, when the gray-level primitives begin to have their own distinct shape and regular organization (identifiable shape properties), the texture becomes a macrotexture.
In addition, from the scale point of view, gray level and texture are not independent concepts. They bear an inextricable relationship to each other, very much as a particle and a wave do. Whatever exists has both particle and wave properties, and depending on the situation, either particle or wave properties may predominate. Similarly, in the image context, both gray level and texture are present, although at times one property can dominate the other, and one often tends to speak of only the gray level or only the texture. Hence, when explicitly defining gray level and texture, one is defining not two concepts but a single gray-level–texture concept.
From the research point of view, there are four broad categories of work to be considered:
1.Texture classification
This category is related to the identification and description of 2-D texture patterns.
2.Texture segmentation
This category is concerned with using texture as a means to perform segmentation of an image, that is, to break an image into components within which the texture is constant.
3.Texture synthesis
This category seeks to construct large regions of texture from small example images.
4.Shape from texture
This category involves recovering surface orientation or surface shape from image texture, or using texture as a cue to retrieve information about surface orientation and depth.
From the application point of view, texture analysis plays an important role in many areas, such as
1.Distinguish hill from mountain, forest from fruits, etc., in remote sensing applications.
2.Inspect surfaces in the manufacturing of, for example, semiconductor devices.
3.Perform microanalysis for the nucleolus of cells, isotropy, or anisotropy of material, etc.
4.Recognize specimens in petrography and metallography.
5.Study metal deformation based on the orientations of its grains.
6.Visualize flow motion in biomedical engineering, oceanography, and aerodynamics.
Technically, many approaches for texture analysis have been proposed. Three principal approaches used to describe the texture of a region are statistical, structural, and spectral (Gonzalez, 2002).
Statistical approaches yield characterizations of textures as smooth, coarse, grainy, and so on. In statistical approaches, texture is a quantitative measurement of the arrangement of intensities in a region (Shapiro, 2001). The statistical model usually describes texture by statistical rules governing the distribution and relation of gray levels. This works well for many natural textures that have barely discernible primitives (Ballard, 1982). The goal of a statistical approach is to estimate parameters of some random process, such as fractal Brownian motions, or Markov random fields, which could have generated the original texture (Russ, 2002).
Structural approaches deal with the arrangement of image primitives. In a structural approach, texture is a set of primitive texels in some regular or repeated relationship (Shapiro, 2001). Structural approaches try to describe a repetitive texture in terms of the primitive elements and placement rules that describe geometrical relationships between these elements (Russ, 2002).
Spectral approaches are based on properties of the Fourier spectrum and are used primarily to detect global periodicity in an image by identifying high-energy, narrow peaks in the spectrum.
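The idea of detecting global periodicity from a narrow, high-energy spectral peak can be sketched as follows. This is a minimal illustration using NumPy's FFT; the function name `dominant_frequency` and the synthetic stripe image are assumptions made for the example, not part of the original text.

```python
import numpy as np

def dominant_frequency(image):
    """Return (fy, fx), in cycles per image, of the strongest non-DC spectral peak."""
    img = np.asarray(image, dtype=float)
    # Subtract the mean so that the DC term does not dominate the spectrum.
    spectrum = np.abs(np.fft.fft2(img - img.mean()))
    # Move the zero-frequency bin to the center of the array.
    spectrum = np.fft.fftshift(spectrum)
    peak = np.unravel_index(np.argmax(spectrum), spectrum.shape)
    center = (img.shape[0] // 2, img.shape[1] // 2)
    return int(peak[0] - center[0]), int(peak[1] - center[1])

# Synthetic vertical-stripe texture: 8 cycles across a 64-pixel-wide image.
y, x = np.mgrid[0:64, 0:64]
stripes = np.sin(2 * np.pi * 8 * x / 64)
fy, fx = dominant_frequency(stripes)
```

Because the spectrum of a real image is symmetric, the peak appears as a pair at opposite frequencies, so only the magnitude of `fx` is meaningful here.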
Depending on the scale, different approaches have to be used. The basic interrelationships in the gray-level–texture concept are the following (Haralick, 1992). When a small-area patch of an image has little variation of gray-level primitives, the dominant property of that area is gray level. When a small-area patch has a wide variation of gray-level primitives, the dominant property is texture. Crucial in this distinction are the size of the small-area patches, the relative sizes and types of gray-level primitives, and the number and placement or arrangement of the distinguishable primitives. As the number of distinguishable gray-level primitives decreases, the gray-level properties will predominate. In fact, when the small-area patch is only a pixel, so that there is only one discrete feature, the only property is simply the gray level. As the number of distinguishable gray-level primitives increases within the small-area patch, the texture property will predominate. When the spatial pattern in the gray-level primitives is random and the gray-level variation between primitives is wide, a fine texture results. When the spatial pattern becomes more definite and the gray-level regions involve more and more pixels, a coarser texture results.
In Figure 5.1, the pattern in (a) consists of many small texture elements. It is better analyzed statistically without regard to the texture elements. The pattern in (b) consists of large texture elements. It is better analyzed structurally based on the texture elements. The pattern in (c) consists of many small texture elements that form local clusters. The clusters are better detected statistically by image segmentation without regard to the texture elements, and the pattern is better analyzed structurally based on the clusters. The pattern in (d) consists of large texture elements that form local clusters. The clusters are better detected structurally by grouping texture elements, and the pattern is better analyzed structurally based on the clusters.
Statistical approaches can be connected to statistical pattern recognition paradigms that divide the texture problem into two phases: training and testing. During a training phase, feature vectors from known samples are used to partition the feature space into regions representing different classes. During a testing phase, the feature-space partitions are used to classify feature vectors from unknown samples (Ballard, 1982).
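The two phases can be sketched with a very simple classifier. The example below uses a nearest-mean rule, which is only one of many possible ways to partition the feature space and is chosen here purely for brevity; the feature values and class names are hypothetical.

```python
import numpy as np

def train(features, labels):
    """Training phase: summarize each class by the mean of its feature vectors."""
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    classes = np.unique(labels)
    means = np.stack([features[labels == c].mean(axis=0) for c in classes])
    return classes, means

def classify(classes, means, features):
    """Testing phase: assign each vector to the class with the nearest mean."""
    features = np.asarray(features, dtype=float)
    # Pairwise Euclidean distances, shape (n_samples, n_classes).
    dists = np.linalg.norm(features[:, None, :] - means[None, :, :], axis=2)
    return classes[np.argmin(dists, axis=1)]

# Hypothetical 2-D texture feature vectors from known samples.
train_x = [[0.1, 0.2], [0.2, 0.1], [0.9, 0.8], [0.8, 0.9]]
train_y = ["fine", "fine", "coarse", "coarse"]
classes, means = train(train_x, train_y)

# Feature vectors from unknown samples.
predicted = classify(classes, means, [[0.15, 0.15], [0.85, 0.95]])
```

With a nearest-mean rule, the class regions are separated by the perpendicular bisectors between class means; other classifiers would partition the same feature space differently.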
A co-occurrence matrix can be computed in the following way (Gonzalez, 2002). Let P be a position operator and C be a K × K matrix whose element cij is the number of times that points with gray level gi occur, in the position specified by P, relative to points with gray level gj, where 1 ≤ i, j ≤ K and K is the number of distinct gray levels in the image.
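More generally, let S be a set of pixel pairs that have a given spatial relationship. The element of the gray-level co-occurrence matrix is then defined by (Haralick, 1992)
$$c(g_1, g_2) = \frac{\#\{[(x_1, y_1), (x_2, y_2)] \in S \mid f(x_1, y_1) = g_1 \ \text{and}\ f(x_2, y_2) = g_2\}}{\#S}$$
where f(x, y) is the gray level at pixel (x, y) and # denotes the number of elements in a set.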
The numerator on the right side of the equation is the number of pixel pairs that have a spatial relationship and have gray values g1 and g2, respectively. The denominator on the right side of the equation is the total number of pixel pairs. Thus, the obtained gray-level co-occurrence matrix C is normalized.
Consider an image with three gray levels, g1 = 0, g2 = 1, and g3 = 2, as follows:
where c11 (top left) is the number of times that a point with level g1 = 0 appears at the location one pixel below and to the right of a pixel with the same gray level, and c13 (top right) is the number of times that a point with level g1 = 0 appears at the location one pixel below and to the right of a point with gray level g3 = 2. The size of C is determined by the number of distinct gray levels in the input image. Thus, application of the concepts discussed here usually requires that intensities be re-quantized into a few gray-level bands in order to keep the size of C manageable.
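The counting procedure can be sketched in a few lines. In the sketch below, the function name `cooccurrence`, its parameters, and the 5 × 5 three-level test image are illustrative assumptions; the indexing follows the convention stated above, in which element (i, j) counts occurrences of level gi at the given offset from a pixel of level gj.

```python
import numpy as np

def cooccurrence(image, dx, dy, levels, normalize=False):
    """Count how often level i occurs at offset (dx, dy) from level j.

    dx is the column offset (right is positive) and dy is the row offset
    (down is positive). Entry [i, j] counts pixels of level i located at
    that offset from a pixel of level j.
    """
    img = np.asarray(image)
    rows, cols = img.shape
    C = np.zeros((levels, levels), dtype=float)
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                C[img[r2, c2], img[r, c]] += 1
    total = C.sum()
    if normalize and total > 0:
        C /= total  # divide by the total number of pixel pairs
    return C

# Illustrative image with three gray levels 0, 1, 2 (an assumed example).
image = [[0, 0, 0, 1, 2],
         [1, 1, 0, 1, 1],
         [2, 2, 1, 0, 0],
         [1, 1, 0, 2, 0],
         [0, 0, 1, 0, 1]]

# Position operator: one pixel to the right and one pixel below.
C = cooccurrence(image, dx=1, dy=1, levels=3)
P = cooccurrence(image, dx=1, dy=1, levels=3, normalize=True)
```

The explicit double loop is slow for large images but keeps the correspondence with the definition easy to follow; a vectorized version would slice the image into two shifted views instead.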
Different images can have different co-occurrence matrices because they have different texture scales. This is the basis for using a gray-level co-occurrence matrix to work out texture descriptors. Figure 5.2 shows a texture image with a small scale and several of its gray-level co-occurrence matrices. Figure 5.3 shows a texture image with a large scale and several of its gray-level co-occurrence matrices. In both groups of figures, (a) is the original image, and (b)–(e) are the gray-level co-occurrence matrices C(1,0), C(0,1), C(1,–1), and C(1,1), respectively. Comparing the two figures, it can be seen that, because the gray level changes quickly in space for the image with a small scale, the entries of its co-occurrence matrices are quite dispersed. In contrast, the entries of the co-occurrence matrices for the image with a large scale are concentrated around the main diagonal. The reason is that, for the image with a large scale, the two pixels in a pair (which are close to each other) tend to have similar gray-level values. In fact, the gray-level co-occurrence matrix captures the spatial information on the relative positions of pixels with different gray levels.
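The effect of texture scale on the co-occurrence matrix can be checked numerically. The sketch below is only an illustration under stated assumptions: it uses two synthetic images (independent random pixels for a small-scale texture, constant 8 × 8 blocks for a large-scale texture) rather than the images of Figures 5.2 and 5.3, and the helper names `glcm` and `near_diagonal_mass` are hypothetical.

```python
import numpy as np

def glcm(image, dx, dy, levels):
    """Normalized co-occurrence matrix for the offset (dx, dy), with dx, dy >= 0."""
    img = np.asarray(image)
    rows, cols = img.shape
    ref = img[:rows - dy, :cols - dx]   # reference pixels
    nbr = img[dy:, dx:]                 # pixels at the given offset
    C = np.zeros((levels, levels))
    # Accumulate one count per pixel pair (np.add.at handles repeated indices).
    np.add.at(C, (nbr.ravel(), ref.ravel()), 1)
    return C / C.sum()

def near_diagonal_mass(C, width=1):
    """Fraction of pixel pairs whose gray levels differ by at most `width`."""
    i, j = np.indices(C.shape)
    return float(C[np.abs(i - j) <= width].sum())

rng = np.random.default_rng(0)
levels = 8
# Small-scale texture: the gray level changes independently at every pixel.
fine = rng.integers(0, levels, size=(64, 64))
# Large-scale texture: the gray level is constant within 8 x 8 blocks.
blocks = rng.integers(0, levels, size=(8, 8))
coarse = np.repeat(np.repeat(blocks, 8, axis=0), 8, axis=1)

mass_fine = near_diagonal_mass(glcm(fine, 1, 0, levels))
mass_coarse = near_diagonal_mass(glcm(coarse, 1, 0, levels))
```

For the block image, most horizontally adjacent pairs lie inside a single block, so most of the matrix mass falls on or near the main diagonal, whereas for the random image it is spread over the whole matrix.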
A simple generalization of the primitive gray-level co-occurrence approach is to consider more than two pixels at a time. This is called the generalized gray-level spatial dependence model for textures (Haralick, 1992).
Given a specific kind of spatial neighborhood and a subimage, it is possible to parametrically estimate the joint probability distribution of the gray levels over the neighborhoods in the subimage. The prime candidate distribution for the parametric estimation is a multivariate normal distribution. If x1,..., xN represent the N K-dimensional normal vectors coming from the neighborhoods in a subimage, then the mean vector μ and covariance matrix V can be estimated by
where 1 is a column vector whose components are all of the value 1.
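A minimal sketch of such an estimate is given below. It is not a reproduction of the estimator in the text: it uses the ordinary sample mean and the sample covariance normalized by N, and offers, as an assumption suggested by the all-ones vector, an optional variant in which every component of the mean takes the same scalar value. The function names are hypothetical.

```python
import numpy as np

def neighborhood_vectors(image, size):
    """Stack every size x size neighborhood as a row of K = size * size gray levels."""
    img = np.asarray(image, dtype=float)
    windows = np.lib.stride_tricks.sliding_window_view(img, (size, size))
    return windows.reshape(-1, size * size)

def estimate_normal(vectors, stationary_mean=False):
    """Estimate the mean vector and covariance matrix of the row vectors."""
    X = np.asarray(vectors, dtype=float)
    N, K = X.shape
    if stationary_mean:
        # Assumed variant: mu = mu0 * 1, one scalar shared by all K components.
        mu = np.full(K, X.mean())
    else:
        mu = X.mean(axis=0)
    D = X - mu
    V = D.T @ D / N   # K x K covariance, normalized by N
    return mu, V

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(16, 16))
X = neighborhood_vectors(image, 3)   # N = 14 * 14 vectors, K = 9
mu, V = estimate_normal(X)
mu_s, V_s = estimate_normal(X, stationary_mean=True)
```

Overlapping neighborhoods are used here, so the N vectors are not independent samples; this is acceptable for a descriptive estimate but matters if the estimate is used for statistical testing.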
To define the concept of generalized co-occurrence, it is necessary first to decompose an image into its primitives. Let Q be the set of all primitives in the image. Then, primitive properties, such as the mean gray level, the variance of gray levels, the region size, and the region shape, are measured. Let T be the set of primitive properties and f be a function that assigns a property in T to each primitive in Q. Finally, a spatial relation between primitives, such as distance or adjacency, must be specified. Let S ⊆ Q × Q be the binary relation pairing all primitives that satisfy the spatial relation. The element of the generalized co-occurrence matrix Cg is defined by
$$c_g(t_1, t_2) = \frac{\#\{(q_1, q_2) \in S \mid f(q_1) = t_1 \ \text{and}\ f(q_2) = t_2\}}{\#S}$$
where cg(t1, t2) is just the relative frequency with which two primitives occur with specified spatial relationships in the image, one primitive having the property t1 and the other having the property t2.
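The relative-frequency computation can be sketched once the primitives, their properties, and the spatial relation are available. In the example below, the primitive identifiers, the two property values, and the adjacency pairs are all hypothetical; extracting real primitives from an image is outside the scope of this sketch.

```python
from collections import Counter

def generalized_cooccurrence(properties, relation):
    """Relative frequency of property pairs over related primitive pairs.

    properties: dict mapping each primitive q to its property f(q)
    relation:   iterable of ordered pairs (q1, q2), the binary relation S
    """
    pairs = list(relation)
    if not pairs:
        return {}
    counts = Counter((properties[q1], properties[q2]) for q1, q2 in pairs)
    total = len(pairs)
    return {key: n / total for key, n in counts.items()}

# Hypothetical primitives labeled by a quantized mean-gray-level property.
f = {"a": "dark", "b": "bright", "c": "dark", "d": "bright"}
# Hypothetical spatial relation S: ordered pairs of adjacent primitives.
S = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "c")]
cg = generalized_cooccurrence(f, S)
```

A dictionary keyed by property pairs is used instead of a dense matrix because the property set T need not consist of consecutive integers.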
To analyze a given C matrix in order to categorize the texture of the region over which C was computed (C has captured properties of a texture but is not directly useful for further analysis, such as comparing two textures), many derived descriptors have been defined. A few examples are shown as follows:
1.Texture uniformity (also called the second-order moment; it reaches its lowest value when the cij are all equal and its highest value when a single element dominates)
$$\sum_i \sum_j c_{ij}^2$$
2.Texture entropy (it is a measurement of randomness, achieving its highest value when all elements of C are maximally random)
$$-\sum_i \sum_j c_{ij} \log c_{ij}$$
3.Element difference moment of order k (it has a relatively low value when the high values of C are near the main diagonal)
$$\sum_i \sum_j |i - j|^k \, c_{ij}$$
It is called texture contrast when k equals 1.
4.Inverse element difference moment of order k (it has a relatively high value when the high values of C are near the main diagonal)
$$\sum_i \sum_j \frac{c_{ij}}{|i - j|^k}, \quad i \neq j$$
To avoid the problem caused by i = j, an alternative definition used for k equaling 1, called texture homogeneity, is given by (d is a positive constant)
$$\sum_i \sum_j \frac{c_{ij}}{d + |i - j|}$$
Figure 5.4 shows five texture images. Their values of texture uniformity, entropy, contrast, and homogeneity are given in Table 5.1. It can be seen that the value of d has an important influence on texture homogeneity.
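The descriptors listed above can be computed directly from a co-occurrence matrix. The sketch below uses commonly written forms of these descriptors, which are assumptions here: the sum of squared elements for uniformity, a base-2 logarithm for entropy, and absolute index differences for the difference moments, contrast, and homogeneity. The function name and the two small test matrices are illustrative only.

```python
import numpy as np

def texture_descriptors(C, k=2, d=1.0):
    """Compute simple descriptors from a co-occurrence matrix C."""
    P = np.asarray(C, dtype=float)
    P = P / P.sum()                    # normalize so the entries sum to 1
    i, j = np.indices(P.shape)
    diff = np.abs(i - j)               # distance from the main diagonal
    nz = P > 0                         # skip zero entries in the logarithm
    off = diff > 0                     # skip i = j in the inverse moment
    return {
        "uniformity": float((P ** 2).sum()),
        "entropy": float(-(P[nz] * np.log2(P[nz])).sum()),
        "difference_moment": float((diff ** k * P).sum()),
        "inverse_difference_moment": float((P[off] / diff[off] ** k).sum()),
        "contrast": float((diff * P).sum()),
        "homogeneity": float((P / (d + diff)).sum()),
    }

# Two illustrative 4 x 4 matrices: all entries equal, and purely diagonal.
flat = texture_descriptors(np.ones((4, 4)))
diagonal = texture_descriptors(np.eye(4))
```

Changing `d` rescales every term of the homogeneity sum, including the diagonal terms, so homogeneity values are comparable only when they are computed with the same `d`.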
Based on the co-occurrence matrix, the following 14 descriptors have been defined (Haralick, 1992; Russ, 2002).
1.Uniformity of energy (angular second moment)
2.Contrast
3.Correlation
4.Cluster tendency (sum of squares variance)
5.Inverse difference moment (homogeneity)
6.Sum average
7.Sum variance
8.Sum entropy
9.Entropy
10.Difference variance
11.Difference entropy
12.Information measurements of correlation 1
13.Information measurements of correlation 2
14.Maximum correlation coefficient
This technique is based on the texture energy and uses local masks to measure the amount of variation within a fixed-size window. The image is first convolved with a variety of kernels. If f(x, y) is the input image and M1, M2,...,MN are the kernels (masks), the images gn = f*Mn, n = 1, 2,..., N are computed. Then, each convolved image is processed with a nonlinear operator to determine the total textural energy in each pixel’s neighborhood. When the neighborhood is k × k, the energy image corresponding to the nth kernel is defined by
Associated with each pixel position (x, y) is a textural feature vector [T1(x, y) T2(x, y) … TN(x, y)]^T.
The textural energy approach is very much in the spirit of the transform approach, but it uses smaller windows or neighborhood supports (applying a discrete orthogonal transform, such as the DCT, locally to each pixel’s neighborhood is also possible). The commonly used kernels have supports of 3 × 3, 5 × 5, and 7 × 7 neighborhoods. Their 1-D forms are illustrated in the following formulas (L: level, E: edge, S: spot, W: wave, R: ripple, O: oscillation).
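The pipeline described above (convolve with a mask, then measure the local energy) can be sketched as follows. Several choices here are assumptions rather than the text's own definitions: the five-element vectors are the commonly cited Laws vectors, 2-D masks are formed as outer products of two 1-D vectors, zero padding is used at the border, and the energy is taken as the local mean of absolute values over a k × k window. The function names are hypothetical.

```python
import numpy as np

# Commonly cited five-element Laws vectors (assumed for this sketch).
VECTORS = {
    "L5": np.array([1, 4, 6, 4, 1], dtype=float),     # level
    "E5": np.array([-1, -2, 0, 2, 1], dtype=float),   # edge
    "S5": np.array([-1, 0, 2, 0, -1], dtype=float),   # spot
    "W5": np.array([-1, 2, 0, -2, 1], dtype=float),   # wave
    "R5": np.array([1, -4, 6, -4, 1], dtype=float),   # ripple
}

def convolve2d_same(image, kernel):
    """2-D convolution with zero padding; output has the input's shape."""
    kh, kw = kernel.shape
    padded = np.pad(image, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    windows = np.lib.stride_tricks.sliding_window_view(padded, (kh, kw))
    # Flip the kernel so that this is convolution rather than correlation.
    return np.einsum("ijkl,kl->ij", windows, kernel[::-1, ::-1])

def texture_energy(image, row_name, col_name, k=7):
    """Energy image T_n for the mask built from two 1-D vectors."""
    img = np.asarray(image, dtype=float)
    mask = np.outer(VECTORS[row_name], VECTORS[col_name])  # 5 x 5 mask M_n
    g = convolve2d_same(img, mask)                          # g_n = f * M_n
    box = np.ones((k, k)) / (k * k)
    return convolve2d_same(np.abs(g), box)                 # local mean of |g_n|

# Illustrative inputs: a constant image and vertical stripes of width 4.
y, x = np.mgrid[0:32, 0:32]
flat = np.ones((32, 32))
stripes = ((x // 4) % 2).astype(float)

energy_flat = texture_energy(flat, "L5", "E5")
energy_stripes = texture_energy(stripes, "L5", "E5")

# Feature vector per pixel: one energy value per mask, stacked on the last axis.
names = [("L5", "E5"), ("E5", "L5"), ("E5", "S5"), ("R5", "R5")]
features = np.stack([texture_energy(stripes, a, b) for a, b in names], axis=-1)
```

Masks built from a vector other than L5 sum to zero, so they give no response on constant regions away from the border; the border itself is affected by the zero padding and is usually discarded or handled with a different padding mode.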