
Chapter 1

Introduction to Digital Image Processing

  • To introduce the basic steps in digital image processing.
  • To highlight the salient functions of various building blocks in digital image processing.
  • To illustrate how to apply image processing concepts in an autonomous system.
  • To write a program in C++ to read and display images.

Digital image processing methods were first introduced in the early 1920s, when people were interested in transmitting picture information across the Atlantic Ocean. The time taken to transmit one image of size 256 × 256 was about a week. The pictures were encoded using specialized printing equipment and transmitted through a submarine cable. At the receiving end, the coded pictures were reconstructed. The reconstructed pictures did not have the expected visual quality, and interference often made their contents impossible to interpret. Hence, the scientists and engineers involved in the transmission of picture information started devising techniques to improve the visual quality of the pictures. This was the starting point for image processing methods.

To improve the speed of transmission, the Bartlane cable picture transmission system was introduced; it reduced the transmission time of the picture information from one week to less than three hours. In the early stages, attempts to improve the visual quality of the received image were related to the selection of printing procedures and the distribution of brightness levels. During the 1920s, the coding of images involved five distinct brightness levels. In 1929, the number of brightness levels was increased to 15, and this improved the visual quality of the images received.

The use of digital computers for improving the quality of images received from space probes began at the Jet Propulsion Laboratory in 1964. Since then, the field of digital image processing has grown vigorously. Today, digital image processing techniques are used to solve a variety of problems in two major application areas:

  1. Improvement of pictorial information for human interpretation
  2. Processing of scene data for autonomous machine perception.

The following paragraphs describe some application areas in which image processing techniques enhance pictorial information for human interpretation. In medicine, digital image processing techniques are used to enhance contrast or to map intensity levels into color for easier interpretation of X-rays and other biomedical images. Geographers use image processing techniques to study pollution patterns in aerial and satellite imagery.

Image enhancing techniques can be used to process degraded images of unrecoverable objects or experimental results too expensive to duplicate. In archeology, image processing techniques have successfully restored blurred pictures that were the only available records of rare artifacts lost or damaged after being photographed.

The following examples illustrate digital image processing techniques that deal with problems in machine perception. Character recognition, industrial machine vision for product assembly and inspection, fingerprint processing, and weather prediction are some of the machine perception problems that use image processing techniques.


The various steps required for any digital image processing application are listed below:

  1. Image grabbing or acquisition
  2. Preprocessing
  3. Segmentation
  4. Representation and feature extraction
  5. Recognition and interpretation.


Preprocessing: A process to condition/enhance the image in order to make it suitable for further processing.


It is easiest to explain the various steps in digital image processing with an application such as a mechanical component classification system. Consider an industrial application in which the production department manufactures mechanical components such as bolts, nuts, and washers. Periodically, these components are sent to the stores via a conveyor belt, and each component is dropped into its respective bin in the store room. The sequence of operations performed is illustrated in Figure 1.1.

In the image acquisition step, the image of the component is acquired using a suitable camera and then digitized. The camera can be a monochrome or color TV camera capable of producing 25 images per second.

The second step deals with the preprocessing of the acquired image. The key function of preprocessing is to improve the image in ways that increase the chances of success of the later processes. In this application, preprocessing techniques are used for enhancing the contrast of the image, removing noise, and isolating the objects of interest in the image.
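As an illustration of contrast enhancement, the following is a minimal sketch of a linear contrast stretch. It assumes an 8-bit grayscale image stored row by row in a vector; the function name stretchContrast is hypothetical and is not part of the program given in the appendix.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Linearly stretch the gray levels of an 8-bit image so that the darkest
// pixel maps to 0 and the brightest pixel maps to 255.
std::vector<std::uint8_t> stretchContrast(const std::vector<std::uint8_t>& img)
{
    if (img.empty()) return img;
    const auto range = std::minmax_element(img.begin(), img.end());
    const int lo = *range.first;   // darkest gray level present
    const int hi = *range.second;  // brightest gray level present
    if (hi == lo) return img;      // flat image: nothing to stretch
    assert(hi > lo);               // the division below is therefore safe

    std::vector<std::uint8_t> out(img.size());
    for (std::size_t i = 0; i < img.size(); ++i)
        out[i] = static_cast<std::uint8_t>((img[i] - lo) * 255 / (hi - lo));
    return out;
}
```

A low-contrast image whose gray levels occupy only the range 50 to 150 would, after this operation, use the full range 0 to 255.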

The next step deals with segmentation, a process in which the given input image is partitioned into its constituent parts or objects. The key role of segmentation in mechanical component classification is to extract the boundary of the object from the background. The output of the segmentation stage usually consists of either the boundary of a region or all the points in the region itself. The boundary representation is appropriate when the focus is on external shape, and the regional representation is appropriate when the focus is on an internal property such as texture. The application considered here needs the boundary representation to distinguish components such as nuts, bolts, and washers.
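One simple way to obtain a boundary representation can be sketched as follows. The sketch assumes that the components appear brighter than the conveyor belt, so that a single global threshold separates object from background; the function names and the choice of threshold are illustrative assumptions, not a method prescribed by the text.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Mark a pixel as object (1) when it is brighter than the threshold,
// otherwise as background (0).
std::vector<std::uint8_t> thresholdImage(const std::vector<std::uint8_t>& gray,
                                         std::uint8_t threshold)
{
    std::vector<std::uint8_t> mask(gray.size());
    for (std::size_t i = 0; i < gray.size(); ++i)
        mask[i] = gray[i] > threshold ? 1 : 0;
    return mask;
}

// Keep only the object pixels that touch the background (or the image
// border) through one of their four neighbours. These form the boundary.
std::vector<std::uint8_t> extractBoundary(const std::vector<std::uint8_t>& mask,
                                          int width, int height)
{
    assert(width > 0 && height > 0);
    assert(mask.size() == static_cast<std::size_t>(width) * height);

    auto isBackground = [&](int x, int y) -> bool {
        if (x < 0 || y < 0 || x >= width || y >= height) return true;
        return mask[static_cast<std::size_t>(y) * width + x] == 0;
    };

    std::vector<std::uint8_t> boundary(mask.size(), 0);
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            if (isBackground(x, y)) continue;
            if (isBackground(x - 1, y) || isBackground(x + 1, y) ||
                isBackground(x, y - 1) || isBackground(x, y + 1))
                boundary[static_cast<std::size_t>(y) * width + x] = 1;
        }
    }
    return boundary;
}
```

For a solid 3 × 3 object, this keeps the eight outer pixels and discards the single interior pixel.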



FIGURE 1.1 Block diagram of the component classification system


In the representation step, the data obtained from the segmentation step must be transformed into a form suitable for further computer processing. Feature selection deals with extracting salient features from the object representation in order to distinguish one class of objects from another. For component recognition, features such as the inner and outer diameters of the washer, the length of the bolt, and the lengths of the sides of the nut are extracted to differentiate one component from another.
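The features mentioned above can be approximated from a segmented image. The following sketch assumes a binary mask in which object pixels are non-zero, and measures the bounding box and the area of the object in pixels. The structure ShapeFeatures and the function measureObject are hypothetical names introduced for illustration; a bounding-box extent is only a rough stand-in for quantities such as bolt length or outer washer diameter.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Simple shape features, all measured in pixels.
struct ShapeFeatures {
    int boxWidth;   // horizontal extent of the object
    int boxHeight;  // vertical extent of the object
    int area;       // number of object pixels
};

// Scan a binary mask (non-zero = object) and measure the single object in it.
ShapeFeatures measureObject(const std::vector<std::uint8_t>& mask,
                            int width, int height)
{
    assert(width > 0 && height > 0);
    assert(mask.size() == static_cast<std::size_t>(width) * height);

    int minX = width, minY = height, maxX = -1, maxY = -1, area = 0;
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            if (mask[static_cast<std::size_t>(y) * width + x] == 0) continue;
            if (x < minX) minX = x;
            if (x > maxX) maxX = x;
            if (y < minY) minY = y;
            if (y > maxY) maxY = y;
            ++area;
        }
    }
    ShapeFeatures f = {0, 0, 0};
    if (area > 0) {
        f.boxWidth = maxX - minX + 1;
        f.boxHeight = maxY - minY + 1;
        f.area = area;
    }
    return f;
}
```

A washer and a solid disc of the same outer diameter would have the same bounding box but different areas, which is one reason more than one feature is usually extracted.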


Feature Extraction: A process to select important characteristics of an image or object.


The last step is the recognition process, which assigns a label to an object based on the information provided by feature selection. Interpretation assigns meaning to the recognized object. The various steps discussed so far are depicted in the schematic diagram shown in Figure 1.2. We have not yet discussed prior knowledge or the interaction between the knowledge base and the processing modules.

Knowledge about the problem domain is coded into the image processing system in the form of a knowledge database. This knowledge may be as simple as a description of the regions of the image where the information of interest is located. Each module interacts with the knowledge base to decide on the appropriate technique for the application. For example, if the acquired image contains spike-like noise, the preprocessing module interacts with the knowledge base to select an appropriate smoothing filter, such as a median filter, to remove the noise.
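To make the median filter example concrete, here is a minimal sketch of a 3 × 3 median filter for an 8-bit grayscale image stored row by row. The function name medianFilter3x3 is hypothetical, and leaving the border pixels unchanged is a simplifying assumption made for this sketch.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Replace every interior pixel by the median of its 3 x 3 neighbourhood.
// Border pixels are copied unchanged.
std::vector<std::uint8_t> medianFilter3x3(const std::vector<std::uint8_t>& img,
                                          int width, int height)
{
    assert(width > 0 && height > 0);
    assert(img.size() == static_cast<std::size_t>(width) * height);

    std::vector<std::uint8_t> out(img);
    for (int y = 1; y + 1 < height; ++y) {
        for (int x = 1; x + 1 < width; ++x) {
            std::array<std::uint8_t, 9> window;
            int k = 0;
            for (int dy = -1; dy <= 1; ++dy)
                for (int dx = -1; dx <= 1; ++dx)
                    window[k++] = img[static_cast<std::size_t>(y + dy) * width + (x + dx)];
            // After nth_element, the fifth smallest of nine values (the
            // median) sits at index 4.
            std::nth_element(window.begin(), window.begin() + 4, window.end());
            out[static_cast<std::size_t>(y) * width + x] = window[4];
        }
    }
    return out;
}
```

An isolated bright spike surrounded by uniform pixels is replaced by the surrounding value, because the spike can never be the middle value of its neighbourhood.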



FIGURE 1.2 Fundamental steps in digital image processing system


The major building blocks of a digital image processing system (see Figure 1.3) are as follows:

  1. Acquisition
  2. Storage
  3. Processing
  4. Display and communication interface.

1.3.1 Image Acquisition

In order to acquire a digital image, a physical device sensitive to a band in the electromagnetic energy spectrum is required. This device converts the incident radiation (X-rays, ultraviolet, visible, or infrared) into a corresponding electrical signal. To convert this electrical signal into a digital signal, another device called a digitizer is employed. The devices most frequently used to sense visible and infrared light are the microdensitometer, the Vidicon camera, and solid-state arrays.


Digitizer: A device which converts electrical signal into digital signal.



FIGURE 1.3 Basic building blocks of digital image processing system


Microdensitometer   In the microdensitometer, the photograph or film is mounted on a flat bed or wrapped around a drum. A beam of light is then used to scan the photograph, while the bed is translated or the drum is rotated so that the entire photograph or film is covered. In the case of a film, the beam passing through it falls on a photo sensor kept below the film, which produces the corresponding electrical signal. In the case of a photograph, the beam reflected from the surface of the image is focused on a photo detector kept above the photograph, which converts the reflected beam into an electrical signal. Microdensitometers are slow devices, but they are capable of producing high-resolution digital images.

Vidicon camera   The basic principle of operation of the Vidicon camera is based on photoconductivity. When the camera is focused on any image, a pattern of varying conductivity corresponding to the distribution of the brightness in the image is formed on the Vidicon camera tube surface. An electron beam then scans the surface of the photoconductive target and by charge neutralization this beam creates a potential difference on an electrode, which is proportional to the brightness pattern of the image. This potential difference is then quantized and the corresponding position of the scanning beam is noted. A digital image is formed using the electron beam position and the quantized signal values.

Solid-state arrays   Solid-state arrays consist of tiny silicon elements called photosites, which produce a voltage output proportional to the intensity of the incident light. Solid-state arrays are organized in two ways:

  1. Line-scan sensors
  2. Area-scan sensors.


Photosite: A sensor which converts the light signal to electrical signal.


A line-scan sensor consists of a row of photosites, and a two-dimensional image is formed by the relative motion between the scene and the detector (photosites). In an area sensor, the photosites are arranged in the form of a matrix and can capture a complete image. The technology used in solid-state sensors is based on charge-coupled devices (CCD). Figure 1.4 shows a typical line-scan CCD sensor consisting of a row of photosites, with the transport registers arranged on either side of the photosites.


CCD: (Charge Coupled Device) A sensor which holds electrical charge proportional to the light falling on it.



FIGURE 1.4 A CCD line-scan sensor


Two gates are used to clock the contents of the photosites into the transport registers. The output gate is then used to clock the contents of the transport registers into an amplifier. The amplifier output voltage is proportional to the contents of the row of photosites. Charge-coupled area sensors are similar to line-scan sensors, except that the photosites are arranged in a matrix form. The arrangement of the area sensor is shown in Figure 1.5.



FIGURE 1.5 A CCD area-scan sensor


For each column of photosites in the area sensor, a gate and a vertical transport register are coupled to it. A horizontal transport register and a gate are arranged at the top of the vertical transport registers. The output of the transport register is passed into an amplifier through an output gate. First, the contents of the odd-numbered photosites are sequentially gated into the vertical transport registers and then into the horizontal transport register. The content of this register is fed into an amplifier whose output is a line of video. Repeating this procedure for the even-numbered lines completes the scanning. The scanning rate is usually 25 or 30 times per second. Line scanners with resolutions ranging from 256 to 4096 elements are available. Area scanners with resolutions ranging from 32 × 32 elements to 256 × 256 elements are commonly available. High-resolution sensors of the order of 1024 × 1024 elements are also available at relatively affordable prices.
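The two-pass readout described above can be pictured in software as interleaving two half-images. The following sketch assumes that the odd-numbered lines (first, third, and so on) and the even-numbered lines have already been read out into two separate buffers of 8-bit pixels; the function name weaveFields and this buffer layout are assumptions made for illustration only.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Interleave the odd-numbered lines (1st, 3rd, ...) and the even-numbered
// lines (2nd, 4th, ...) of a scan into one complete frame.
std::vector<std::uint8_t> weaveFields(const std::vector<std::uint8_t>& oddField,
                                      const std::vector<std::uint8_t>& evenField,
                                      int width)
{
    assert(width > 0);
    const std::size_t w = static_cast<std::size_t>(width);
    assert(oddField.size() % w == 0 && evenField.size() % w == 0);
    const std::size_t oddLines = oddField.size() / w;
    const std::size_t evenLines = evenField.size() / w;
    // A frame has either equally many odd and even lines or one extra odd line.
    assert(oddLines == evenLines || oddLines == evenLines + 1);

    std::vector<std::uint8_t> frame((oddLines + evenLines) * w);
    for (std::size_t i = 0; i < oddLines; ++i)       // frame rows 0, 2, 4, ...
        std::copy(oddField.begin() + i * w, oddField.begin() + (i + 1) * w,
                  frame.begin() + (2 * i) * w);
    for (std::size_t i = 0; i < evenLines; ++i)      // frame rows 1, 3, 5, ...
        std::copy(evenField.begin() + i * w, evenField.begin() + (i + 1) * w,
                  frame.begin() + (2 * i + 1) * w);
    return frame;
}
```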


Gate: A logical circuit which allows the signal to flow when it is activated.

1.3.2 Storage

Three different types of digital storage are available for digital image processing applications. The first type of storage, used during processing, is called short-term storage. For frequent retrieval of images, the second type of storage, called online storage, is employed. The third type of memory is called archival storage and is characterized by infrequent access. One way of providing short-term memory is to use the main memory of the computer. Another way is to use specialized boards called frame buffers. Images stored in frame buffers can be accessed rapidly, at a rate of 30 images per second. Frame buffers also allow operations such as instantaneous image zoom, vertical shift (scroll), and horizontal shift (pan). Frame buffer cards are available to accommodate as many as 32 images (32 MB).

The online memory generally uses Winchester disks with a capacity of 1 GB. In recent years, magneto-optical storage has become popular. It uses a laser and specialized material to achieve a few gigabytes of storage on a 5″ optical disk. Since online storage is used for frequent access to data, magnetic tapes are not used. For large online storage capacity, 32 to 100 optical disks are held in a box; this arrangement is called a jukebox.

The archival storage is usually larger in size and is used for infrequent access. High-density magnetic tapes and Write–Once–Read–Many (WORM) optical disks are used for realizing the archival memory. Magnetic tapes are capable of storing 6.4 KB of image data per inch, and therefore storing one megabyte of image data requires 13 ft of tape. WORM disks with a capacity of 6 GB on a 12″ disk and 10 GB on a 14″ disk are commonly available. The lifetime of magnetic tape is only 7 years, whereas for a WORM disk it is more than 30 years. WORM memories are now available in a jukebox.
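The tape figure quoted above can be checked with a short calculation. The sketch below assumes that 6.4 KB means 6400 bytes and that one megabyte means 1,000,000 bytes; under these assumptions, 1,000,000 / 6400 = 156.25 inches, or about 13 ft. The function names are hypothetical.

```cpp
#include <cassert>

// Bytes needed to store an uncompressed image with the given pixel depth.
long long imageBytes(long long width, long long height, int bitsPerPixel)
{
    assert(width > 0 && height > 0 && bitsPerPixel > 0);
    return (width * height * bitsPerPixel + 7) / 8;  // round up to whole bytes
}

// Length of tape, in feet, needed to hold the given number of bytes at a
// recording density expressed in bytes per inch.
double tapeFeet(double bytes, double bytesPerInch)
{
    assert(bytesPerInch > 0.0);
    return bytes / bytesPerInch / 12.0;  // 12 inches per foot
}
```

For example, tapeFeet(1000000.0, 6400.0) evaluates to roughly 13.02, which agrees with the 13 ft stated in the text.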


WORM: (Write–Once–Read–Many) An optical disk memory.

1.3.3 Processing

The processing of images needs specialized hardware built around a high-speed processor. This processor is quite different from the conventional processor available in a computer. The processor and the associated hardware are realized in the form of a card called an image processor card. The processors on such cards can handle data of different word sizes. For example, the image processor card IP-8 processes a word size of 8 bits. The image processor card usually consists of a digitizer, a frame buffer, an arithmetic and logic unit (ALU), and a display module.

The digitizer is an analog-to-digital converter that converts the electrical signal corresponding to the intensities of the optical image into a digital image. There may be one or more frame buffers for fast access to image data during processing. The ALU is capable of performing arithmetic and logical operations at frame rate. Suitable software comes with the image processor card to realize various image processing techniques and algorithms.
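The conversion performed by the digitizer can be sketched in software as uniform quantization. The following example assumes an 8-bit digitizer and a sensor voltage that lies between two known limits, vMin and vMax; these parameter names and the clamping of out-of-range voltages are assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Map a sensor voltage in the range [vMin, vMax] to one of 256 gray levels.
// Voltages outside the range are clamped to the nearest limit.
std::uint8_t quantize8(double v, double vMin, double vMax)
{
    assert(vMax > vMin);
    double t = (v - vMin) / (vMax - vMin);  // normalise to [0, 1]
    if (t < 0.0) t = 0.0;
    if (t > 1.0) t = 1.0;
    return static_cast<std::uint8_t>(std::lround(t * 255.0));
}
```

With limits of 0 V and 1 V, an input of 0 V maps to gray level 0 and an input of 1 V maps to gray level 255.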

1.3.4 Display and Communication Interface

Black-and-white and color monitors are used as display devices in the image processing system. These monitors are driven by the output signals from the display module on the image processor card. The signals of the display module can also be fed to a recording device that produces a hard copy of the image being viewed on the monitor screen. Other display devices include dot matrix printers and laser printers. These printing devices are useful for low-resolution image processing work.

The communication interface establishes communication between image processing systems and remote computers. Suitable hardware and software are available for this purpose. Different types of communication channels or media are available for the transmission of image data. For example, a telephone line can be used to transmit an image at a maximum rate of 9600 bits/sec. Fiber optic links, microwave links, and satellite links are much faster but cost considerably more.
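To get a feel for these rates, the transmission time of an uncompressed image can be estimated as the number of bits divided by the channel rate. The sketch below ignores protocol overhead and assumes no compression; the function name is hypothetical.

```cpp
#include <cassert>

// Estimated time, in seconds, to send an uncompressed image over a channel
// with the given bit rate. Protocol overhead is ignored.
double transmissionSeconds(long width, long height, int bitsPerPixel,
                           double bitsPerSecond)
{
    assert(width > 0 && height > 0 && bitsPerPixel > 0);
    assert(bitsPerSecond > 0.0);
    return static_cast<double>(width) * height * bitsPerPixel / bitsPerSecond;
}
```

A 256 × 256 image with 8 bits per pixel contains 524,288 bits, so at 9600 bits/sec it takes about 54.6 seconds to transmit.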


This chapter presents the fundamentals of digital image processing. It starts with an introduction and then discusses the general steps involved in image processing, using a practical component classification system as an example. The steps detailed here branch into the various fields of image processing, so only the fundamentals of the basic techniques are covered. Fields such as image compression and image restoration are dynamic, with new techniques and applications still emerging. The topics covered in this chapter form the basis for the forthcoming chapters.

The main objective of this chapter is to make the reader aware of the various image acquisition systems available in the market, and of how images are stored and processed.

A few questions are given at the end of the chapter to improve the intuitive capabilities of the students. To carry out practical experiments in digital image processing, or to implement simple image processing applications, one has to know how to write a program in a higher-level language such as C or VC++ to read an image and display it. To simplify this task, a VC++ program that students can use for reading and displaying bitmap images is given in Appendix I.
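Before turning to the full VC++ program, the central idea of reading a bitmap file can be sketched in portable C++. The example below is only a sketch and is not the appendix program: it assumes the common 54-byte BMP layout (a 14-byte file header followed by a 40-byte information header), reads the little-endian fields from a byte buffer already loaded from the file, and computes the padded length of one row. The names BmpInfo, readU16, readU32, and parseBmpHeader are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// The header fields needed to locate and decode the pixel data.
struct BmpInfo {
    std::uint32_t dataOffset;  // byte offset of the pixel data in the file
    std::int32_t  width;       // image width in pixels
    std::int32_t  height;      // image height (negative means top-down rows)
    std::uint16_t bitCount;    // bits per pixel
    std::uint32_t rowStride;   // bytes per row, padded to a multiple of four
};

// BMP stores multi-byte values with the least significant byte first.
static std::uint16_t readU16(const std::vector<std::uint8_t>& b, std::size_t pos)
{
    return static_cast<std::uint16_t>(b[pos] | (b[pos + 1] << 8));
}

static std::uint32_t readU32(const std::vector<std::uint8_t>& b, std::size_t pos)
{
    return static_cast<std::uint32_t>(b[pos]) |
           (static_cast<std::uint32_t>(b[pos + 1]) << 8) |
           (static_cast<std::uint32_t>(b[pos + 2]) << 16) |
           (static_cast<std::uint32_t>(b[pos + 3]) << 24);
}

BmpInfo parseBmpHeader(const std::vector<std::uint8_t>& bytes)
{
    if (bytes.size() < 54 || bytes[0] != 'B' || bytes[1] != 'M')
        throw std::runtime_error("not a BMP file");

    BmpInfo info;
    info.dataOffset = readU32(bytes, 10);
    info.width      = static_cast<std::int32_t>(readU32(bytes, 18));
    info.height     = static_cast<std::int32_t>(readU32(bytes, 22));
    info.bitCount   = readU16(bytes, 28);
    // Each row occupies a whole number of 32-bit words.
    info.rowStride  =
        ((static_cast<std::uint32_t>(info.width) * info.bitCount + 31) / 32) * 4;
    return info;
}
```

For a 24-bit image that is 5 pixels wide, each row holds 15 bytes of pixel data and is padded to 16 bytes, which is why the padding bytes must be skipped after every row.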

Review Questions

Short Type Questions

  1. What is the speed at which the image acquisition unit produces the image?
  2. State the steps involved in digital image processing.
  3. What is frame buffer? State its important characteristics.
  4. What is WORM?
  5. What is the storage size required to store a monochrome image of size 256 × 256?
  6. With a neat block diagram, explain the various steps involved in digital image processing.

Descriptive Type Questions

  1. Write a note on
    1. Line scanner
    2. Area scanner.
  2. Explain the various building blocks of a digital image processing system.
  3. Write programs to read images in different formats such as TIFF, GIF, BMP, and JPEG and display them.

Sample VC++ Program to Read and Display Images


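// imgdispView.cpp : implementation of the CImgdispView class
//

#include "stdafx.h"
#include "imgdisp.h"

#include "imgdispDoc.h"
#include "imgdispView.h"

// Standard stream header (the pre-standard "fstream.h" is not available
// in current compilers). It is included before DEBUG_NEW is defined.
#include <fstream>
using std::ifstream;
using std::ios;

#ifdef _DEBUG
#define new DEBUG_NEW
#undef THIS_FILE
static char THIS_FILE[] = __FILE__;
#endif

// Read a 16-bit little-endian value. The stream is passed by reference,
// and the bytes are read in separate statements because the evaluation
// order of two f.get() calls inside one expression is unspecified.
WORD getint(ifstream &f)
{
       WORD lo = (WORD)(f.get() & 0xFF);
       WORD hi = (WORD)(f.get() & 0xFF);
       return (WORD)(lo | (hi << 8));
}

// Read a 32-bit little-endian value, least significant byte first.
DWORD getlint(ifstream &f)
{
       DWORD value = 0;
       for(int shift = 0; shift < 32; shift += 8)
              value |= ((DWORD)(f.get() & 0xFF)) << shift;
       return value;
}

// Bitmap file header. The structure name is assumed: the original
// declaration line is missing from this listing.
struct BMPFILEHDR
{
       WORD bfType;
       DWORD bfSize;
       DWORD bfRes;
       DWORD bfOffset;
};

// Bitmap information header (structure name assumed, as above).
struct BMPINFOHDR
{
       DWORD biSize;
       DWORD biHeit;
       DWORD biWidth;
       WORD biPlanes;
       WORD biBitCount;
       DWORD biCompression;
       DWORD biImgSize;
       DWORD biXPelsm;
       DWORD biYPelsm;
       DWORD biColorsUsed;
       DWORD biImpColors;
};

// One palette entry.
struct RGBQ
{
       BYTE r;
       BYTE g;
       BYTE b;
       BYTE res;
};

// One 24-bit pixel.
struct PIXEL
{
       BYTE r;
       BYTE g;
       BYTE b;
};

// Headers and palette of the bitmap being displayed. The members fh and
// ih are the names used by OnDraw.
struct BITMAP1
{
       BMPFILEHDR fh;
       BMPINFOHDR ih;
       RGBQ *palette;
};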


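// CImgdispView

IMPLEMENT_DYNCREATE(CImgdispView, CView)

BEGIN_MESSAGE_MAP(CImgdispView, CView)
       //{{AFX_MSG_MAP(CImgdispView)
              // NOTE - the ClassWizard will add and remove mapping macros here.
              // DO NOT EDIT what you see in these blocks of generated code!
       //}}AFX_MSG_MAP
       // Standard printing commands
       ON_COMMAND(ID_FILE_PRINT, CView::OnFilePrint)
       ON_COMMAND(ID_FILE_PRINT_PREVIEW, CView::OnFilePrintPreview)
END_MESSAGE_MAP()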

// CImgdispView construction/destruction

CImgdispView::CImgdispView()
{
       // TODO: add construction code here
}

CImgdispView::~CImgdispView()
{
}

BOOL CImgdispView::PreCreateWindow(CREATESTRUCT& cs)
{
       // TODO: Modify the Window class or styles here by modifying
       // the CREATESTRUCT cs
       return CView::PreCreateWindow(cs);
}

// CImgdispView drawing

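void CImgdispView::OnDraw(CDC* pDC)
{
       CImgdispDoc* pDoc = GetDocument();
       ASSERT_VALID(pDoc);
       // TODO: add draw code for native data here
       BITMAP1 bmp;
       ifstream fin;
       // The original open call is incomplete in this listing; opening the
       // file in binary mode is assumed.
       fin.open("C:\\1.bmp", ios::in | ios::binary);
       if(!fin)
       {
              MessageBox("File Open Error");
              return;
       }

  /***** READ FILE HEADER AND INFORMATION HEADER *****/
       bmp.fh.bfType = getint(fin);
       bmp.fh.bfSize = getlint(fin);
       bmp.fh.bfRes = getlint(fin);
       bmp.fh.bfOffset = getlint(fin);

       bmp.ih.biSize = getlint(fin);
       bmp.ih.biWidth = getlint(fin);
       bmp.ih.biHeit = getlint(fin);
       bmp.ih.biPlanes = getint(fin);
       bmp.ih.biBitCount = getint(fin);
       bmp.ih.biCompression = getlint(fin);
       bmp.ih.biImgSize = getlint(fin);
       bmp.ih.biXPelsm = getlint(fin);
       bmp.ih.biYPelsm = getlint(fin);
       bmp.ih.biColorsUsed = getlint(fin);
       bmp.ih.biImpColors = getlint(fin);

       WORD bits = bmp.ih.biBitCount;
       if(!fin || bmp.ih.biCompression != 0 || bmp.ih.biWidth == 0 || bmp.ih.biHeit == 0
              || (bits != 1 && bits != 4 && bits != 8 && bits != 24))
       {
              MessageBox("Unsupported or damaged bitmap");
              return;
       }

       // Bytes of pixel data in one row, and the padding that rounds each
       // row up to a multiple of four bytes. The padding is computed from
       // the width because biImgSize may be zero in uncompressed bitmaps.
       DWORD xbyte = (bmp.ih.biWidth*bits + 7)/8;
       DWORD diff = ((bmp.ih.biWidth*bits + 31)/32)*4 - xbyte;
       BYTE ch;
       bool ok = true;

       if(bits <= 8)
       {
  /***** READ PALETTE *****/
              DWORD maxColors = 1u << bits;
              DWORD ncolors = bmp.ih.biColorsUsed;
              if(ncolors == 0 || ncolors > maxColors) ncolors = maxColors;
              bmp.palette = new RGBQ[maxColors];
              memset(bmp.palette, 0, maxColors*sizeof(RGBQ));
              for(DWORD i = 0; i < ncolors; i++)
              {
                     bmp.palette[i].b = fin.get();
                     bmp.palette[i].g = fin.get();
                     bmp.palette[i].r = fin.get();
                     bmp.palette[i].res = fin.get();
              }
              if(!fin)
              {
                     MessageBox("read error");
                     ok = false;
              }

  /***** DRAWING WITH PALETTE *****/
              fin.seekg(bmp.fh.bfOffset);
              WORD n = 8/bits;                     // pixels packed in one byte
              for(DWORD row = 0; ok && row < bmp.ih.biHeit; row++)
              {
                     for(DWORD col = 0; col < xbyte; col++)
                     {
                            ch = fin.get();
                            if(!fin)
                            {
                                   MessageBox("READ FAILED");
                                   ok = false;
                                   break;
                            }
                            for(WORD pix = 0; pix < n && col*n+pix < bmp.ih.biWidth; pix++)
                            {
                                   // The leftmost pixel is in the most significant bits.
                                   BYTE disp = (ch >> (bits*(n-pix-1))) & (maxColors-1);
                                   // Rows are stored bottom-up in the file.
                                   pDC->SetPixel(col*n+pix, bmp.ih.biHeit-row-1,
                                       RGB(bmp.palette[disp].r, bmp.palette[disp].g,
                                           bmp.palette[disp].b));
                            }
                     }
                     for(DWORD m = diff; ok && m > 0; m--) fin.get();
              }
              delete []bmp.palette;
       }
       else if(bmp.ih.biWidth > 1000 || bmp.ih.biHeit > 1000)
       {
              MessageBox("Image larger than 1000 x 1000");
       }
       else
       {
  /***** READ AND DRAW 24-BIT PIXELS *****/
              PIXEL (*bitmap)[1000] = new PIXEL[1000][1000];
              fin.seekg(bmp.fh.bfOffset);
              for(DWORD i = 0; ok && i < bmp.ih.biHeit; i++)
              {
                     for(DWORD j = 0; j < bmp.ih.biWidth; j++)
                     {
                            bitmap[j][bmp.ih.biHeit-i-1].b = fin.get();
                            bitmap[j][bmp.ih.biHeit-i-1].g = fin.get();
                            bitmap[j][bmp.ih.biHeit-i-1].r = fin.get();
                            if(!fin)
                            {
                                   MessageBox("read error");
                                   ok = false;
                                   break;
                            }
                     }
                     for(DWORD m = diff; ok && m > 0; m--) fin.get();
              }
              // bitmap is indexed as [column][row].
              for(DWORD x = 0; ok && x < bmp.ih.biWidth; x++)
                     for(DWORD y = 0; y < bmp.ih.biHeit; y++)
                            pDC->SetPixel(x, y, RGB(bitmap[x][y].r, bitmap[x][y].g, bitmap[x][y].b));
              delete []bitmap;
       }
       fin.close();
}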
// CImgdispView printing

BOOL CImgdispView::OnPreparePrinting(CPrintInfo* pInfo)
{
      // default preparation
      return DoPreparePrinting(pInfo);
}

void CImgdispView::OnBeginPrinting(CDC* /*pDC*/, CPrintInfo* /*pInfo*/)
{
      // TODO: add extra initialization before printing
}

void CImgdispView::OnEndPrinting(CDC* /*pDC*/, CPrintInfo* /*pInfo*/)
{
      // TODO: add cleanup after printing
}

// CImgdispView diagnostics

#ifdef _DEBUG
void CImgdispView::AssertValid() const
{
      CView::AssertValid();
}

void CImgdispView::Dump(CDumpContext& dc) const
{
      CView::Dump(dc);
}

CImgdispDoc* CImgdispView::GetDocument() // non-debug version is inline
{
      ASSERT(m_pDocument->IsKindOf(RUNTIME_CLASS(CImgdispDoc)));
      return (CImgdispDoc*)m_pDocument;
}
#endif //_DEBUG

// CImgdispView message handlers

Sample output of the program is given below: