1. An Imaging Model
We represent images by two-dimensional functions of the form f(x, y). The value of f at spatial coordinates (x, y) is a scalar whose physical meaning is determined by the image source, and whose value is proportional to the energy radiated by the physical source. The interval [L_min, L_max] of values that f can take is called the gray scale (the values themselves are brightness or gray levels). In practice, this interval is often shifted to [0, 1] or [0, L - 1], where 0 means black and 1 (or L - 1) means white; all intermediate values are shades of gray.
2. Image Sampling and Quantization
1. Basic concepts of sampling and quantization
A continuous image must be converted into digital form before it can be processed by computer. An image f(x, y) is continuous in its x- and y-coordinates and in its amplitude. To digitize it, the function must be sampled in both coordinates and in amplitude. Digitizing the coordinate values is called sampling; digitizing the amplitude values is called quantization.
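The two steps can be illustrated with a minimal Python sketch (the continuous `scene` function below is hypothetical, chosen only so that its amplitudes fall in [0, 1)):

```python
import math

def sample_and_quantize(f, M, N, k):
    """Sample a continuous image f(s, t) defined on [0, 1) x [0, 1)
    on an M x N grid, then quantize its amplitude (assumed in [0, 1))
    to 2**k discrete gray levels."""
    L = 2 ** k                                  # number of gray levels
    img = []
    for x in range(M):                          # sampling: discretize coordinates
        row = []
        for y in range(N):
            v = f(x / M, y / N)                 # continuous amplitude in [0, 1)
            row.append(min(int(v * L), L - 1))  # quantization: discretize amplitude
        img.append(row)
    return img

# A smooth hypothetical "scene" with values in [0, 1)
def scene(s, t):
    return 0.25 * (1 + math.sin(2 * math.pi * s)) * (1 + math.cos(2 * math.pi * t)) * 0.999

digital = sample_and_quantize(scene, 4, 4, k=3)  # a 4 x 4 image with 8 gray levels
```

Each output pixel is an integer in {0, ..., 2^k - 1}, i.e., a quantized gray level.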
2. Digital Image Representation
Let f(s, t) denote a continuous image function of two continuous variables, s and t. Suppose the continuous image is sampled into a digital image f(x, y) containing M rows and N columns, where (x, y) are discrete coordinates: x = 0, 1, 2, ..., M - 1 and y = 0, 1, 2, ..., N - 1.
In general, the value of the digital image at any coordinates (x, y) is denoted f(x, y), where x and y are integers.
The section of the real plane spanned by the coordinates of an image is called the spatial domain, and x and y are called spatial variables or spatial coordinates.
When an appreciable number of pixels in an image exhibit a high dynamic range, the image is said to have high contrast.
The number of bits required to store a digital image with M rows, N columns, and k bits per pixel is:

b = M × N × k

When M = N, the above formula becomes:

b = N²k

One byte equals 8 bits, and one megabyte equals 10^6 bytes.
When an image has 2^k possible gray levels, we usually call the image a "k-bit image".
3. Linear indexing and coordinate indexing
The convention in which the position of a pixel is given by its two-dimensional coordinates is called coordinate indexing (or subscript indexing).
A linear index, widely used in image processing, consists of a one-dimensional string of nonnegative integers obtained by computing an offset from the coordinates.
There are two principal types of linear index: one based on a row scan of the image, and the other based on a column scan.
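One common convention (matching column-major storage) computes the column-scan linear index of pixel (x, y) in an M-row image as α = M·y + x; a small sketch under that assumption:

```python
def coord_to_linear(x, y, M):
    """Column-scan linear index of pixel (x, y) in an image with M rows:
    alpha = M*y + x (x is the row index, y the column index)."""
    return M * y + x

def linear_to_coord(alpha, M):
    """Inverse mapping: recover (x, y) from a column-scan linear index."""
    return alpha % M, alpha // M
```

A row-scan index would simply exchange the roles of the rows and columns (α = N·x + y for an N-column image).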
4. Spatial resolution and grayscale resolution
Spatial resolution is a measure of the smallest discernible detail in an image.
Grayscale resolution refers to the smallest discernible change in gray level; in practice, it usually refers to the number of bits used to quantize the gray scale.
5. Image interpolation
Interpolation is commonly used in tasks such as image enlargement, shrinking, rotation, and geometric correction.
Interpolation is the process of using known data to estimate values at unknown locations.
Bilinear interpolation uses the 4 nearest neighbors to estimate the gray level at a given location. Let (x, y) denote the coordinates of the location to be assigned a gray value, and let v(x, y) denote that gray value. For bilinear interpolation, the formula is:

v(x, y) = ax + by + cxy + d

where the 4 coefficients are determined from the 4 equations in 4 unknowns written using the 4 nearest neighbors of the point (x, y). Bilinear interpolation gives much better results than nearest-neighbor interpolation, but the amount of computation increases correspondingly.
Bicubic interpolation involves the 16 nearest neighbors. Its formula is:

v(x, y) = Σ_{i=0}^{3} Σ_{j=0}^{3} a_{ij} x^i y^j

where the 16 coefficients are determined from the 16 equations in 16 unknowns written using the 16 nearest neighbors of the point (x, y). In general, bicubic interpolation preserves fine detail better than bilinear interpolation.
More neighbors can be used when interpolating, and there are more sophisticated techniques based on splines and wavelets that can yield still better results.
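A minimal sketch of bilinear interpolation, written directly in terms of the fractional offsets (an equivalent weighting to solving the four-coefficient system above; the tiny 2 × 2 `img` is only illustrative):

```python
def bilinear(img, x, y):
    """Estimate the gray level at fractional coordinates (x, y) from the
    4 nearest neighbors (x is the row coordinate, y the column coordinate)."""
    x0, y0 = int(x), int(y)
    x1 = min(x0 + 1, len(img) - 1)        # clip at the image border
    y1 = min(y0 + 1, len(img[0]) - 1)
    a, b = x - x0, y - y0                 # fractional offsets in [0, 1)
    return ((1 - a) * (1 - b) * img[x0][y0] + (1 - a) * b * img[x0][y1]
            + a * (1 - b) * img[x1][y0] + a * b * img[x1][y1])

img = [[0, 10],
       [20, 30]]
v = bilinear(img, 0.5, 0.5)   # midpoint: the average of the 4 neighbors
```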
3. Some Basic Relationships Between Pixels
1. Neighbors of a pixel
A pixel p at coordinates (x, y) has 2 horizontal neighbors and 2 vertical neighbors, whose coordinates are:

(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)

This group of pixels is called the 4-neighbors of p, denoted N_4(p).
The coordinates of the 4 diagonal neighbors of p are:

(x + 1, y + 1), (x + 1, y - 1), (x - 1, y + 1), (x - 1, y - 1)

and they are denoted N_D(p). These diagonal neighbors, together with the 4-neighbors, are collectively called the 8-neighbors of p, denoted N_8(p).
The set of image locations of the neighbors of a point p is called a neighborhood of p. If the neighborhood contains p, it is called a closed neighborhood; otherwise, it is called an open neighborhood.
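The neighborhoods above can be expressed compactly as sets (coordinates are not clipped to the image border in this sketch):

```python
def n4(p):
    """N_4(p): the 2 horizontal and 2 vertical neighbors of p = (x, y)."""
    x, y = p
    return {(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)}

def nd(p):
    """N_D(p): the 4 diagonal neighbors of p."""
    x, y = p
    return {(x + 1, y + 1), (x + 1, y - 1), (x - 1, y + 1), (x - 1, y - 1)}

def n8(p):
    """N_8(p): the union of N_4(p) and N_D(p)."""
    return n4(p) | nd(p)
```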
2. Adjacency, connectivity, regions and boundaries
Let V be the set of gray values used to define adjacency. In a binary image, V = {1} if we are referring to adjacency of pixels with value 1.
Three types of adjacency:
1. 4-adjacency: two pixels p and q with values in V are 4-adjacent if q is in the set N_4(p).
2. 8-adjacency: two pixels p and q with values in V are 8-adjacent if q is in the set N_8(p).
3. m-adjacency (mixed adjacency): two pixels p and q with values in V are m-adjacent if (a) q is in N_4(p), or (b) q is in N_D(p) and the set N_4(p) ∩ N_4(q) contains no pixels with values in V.
Mixed adjacency is a modification of 8-adjacency; its purpose is to eliminate the ambiguities that can arise when 8-adjacency is used.
A digital path (or curve) from a pixel p at coordinates (x_0, y_0) to a pixel q at coordinates (x_n, y_n) is a sequence of distinct pixels with coordinates:

(x_0, y_0), (x_1, y_1), ..., (x_n, y_n)

where pixels (x_i, y_i) and (x_{i-1}, y_{i-1}) are adjacent for 1 ≤ i ≤ n. Here, n is the length of the path. If (x_0, y_0) = (x_n, y_n), the path is a closed path.
Let S denote a subset of pixels in an image. Two pixels p and q are said to be connected in S if there exists a path between them consisting entirely of pixels in S. For any pixel p in S, the set of pixels in S that are connected to p is called a connected component of S. If S has only one connected component, then S is called a connected set.
Let R denote a subset of pixels in an image. If R is a connected set, R is called a region of the image. Two regions R_i and R_j are said to be adjacent if their union forms a connected set. Regions that are not adjacent are called disjoint regions.
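A sketch of extracting connected components under 4-adjacency with a breadth-first scan (the tiny binary image is illustrative, with V = {1} as in the text):

```python
from collections import deque

def connected_components(img, V=(1,)):
    """Connected components of pixels whose values lie in V, using
    4-adjacency.  img is a list of rows; returns a list of sets of
    (row, col) coordinates, one set per component."""
    M, N = len(img), len(img[0])
    seen, components = set(), []
    for sx in range(M):
        for sy in range(N):
            if img[sx][sy] not in V or (sx, sy) in seen:
                continue
            comp, queue = set(), deque([(sx, sy)])
            seen.add((sx, sy))
            while queue:
                x, y = queue.popleft()
                comp.add((x, y))
                # visit the 4-neighbors that lie inside the image
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if (0 <= nx < M and 0 <= ny < N
                            and img[nx][ny] in V and (nx, ny) not in seen):
                        seen.add((nx, ny))
                        queue.append((nx, ny))
            components.append(comp)
    return components

binary = [[1, 1, 0],
          [0, 0, 0],
          [0, 1, 1]]
comps = connected_components(binary)   # two 4-connected components
```

Using 8-adjacency instead would only require extending the neighbor tuple with the 4 diagonal offsets.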
3. Distance measures
For pixels p, q, and s with coordinates (x, y), (u, v), and (w, z), respectively, D is a distance function (or metric) if:
(a) D(p, q) ≥ 0, with D(p, q) = 0 if and only if p = q;
(b) D(p, q) = D(q, p); and
(c) D(p, s) ≤ D(p, q) + D(q, s).
The Euclidean distance between p and q is defined as:

D_e(p, q) = [(x - u)² + (y - v)²]^{1/2}

For this distance measure, the pixels whose distance from (x, y) is less than or equal to some value r form a disk of radius r centered at (x, y).
The D_4 (city-block) distance between p and q is defined as:

D_4(p, q) = |x - u| + |y - v|

In this case, the pixels whose D_4 distance from (x, y) is less than or equal to r form a rhombus (diamond) centered at (x, y).
The D_8 (chessboard) distance between p and q is defined as:

D_8(p, q) = max(|x - u|, |y - v|)

In this case, the pixels whose D_8 distance from (x, y) is less than or equal to r form a square centered at (x, y).
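The three measures side by side (for p = (3, 4) and q = (0, 0) they give D_e = 5, D_4 = 7, D_8 = 4):

```python
def d_euclidean(p, q):
    """Euclidean distance D_e(p, q)."""
    return ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5

def d4(p, q):
    """City-block distance D_4(p, q) = |x - u| + |y - v|."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def d8(p, q):
    """Chessboard distance D_8(p, q) = max(|x - u|, |y - v|)."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

p, q = (3, 4), (0, 0)
```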
4. Spatial operations
Spatial operations fall into three categories:
- single-pixel operations;
- neighborhood operations;
- geometric spatial transformations.

Single-pixel operations
The gray levels of individual pixels in an image are changed using a transformation function:

s = T(z)

where z is the gray level of a pixel in the original image, and s is the gray level of the corresponding pixel in the processed image.

Neighborhood operations
Let S_{xy} denote the set of coordinates of a neighborhood centered on an arbitrary point (x, y) in an image f. Neighborhood processing generates a corresponding pixel at the same coordinates in an output image g; the value of that pixel is determined by a specified operation on the pixels of the input image whose coordinates lie in the set S_{xy}. For example, for averaging over an m × n neighborhood, the formula is:

g(x, y) = (1 / mn) Σ_{(r, c) ∈ S_{xy}} f(r, c)

where r and c are the row and column coordinates of the pixels belonging to the set S_{xy}.
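A sketch of the neighborhood-averaging operation, with border pixels handled by clipping coordinates to the image (one of several possible border conventions):

```python
def neighborhood_average(img, x, y, m=3, n=3):
    """Average of the m x n neighborhood S_xy centered at (x, y).
    Out-of-range neighbor coordinates are clipped to the image border."""
    M, N = len(img), len(img[0])
    total = 0
    for r in range(x - m // 2, x + m // 2 + 1):
        for c in range(y - n // 2, y + n // 2 + 1):
            total += img[min(max(r, 0), M - 1)][min(max(c, 0), N - 1)]
    return total / (m * n)

img = [[0, 0, 0],
       [0, 9, 0],
       [0, 0, 0]]
g_center = neighborhood_average(img, 1, 1)   # 9 / 9 = 1.0
```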
Geometric transformations
Geometric transformations modify the spatial arrangement of the pixels in an image.
The geometric transformation of a digital image consists of two basic operations:
- a spatial transformation of coordinates;
- gray-level interpolation, i.e., assigning gray values to the spatially transformed pixels.
The coordinate transformation can be expressed as:

(x', y') = T{(x, y)}

The key property of the two-dimensional affine transformation is that it preserves points, straight lines, and planes. Using a 3 × 3 matrix, all 4 affine transformations (scaling, rotation, translation, and shearing) can be represented in homogeneous coordinates:

[x', y', 1]^T = A [x, y, 1]^T

This transformation scales, rotates, translates, or shears the image according to the element values chosen for the matrix A.
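A sketch in homogeneous coordinates, treating points as column vectors so that [x', y', 1]^T = A [x, y, 1]^T (texts differ on row- versus column-vector conventions, so the matrices below follow that assumption):

```python
import math

def affine_apply(A, x, y):
    """Apply a 3x3 affine matrix A to the point (x, y) written in
    homogeneous coordinates [x, y, 1], returning (x', y')."""
    xh = A[0][0] * x + A[0][1] * y + A[0][2]
    yh = A[1][0] * x + A[1][1] * y + A[1][2]
    return xh, yh

def translation(tx, ty):
    return [[1, 0, tx], [0, 1, ty], [0, 0, 1]]

def rotation(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]

def scaling(cx, cy):
    return [[cx, 0, 0], [0, cy, 0], [0, 0, 1]]

def shear(s_v, s_h):
    return [[1, s_v, 0], [s_h, 1, 0], [0, 0, 1]]
```

In practice, the transformed coordinates are generally not integers, which is why the second operation, gray-level interpolation, is needed.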
5. Image transforms
A particularly important class of two-dimensional linear transforms, denoted T(u, v), has the general form:

T(u, v) = Σ_{x=0}^{M-1} Σ_{y=0}^{N-1} f(x, y) r(x, y, u, v)

where f(x, y) is the input image and r(x, y, u, v) is called the forward transformation kernel (x and y are spatial variables, M and N are the numbers of rows and columns of f, and u and v are called the transform variables). T(u, v) is called the forward transform of f(x, y).
Given T(u, v), we can recover f(x, y) using the inverse transform:

f(x, y) = Σ_{u=0}^{M-1} Σ_{v=0}^{N-1} T(u, v) s(x, y, u, v)

for x = 0, 1, ..., M - 1 and y = 0, 1, ..., N - 1, where s(x, y, u, v) is called the inverse transformation kernel.
The nature of a transform is determined by its kernels.
A transform of particular importance in digital image processing is the Fourier transform, which has the forward and inverse kernels:

r(x, y, u, v) = e^{-j2π(ux/M + vy/N)}

and

s(x, y, u, v) = (1/MN) e^{j2π(ux/M + vy/N)}

where j = √(-1), so these kernels are complex functions. Substituting them into the general expressions gives the discrete Fourier transform pair:

T(u, v) = Σ_{x=0}^{M-1} Σ_{y=0}^{N-1} f(x, y) e^{-j2π(ux/M + vy/N)}

and

f(x, y) = (1/MN) Σ_{u=0}^{M-1} Σ_{v=0}^{N-1} T(u, v) e^{j2π(ux/M + vy/N)}
When the forward and inverse transformation kernels are separable and symmetric, and f(x, y) is a square image of size M × M, the transform can be expressed in matrix form:

T = AFA

where F is the M × M matrix containing the elements of f(x, y), A is the M × M transformation matrix with elements a_{ij} = r_1(i, j), and T is the resulting M × M transform with elements T(u, v).
To obtain the inverse transform, we pre- and post-multiply by an inverse transformation matrix B:

BTB = BAFAB

If B = A^{-1}, then:

F = BTB

This formula shows that f(x, y), or equivalently F, can be recovered completely from its forward transform. If B ≠ A^{-1}, then only an approximation is obtained:

F̂ = BAFAB
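The matrix form can be demonstrated with a 2 × 2 transformation matrix that is symmetric and its own inverse (a scaled 2-point Hadamard matrix, chosen here so that B = A⁻¹ = A and the recovery F = BTB is exact):

```python
def matmul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# A symmetric, separable transformation matrix satisfying A @ A = I
r = 2 ** -0.5
A = [[r, r], [r, -r]]

F = [[1.0, 2.0], [3.0, 4.0]]     # the image, in matrix form
T = matmul(matmul(A, F), A)      # forward transform:  T = AFA
F_rec = matmul(matmul(A, T), A)  # inverse: B = A^{-1} = A, so F = BTB = ATA
```

Because A is its own inverse, `F_rec` reproduces `F` up to floating-point rounding, illustrating that F is fully recoverable from its forward transform.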
6. Image gray levels as random variables
The probability p(z_k) of gray level z_k occurring in an image is estimated as:

p(z_k) = n_k / MN,   k = 0, 1, 2, ..., L - 1

where n_k is the number of times gray level z_k appears in the image, and MN is the total number of pixels. Clearly:

Σ_{k=0}^{L-1} p(z_k) = 1

The mean gray level is:

m = Σ_{k=0}^{L-1} z_k p(z_k)

Similarly, the variance of the gray levels is:

σ² = Σ_{k=0}^{L-1} (z_k - m)² p(z_k)

In general, the nth central moment of a random variable z about its mean is defined as:

μ_n(z) = Σ_{k=0}^{L-1} (z_k - m)^n p(z_k)
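The histogram-based estimates above can be sketched as follows (the 2 × 2 image and L = 8 gray levels are illustrative):

```python
def gray_statistics(img, L=256):
    """Estimate p(z_k) = n_k / MN from an image with gray levels in
    {0, ..., L-1}, then compute the mean and variance of the gray
    levels directly from the normalized histogram."""
    MN = sum(len(row) for row in img)     # total number of pixels
    counts = [0] * L                      # n_k: occurrences of level k
    for row in img:
        for z in row:
            counts[z] += 1
    p = [n / MN for n in counts]          # p(z_k); sums to 1
    mean = sum(z * p[z] for z in range(L))
    var = sum((z - mean) ** 2 * p[z] for z in range(L))
    return p, mean, var

img = [[0, 2], [2, 4]]
p, mean, var = gray_statistics(img, L=8)
```

Replacing the squared term with (z_k - m)^n in the variance line gives the nth central moment.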