Digital Image Study Notes -- The Basics of Digital Images

One. An Imaging Model

We represent an image as a two-dimensional function f(x, y). The value of f at spatial coordinates (x, y) is a scalar; its physical meaning is determined by the image source, and its value is proportional to the energy radiated by the physical source.

The interval [L_{min}, L_{max}] is called the intensity (gray) scale. In practice, this interval is often shifted to [0, 1] or [0, C], where L = 0 denotes black and L = 1 (or L = C) denotes white.

 

Two. Image Sampling and Quantization

1. Basic concepts of sampling and quantization

A continuous image must be converted into digital form. The x- and y-coordinates of an image are continuous, and so is its amplitude. To digitize the function, we sample both its coordinates and its amplitude. Digitizing the coordinate values is called sampling, and digitizing the amplitude values is called quantization.

 

2. Digital Image Representation

Let f(s, t) denote a continuous image function of two continuous variables s and t.

Suppose we sample the continuous image into a digital image f(x, y) containing M rows and N columns, where (x, y) are discrete coordinates.

In general, the value of a digital image at any coordinates (x, y) is denoted f(x, y), where x and y are integers.

The section of the real plane spanned by the coordinates of an image is called the spatial domain, and x and y are called spatial variables or spatial coordinates.

f(x, y) = \begin{bmatrix} f(0,0) & f(0,1) & \cdots & f(0,N-1) \\ f(1,0) & f(1,1) & \cdots & f(1,N-1) \\ \vdots & \vdots & & \vdots \\ f(M-1,0) & f(M-1,1) & \cdots & f(M-1,N-1) \end{bmatrix}

When an appreciable number of pixels in an image have a high dynamic range, the image is said to have high contrast.

The number of bits b required to store a digital image of M rows, N columns, and k bits per pixel is:

                                     b = MNk

When M = N, the above formula becomes:

                                     b = N^2k

One byte equals 8 bits, and one megabyte equals 10^6 bytes.

When an image has 2^k possible gray levels, we usually call it a "k-bit image".
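As a quick sanity check on b = MNk, here is a minimal Python sketch (the 1024 x 1024, 8-bit image is just an example):

```python
def image_storage_bits(M: int, N: int, k: int) -> int:
    """Bits needed to store an M x N image quantized to k bits per pixel."""
    return M * N * k

# Example: a 1024 x 1024 8-bit image.
b = image_storage_bits(1024, 1024, 8)
print(b // 8 / 10**6, "megabytes")  # 1.048576 MB (10^6 bytes per megabyte)
```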

 

3. Linear indexing and coordinate indexing

The convention that the position of a pixel is given by its two-dimensional coordinates is called a coordinate index or subscript index.

A linear index, widely used in image processing, is a one-dimensional sequence of non-negative integers obtained by computing the offset of each coordinate from (0, 0).

There are two main kinds of linear index: one based on a row scan of the image (row-major order), the other based on a column scan (column-major order).
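A minimal sketch of the two conventions, where x is the row index and y the column index, following the matrix representation above:

```python
def row_major_index(x: int, y: int, N: int) -> int:
    """Row-scan (row-major) linear index of pixel (x, y) in an image with N columns."""
    return x * N + y

def column_major_index(x: int, y: int, M: int) -> int:
    """Column-scan (column-major) linear index of pixel (x, y) in an image with M rows."""
    return y * M + x

# Example: in a 3 x 4 image, pixel (1, 2) has row-major index 1*4 + 2 = 6
# and column-major index 2*3 + 1 = 7.
```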

 

4. Spatial resolution and grayscale resolution

Spatial resolution is a measure of the smallest discernible detail in an image.

Grayscale resolution refers to the smallest discernible change in grayscale.

Grayscale resolution usually refers to the number of bits used to quantize grayscale.

 

5. Image interpolation

Interpolation is commonly used in tasks such as image enlargement, reduction, rotation, and geometric correction.

Interpolation is the process of using known data to estimate values at unknown locations.

Bilinear interpolation uses the 4 nearest neighbors to estimate the gray level at a given location. Let (x, y) denote the coordinates of the location to which we want to assign a gray value, and let v(x, y) denote that gray value. The formula is:

                                                                       v(x, y) = ax + by + cxy + d

In the above formula, the four coefficients are determined from the four equations written using the 4 nearest neighbors of point (x, y). Bilinear interpolation gives much better results than nearest-neighbor interpolation, at a correspondingly higher computational cost.
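A minimal Python sketch of bilinear interpolation under these definitions (the row/column indexing convention and the clipping at image borders are simplifying assumptions):

```python
import numpy as np

def bilinear(img: np.ndarray, x: float, y: float) -> float:
    """Estimate the gray value at non-integer coordinates (x, y) from the
    4 nearest neighbors; x is the row coordinate, y the column coordinate."""
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1, y1 = min(x0 + 1, img.shape[0] - 1), min(y0 + 1, img.shape[1] - 1)
    dx, dy = x - x0, y - y0
    # Weighted average of the 4 neighbors; algebraically this is
    # v(x, y) = ax + by + cxy + d fitted to those 4 points.
    return ((1 - dx) * (1 - dy) * img[x0, y0]
            + dx * (1 - dy) * img[x1, y0]
            + (1 - dx) * dy * img[x0, y1]
            + dx * dy * img[x1, y1])
```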

Bicubic interpolation uses the 16 nearest neighbors. Its formula is:

                                                                       v(x, y) = \sum_{i = 0}^{3} \sum_{j=0}^{3}a_{ij}x^{i}y^{j}

In the above formula, the 16 coefficients are determined from the 16 equations written using the 16 nearest neighbors of point (x, y). In general, bicubic interpolation preserves detail better than bilinear interpolation.

More neighbors can be used when interpolating, and more sophisticated techniques based on splines and wavelets can give better results.
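In practice a library call is typical. For example, SciPy's ndimage.zoom resizes an image with spline interpolation of a chosen order; roughly, order 0 is nearest neighbor, order 1 is bilinear, and order 3 is a cubic spline (close in spirit to bicubic). The test image here is hypothetical:

```python
import numpy as np
from scipy import ndimage

img = np.random.randint(0, 256, (64, 64)).astype(float)  # toy 8-bit image

near  = ndimage.zoom(img, 4, order=0)  # nearest neighbor
bilin = ndimage.zoom(img, 4, order=1)  # spline order 1 (bilinear)
cubic = ndimage.zoom(img, 4, order=3)  # spline order 3 (cubic)
```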

 

Three. Some Basic Relationships Between Pixels

1. Neighbors of a pixel

A pixel p at coordinates (x, y) has 2 horizontal and 2 vertical neighbors, whose coordinates are:

                                  (x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)

This set of pixels is called the 4-neighborhood of p, denoted N_{4}(p).

The coordinates of the 4 diagonal neighbors of p are:

                       (x + 1, y + 1), (x + 1, y - 1), (x - 1, y + 1), (x - 1, y - 1)

These are denoted N_{D}(p). Together with the 4-neighborhood they form the 8-neighborhood of p, denoted N_{8}(p).

The set of image locations of the neighbors of a point p is called the neighborhood of p. If the neighborhood contains p itself, it is called a closed neighborhood; otherwise it is an open neighborhood.
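These neighborhoods are easy to express in code; a minimal sketch (image-border checks are omitted):

```python
def n4(x, y):
    """4-neighborhood N_4(p) of a pixel p = (x, y)."""
    return [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]

def nd(x, y):
    """The 4 diagonal neighbors N_D(p)."""
    return [(x + 1, y + 1), (x + 1, y - 1), (x - 1, y + 1), (x - 1, y - 1)]

def n8(x, y):
    """8-neighborhood N_8(p): the union of N_4(p) and N_D(p)."""
    return n4(x, y) + nd(x, y)
```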

2. Adjacency, connectivity, regions and boundaries

Let V be the set of gray values used to define adjacency. In a binary image, V = { 1 } if we are referring to adjacency of pixels with value 1.

Three types of adjacency:

1. 4-adjacency. Two pixels p and q with values in V are 4-adjacent if q is in the set N_{4}(p).

2. 8-adjacency. Two pixels p and q with values in V are 8-adjacent if q is in the set N_{8}(p).

3. m-adjacency (mixed adjacency). Two pixels p and q with values in V are m-adjacent if: (a) q is in N_{4}(p), or (b) q is in N_{D}(p) and the set N_{4}(p) \cap N_{4}(q) contains no pixels whose values are in V.

Mixed adjacency is a modification of 8-adjacency; its purpose is to eliminate the ambiguities that can arise when 8-adjacency is used.
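A sketch of the m-adjacency test, reusing n4 and nd from the neighborhood sketch above (img is assumed to be a 2-D NumPy array and p, q coordinate tuples):

```python
def m_adjacent(img, p, q, V=frozenset({1})):
    """Test whether the pixels at coordinates p and q are m-adjacent.
    V is the set of values that defines adjacency."""
    def valued(c):
        # True if c lies inside the image and its value is in V.
        return (0 <= c[0] < img.shape[0] and 0 <= c[1] < img.shape[1]
                and img[c] in V)
    if not (valued(p) and valued(q)):
        return False
    if q in n4(*p):                        # condition (a)
        return True
    if q in nd(*p):                        # condition (b)
        common = set(n4(*p)) & set(n4(*q))
        return not any(valued(c) for c in common)
    return False
```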

A digital path (or curve) from a pixel p with coordinates (x_0, y_0) to a pixel q with coordinates (x_n, y_n) is a sequence of distinct pixels with coordinates:

                                                             (x_0, y_0), (x_1, y_1), ..., (x_n, y_n)

where the points (x_i, y_i) and (x_{i - 1}, y_{i - 1}) are adjacent for 1 \le i \le n. Here n is the length of the path. If (x_0, y_0) = (x_n, y_n), the path is a closed path.

 

Let S denote a subset of pixels in an image. Two pixels p and q are said to be connected in S if there exists a path between them consisting entirely of pixels in S. For any pixel p in S, the set of pixels in S that are connected to p is called a connected component of S. If S has only one connected component, S is called a connected set.

Let R denote a subset of pixels in an image. If R is a connected set, R is called a region of the image. Two regions R_i and R_j are called adjacent regions if their union forms a connected set. Regions that are not adjacent are called disjoint regions.
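SciPy can extract connected components directly; a minimal sketch showing how the choice of adjacency changes the result (the 3 x 3 identity image is just a toy example):

```python
import numpy as np
from scipy import ndimage

# A diagonal line of 1-pixels: three components under 4-adjacency,
# one component under 8-adjacency.
img = np.eye(3, dtype=int)
_, n_4conn = ndimage.label(img)                             # 4-adjacency (default)
_, n_8conn = ndimage.label(img, structure=np.ones((3, 3)))  # 8-adjacency
print(n_4conn, n_8conn)  # 3 1
```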

3. Distance measures

For pixels p, q, and s with coordinates (x, y), (u, v), and (w, z), respectively, D is a distance function (or distance measure) if:

D(p, q) \geq 0 \quad (D(p, q) = 0 if and only if p = q),

D(p, q) = D(q, p), and

D(p, s) \le D(p, q) + D(q, s)

The Euclidean distance between p and q is defined as:

                             D_{e}(p, q) = [(x - u)^2 + (y - v)^2]^{\frac {1}{2}}

For this distance measure, the pixels whose distance from (x, y) is less than or equal to r form a disk of radius r centered at (x, y).

The D_4 distance (city-block distance) between p and q is defined as:

                                           D_4(p, q) = |x - u| + |y - v|

In this case, the pixels whose D_4 distance from (x, y) is less than or equal to a value d form a diamond (rhombus) centered at (x, y).

 

The D_8 distance (chessboard distance) between p and q is defined as:

                                           D_8(p, q) = max(|x - u|, |y - v|)

In this case, the pixels whose D_8 distance from (x, y) is less than or equal to a value d form a square centered at (x, y).
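All three distance measures in a minimal Python sketch:

```python
import math

def d_euclidean(p, q):
    """Euclidean distance D_e between pixels p = (x, y) and q = (u, v)."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def d4(p, q):
    """City-block distance D_4."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def d8(p, q):
    """Chessboard distance D_8."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

# Example: p = (0, 0), q = (3, 4)  ->  D_e = 5.0, D_4 = 7, D_8 = 4.
```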

4. Spatial operations

Spatial operations fall into three categories:

  1. Single-pixel operations;
  2. Neighborhood operations;
  3. Geometric spatial transformations.

 

Single-pixel operations

A transformation function T changes the gray level of individual pixels in an image:

                                s = T(z)

In the formula, z is the gray level of a pixel in the original image, and s is the gray level of the corresponding pixel in the processed image.
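A classic single-pixel operation is the image negative, T(z) = (L - 1) - z; a minimal sketch (L = 256 assumes an 8-bit image):

```python
import numpy as np

def negative(img: np.ndarray, L: int = 256) -> np.ndarray:
    """Single-pixel operation s = T(z): the negative of an L-level image,
    T(z) = (L - 1) - z, applied independently to every pixel."""
    return (L - 1) - img
```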

 

Neighborhood operations

Let S_{xy} denote the set of coordinates of a neighborhood centered on an arbitrary point (x, y) in an image f. Neighborhood processing generates a corresponding pixel at the same coordinates in an output image g, whose value is determined by a prescribed operation on the pixels of the input image whose coordinates are in the set S_{xy}. For neighborhood averaging, the formula is:

                                                                   g(x, y) = \frac {1}{mn} \sum_{(r, c) \in S_{xy} }f(r, c)

where r and c are the row and column coordinates of the pixels belonging to the set S_{xy}, and the neighborhood has m rows and n columns.
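A minimal sketch of neighborhood averaging using SciPy's uniform_filter, which computes this m x n local mean at every pixel (border handling is SciPy's default reflection, an implementation choice):

```python
import numpy as np
from scipy import ndimage

def local_average(f: np.ndarray, m: int = 3, n: int = 3) -> np.ndarray:
    """Neighborhood averaging over an m x n window S_xy centered at each pixel."""
    return ndimage.uniform_filter(f.astype(float), size=(m, n))
```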

 

Geometric spatial transformations

Geometric transformations change the spatial arrangement of pixels in an image.

The geometric transformation of digital images consists of two basic operations:

  1. A spatial transformation of coordinates;
  2. Grayscale interpolation, which assigns gray values to the pixels after the spatial transformation.

The coordinate transformation can be expressed as:

                      \left [ \begin{matrix} x' \\ y' \end{matrix} \right] = T \left [ \begin{matrix} x \\ y \end{matrix} \right] = \left[ \begin{matrix} t_{11} & t_{12} \\ t_{21} & t_{22} \end{matrix} \right ] \left[ \begin{matrix} x \\ y \end{matrix} \right ]

The key property of a two-dimensional affine transformation is that it preserves points, lines and planes.

Using a 3 x 3 matrix in homogeneous coordinates, it is possible to represent all four affine transformations (scaling, rotation, translation, and shearing):

                    \left[ \begin{matrix} x' \\ y' \\ 1 \end{matrix} \right ] = A \left[ \begin{matrix} x \\ y \\ 1 \end{matrix} \right ] = \left[ \begin{matrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ 0 & 0 & 1 \end{matrix} \right ] \left [ \begin{matrix} x \\ y \\ 1 \end{matrix} \right ]

This transformation scales, rotates, translates, or shears the image, depending on the element values chosen for the matrix A.
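As an illustration, a minimal sketch that builds the homogeneous matrix A for a pure rotation and maps one coordinate pair through it (rotation about the origin is an assumption; image libraries often rotate about the image center instead):

```python
import numpy as np

def rotation_matrix(theta: float) -> np.ndarray:
    """Homogeneous 3x3 affine matrix A for rotation by theta radians about the origin."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])

# Map pixel coordinates (x, y) = (2, 3) through a 90-degree rotation.
A = rotation_matrix(np.pi / 2)
x_prime, y_prime, _ = A @ np.array([2.0, 3.0, 1.0])
```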

 

 

5. Image transforms

A particularly important class of transforms, written T(u, v), is the two-dimensional linear transform; its general form is:

                       T(u, v) = \sum_{x=0}^{M - 1} \sum_{y=0}^{N-1}f(x, y)r(x, y, u, v)

where f(x, y) is the input image and r(x, y, u, v) is called the forward transformation kernel (x and y are spatial variables; M and N are the number of rows and columns of f; u and v are called transform variables). T(u, v) is called the forward transform of f(x, y).

 

Given T(u, v), we can recover f(x, y) using the inverse transform of T(u, v):

                      f(x, y) = \sum_{u = 0}^{M - 1} \sum_{v = 0}^{N - 1} T(u, v)s(x, y, u, v)

In the formula, x = 0, 1, 2, ..., M - 1 and y = 0, 1, 2, ..., N - 1, and s(x, y, u, v) is called the inverse transformation kernel.

The nature of the transformation depends on the transformation kernel.

A transform of particular importance in digital image processing is the Fourier transform, which has the following forward and inverse kernels:

                             r(x, y, u, v) = e^{-j2 \pi (\frac {ux} {M} + \frac {vy} {N})}

and

                             s(x, y, u, v) = \frac {1} {MN} e^{j2 \pi (\frac {ux} {M} + \frac {vy} {N})}

where j = \sqrt{-1}, so these transformation kernels are complex functions. Substituting them into the two general forms above yields the discrete Fourier transform pair:

                           T(u, v) = \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f(x, y)e^{-j2 \pi (\frac {ux}{M} + \frac {vy} {N})}

and

                          f(x, y) = \frac {1} {MN} (\sum_{u=0}^{M-1} \sum_{v=0}^{N-1}T(u, v)e^{j2 \pi (ux / M + vy / N)})
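To make the transform pair concrete, here is a minimal NumPy sketch that evaluates the forward formula directly and checks it against np.fft.fft2, which computes the same unnormalized forward transform:

```python
import numpy as np

def dft2_naive(f: np.ndarray) -> np.ndarray:
    """2-D DFT straight from the formula: T(u,v) = sum_x sum_y f(x,y) e^{-j2pi(ux/M + vy/N)}."""
    M, N = f.shape
    x = np.arange(M).reshape(M, 1)
    u = np.arange(M).reshape(1, M)
    y = np.arange(N).reshape(N, 1)
    v = np.arange(N).reshape(1, N)
    Wm = np.exp(-2j * np.pi * x * u / M)   # e^{-j 2pi ux / M}
    Wn = np.exp(-2j * np.pi * y * v / N)   # e^{-j 2pi vy / N}
    return Wm.T @ f @ Wn

f = np.random.rand(8, 8)
assert np.allclose(dft2_naive(f), np.fft.fft2(f))
```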

 

When the forward and inverse transformation kernels are separable and symmetric, and f(x, y) is a square image of size M x M, the transform can be expressed in matrix form:

                                                 T = AFA

In the formula, F is the M x M matrix containing the elements of f(x, y); A is the M x M matrix with elements a_{ij} = r_1(i, j); and T is the resulting M x M transform matrix, with elements T(u, v) for u, v = 0, 1, 2, ..., M - 1.

To obtain the inverse transform, we pre- and post-multiply this equation by an inverse transformation matrix B:

                                                  BTB = BAFAB

If B = A^{-1}, then:

                                                F = BTB

This formula shows that F (or, equivalently, f(x, y)) can be fully recovered from its forward transform. If B \neq A^{-1}, we obtain only an approximation:

                                              \hat F = BAFAB
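A numerical check of this matrix form for the DFT case (A is the unnormalized M x M DFT matrix, so B = A^{-1} exists and F is recovered exactly):

```python
import numpy as np

M = 8
i = np.arange(M)
A = np.exp(-2j * np.pi * np.outer(i, i) / M)  # a_ij = e^{-j 2pi ij / M}
B = np.linalg.inv(A)                          # here B = (1/M) * conj(A)

F = np.random.rand(M, M)
T = A @ F @ A                                 # forward transform, T = AFA
assert np.allclose(B @ T @ B, F)              # F = BTB: fully recovered
```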

6. Image grayscale and random variables

The probability p(z_k) of gray level z_k occurring in an image is estimated as:

                               p(z_k) = \frac {n_k} {MN}

In the formula, n_k is the number of times gray level z_k appears in the image, and MN is the total number of pixels. Clearly:

                                 \sum_{k =0}^{L-1}p(z_k) = 1

The mean gray level is: 

                                 m = \sum_{k=0}^{L - 1} z_k p(z_k)

Similarly, the variance of the gray levels is:

                                \sigma ^ 2 = \sum_{k=0}^{L-1} (z_k - m)^2 p(z_k)

 

In general, the nth central moment of a random variable z about its mean is defined as:

                                \mu_n (z) = \sum_{k = 0}^{L - 1} (z_k - m)^n p(z_k)
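The formulas for p(z_k), the mean, the variance, and the nth central moments translate directly to code; a minimal sketch for an image with integer gray levels (an 8-bit image is the assumed default):

```python
import numpy as np

def gray_stats(img: np.ndarray, L: int = 256):
    """Histogram-based statistics of an L-level image with non-negative
    integer gray levels, following the formulas above."""
    n_k = np.bincount(img.ravel(), minlength=L)
    p = n_k / img.size                       # p(z_k) = n_k / MN; sums to 1
    z = np.arange(L)
    m = np.sum(z * p)                        # mean gray level
    mu = lambda n: np.sum((z - m) ** n * p)  # nth central moment
    return p, m, mu(2)                       # mu(2) is the variance
```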

 


Source: blog.csdn.net/jcl314159/article/details/116134572