Image recognition technology OpenCV | C++ version

Getting Started

Image and Signal

image

An image is a material reproduction of human visual perception. Images can be captured by optical devices or created artificially. With the development of digital acquisition technology and signal processing theory, more and more images are stored in digital form, so in many contexts the term "image" actually refers to a digital image. Image-related topics include image acquisition, image production, image analysis, and image processing. Images are divided into still images and moving images; an image is a visual signal. Professionally designed images can serve as a visual language for communication between people, and images also serve as historical material for understanding ethnic cultures and their origins. The many two-dimensional paintings, three-dimensional sculptures, and buildings in the history of world art can likewise be regarded as image-based cultural assets accumulated by human civilization from ancient times to the present.

Signal

In information theory, a signal is a flow of information. Most signals of interest can be expressed as functions of time or position, and any physical quantity that carries information can serve as a signal. The goal of signal processing is to extract the useful part of a signal while suppressing the interfering part. A signal that is continuous in every dimension is an analog signal; a signal that is discrete in every dimension is a digital signal. Digital signals are produced by discretizing analog signals in both the time and amplitude dimensions.

Digital Image Signal Representation and Classification
Binary image

Each pixel in a binary image can take only the value 0 or 1, so it is also called a 1-bit image. Because a binary image contains only two signal values, it is mostly used to represent the shape and contour of objects in an image.

Grayscale image

Also known as a gray-level image. Each pixel is represented by a brightness value from 0 (black) to 255 (white); values in between represent different gray levels. When the color components of a color model such as RGB are equal (R = G = B), the image shows a smooth transition from black to white. Grayscale is often used to represent the luminance of an image, much like a sketch in painting.

color image

Color images mainly come in two types, RGB and CMYK. An RGB color image is composed of three color components: red, green, and blue. A CMYK image is composed of four color components: cyan (C), magenta (M), yellow (Y), and black (K); CMYK images are mainly used in the printing industry. "Full color" means the colors of objects in the image closely match those seen by the naked human eye; in black-and-white images it refers to the brightness of objects. However, because the chemical properties of media such as color dyes differ from the response of the human eye, absolutely true color cannot be achieved. In a false-color image the relationship between object color and image color is altered, which arises in many situations: the colors of a photographic negative can be called false color, since they are the complements of the object colors. False color is also often used to represent parts of the electromagnetic spectrum that are not visible, as in remote sensing and astronomical spectroscopy.

false color image
multispectral image
Stereoscopic image

Stereo images are a pair of images of the same object taken from different angles; from them we can compute the depth information of the scene. A stereoscopic image adds depth, light, and shadow cues to a flat image so that it merely looks three-dimensional, whereas a true 3D image actually describes three-dimensional information.

3D image

A 3D image is composed of a stack of 2D images, each representing a two-dimensional slice of the object.

Pixels of Image Properties

pixel information

A pixel is the basic unit of image display. The word "pixel" combines "pix" (a common abbreviation of "picture") with the word "element", so a pixel is a "picture element"; it is sometimes also called a pel. Each such element is not a dot or a square but an abstract sample.

View image representation from monitor

Displays are most commonly described by their diagonal size in inches, but the core parameter of a display is its PPI (pixels per inch), also known as pixel density; physical size alone does not characterize the image.

LDPI low pixel density about 120 pixels per inch (36 x 36 px)

MDPI medium pixel density about 160 pixels per inch (48 x 48 px)

HDPI high pixel density approximately 240 pixels per inch (72 x 72 px)

XHDPI very high pixel density about 320 pixels per inch (96 x 96 px)

XXHDPI super high pixel density approximately 480 pixels per inch (144 x 144 px)

bitmap

A bitmap (known in Taiwan as a dot-matrix image), also called a raster graphic, is an image represented by an array of pixels.

The more bits used per pixel, the more colors are available and the more realistic the color representation, at the cost of a correspondingly larger amount of data.

Vector graphics

Vector graphics are images defined, in computer graphics, by geometric primitives based on mathematical equations, such as points, lines, and polygons. All modern computer monitors convert vector graphics to a raster image format; the raster image holds the value of every pixel on the screen and is stored in memory.

Color of image properties

human visual system

On the retina of the human eye, there are two types of photoreceptor cells: rod cells and cone cells

Rod cells: function mainly in low-light conditions and cannot distinguish color, which is why we cannot tell colors apart in dim light.

Cone cells: function in bright conditions. The human retina normally has three kinds of cone cells, sensitive to red (R), green (G), and blue (B): S cones respond mainly to short wavelengths (blue), M cones to medium wavelengths (green), and L cones to long wavelengths (red).

An N-chromat: an animal or person with N types of cone cells.


Mantis shrimp have 16 types of cone cells and can see roughly 10 times as many colors as humans, more than any other animal on Earth. They can see ultraviolet, infrared, and even polarized light.

Colors and Models
RGB model

An additive model built from the primary colors of light; it is the model that matches the human visual system most closely.


HSV model

HSV is an abbreviation for Hue, Saturation, and Value. This model describes colors through these three characteristics.

HSV (Hue Saturation Value), HSI (Hue Saturation Intensity), and HSL (Hue Saturation Lightness) are the three most common cylindrical coordinate representations of points in the RGB color model.


Lab model

Lab is a device-independent color model based on the physiological characteristics of human vision.

The L component of the Lab color space represents the brightness of a pixel, with a range of [0, 100] running from pure black to pure white; a represents the green-to-red axis and b the blue-to-yellow axis, each with a range of [-128, 127].

RGB cannot be converted to Lab directly: the RGB color space is first converted to the XYZ color space, and XYZ is then converted to Lab.


YUV model

The YUV model is the color encoding used by television signal systems. Its three components are the brightness of the pixel (Y), the difference between the blue component and the brightness (U), and the difference between the red component and the brightness (V).

Black-and-white video carries only the Y (luma, luminance) signal, i.e. the grayscale value. When color television standards were defined, color images were encoded in the YUV/YIQ format, with U and V treated together as C (chrominance, or chroma). If the C signal is ignored, the remaining Y (luma) signal is the same as the old black-and-white television signal, which solved the compatibility problem between color and black-and-white television. The biggest advantage of Y'UV is that it occupies very little extra bandwidth.


Gray model

The GRAY model is not a color model but a grayscale-image model; its name is simply the English word "gray" written in capitals.

A grayscale image has only a single channel. Its gray values run from black to white over a range set by the image's bit depth: in 8UC1 format, for example, they are quantized into 256 levels represented by 0-255, where 255 means white.

Gray = R* 0.299 + G * 0.587 + B * 0.114

This is in fact the formula used to compute Y in the YUV model.

CMYK model

Four-color process printing (CMYK) is the color mode used in color printing. It relies on mixing the three subtractive primary colors of colorants, plus black ink: the four colors are mixed and overlaid to produce so-called full-color printing.

C:Cyan= cyan, often mistakenly called "sky blue" or "blue"

M: Magenta = magenta, also known as "magenta"

Y: Yellow = yellow

K: blacK = black (K is used to avoid confusion with the B in RGB)


Other professional terms:

Colored light: color produced by the light spectrum; in practice this is what the RGB model describes.

Colorant: the pigments used for painting and printing, generally described by the CMYK model.

Tone: lightness and chroma combined. For a color such as pink, for example, red is the hue, while words like "vivid", "light", and "pale" describe the tone.

Detailed image format

Picture format comparison


Common image format: BMP

BMP (Windows Bitmap) can store a single raster image at any color depth, from black-and-white up to 24-bit color. The Windows bitmap file format is compatible with other Microsoft Windows programs, but it does not support file compression and is not suitable for web pages. Overall, the format's disadvantages outweigh its advantages; for quality photographic images, use PNG, JPEG, or TIFF files. BMP files are suitable for Windows wallpapers.

Pros: BMP supports 1-bit to 24-bit color depth.

The BMP format is widely compatible with existing Windows programs, especially older ones

Disadvantages: BMP does not support compression, which will result in very large files.

Common image format: JPEG (.jpg, .jpeg)

JPEG is a lossy compression format that can store an image in a small amount of space. Repeated or unimportant image data is discarded, so the image data is easily degraded; in particular, an excessively high compression ratio will significantly reduce the quality of the image recovered after decompression. If you want high-quality images, do not use an excessively high compression ratio.

Advantages:

Photographic or realistic works support advanced compression.

File size can be controlled with a variable compression ratio.

Interlacing is supported (corresponding to progressive JPEG files).

JPEG is widely supported as an Internet standard.

Disadvantages:

Lossy compression degrades the quality of the original image data.

Every time you edit and resave a JPEG file, the original image data degrades further; this loss is cumulative.

JPEG is not suitable for simpler pictures that contain few colors, have large areas of similar color, or have significant differences in brightness.

Common image format: PNG

Portable Network Graphics (PNG) is one of the newer image file formats accepted on the Internet. PNG provides lossless compression with files up to 30% smaller than GIF, plus 24-bit and 48-bit true-color support and many other technical features.

Advantages:

PNG supports a high level of lossless compression.

PNG supports alpha channel transparency.

PNG supports gamma correction.

PNG supports interlacing.

PNG is supported by the latest web browsers.

Disadvantages:

Older browsers and programs may not support PNG.

As an Internet file format, PNG compresses less than JPEG's lossy compression.

As an Internet file format, PNG provides no support for multi-image or animation files, which the GIF format does.

Mat class and basic functions

Open source framework learning method and overall analysis of Mat class

How to learn an open source framework

Overall:

(1) What is the role of the open source framework?

1. In what areas can it be applied?
2. What is its overall structure?

(2) How to use the open source framework

1. Public examples of how the framework is used
2. Taking OpenCV as an example: what classes are there, and what methods do they have?

(3) Understand the design logic of the open source framework

1. Taking OpenCV as an example: what modules are there, what is each module's function, and how are the modules connected?
2. Read the source code and analyze it

Details (taking the Mat class as an example)

(1) Analysis from the module perspective

Mat -> matrix

[1,1]

Container -> a container for storing data

    Stores image information

    Images: binary images, color images, data maps

(2) Analysis from the language perspective

Class:

    Attributes: rows, columns, data, dimensions; the type of the image

    Methods: matrix operations; getting/setting attributes; settings for different kinds of images; constructors/destructors, etc.

    Memory management approaches:

    (1) No memory management: left entirely to the system

    (2) Manual management: relies entirely on code the programmer writes

    (3) GC (garbage collection): reclaims unneeded memory periodically or at intervals

    (4) Reference counting (RC): a new reference increments the count and a disappearing reference decrements it; when the count drops to 0 the memory is reclaimed. This is the approach OpenCV objects take.

    (5) Memory pool management: memory is drawn from pre-allocated pools; generally used as an auxiliary or overall management method.

How to read the source code

(1) For C

Functions are C's unit of modularity; the key is to analyze how each function is implemented.

(2) For C++

Modules, then classes:

    attributes (member variables)

    methods (member functions)

Analyze (1) the attributes, (2) the methods, and (3) the memory management approach.

Memory and methods of Mat class

#include <opencv2/opencv.hpp>
using namespace cv;

#include <iostream>
using namespace std;

void printMat(Mat &mat) {
    cout << "Mat:" << mat << endl;
    cout << "flags:" << mat.flags << endl;
    cout << "dims:" << mat.dims << endl;
    cout << "rows:" << mat.rows << " cols:" << mat.cols << endl;
    if (mat.data)
    {
        cout << "data: present" << endl;
    }

    cout << "Storage info:" << endl;
    cout << "umatdata:" << mat.u << endl;
    // mat.u is null for an empty Mat, so check it before dereferencing
    if (mat.u && mat.u->refcount) {
        cout << "refcount:" << mat.u->refcount << endl;
    }
}

int main(int argc, char const *argv[])
{
    // Creates only the matrix header; no actual matrix data is allocated
    Mat m;

    // CV_8UC1: 8 = bits per element, U = unsigned, C = channel, 1 = one channel
    Mat m_size(10, 10, CV_8UC1);
    // zeros() is static and returns a new matrix; assign the result to clear the data
    m_size = Mat::zeros(10, 10, CV_8UC1);
    imshow("test", m_size);
    printMat(m_size);

    m = m_size;
    printMat(m);

    Mat m1 = m;
    printMat(m1);

    // Both assignments above are shallow copies: they only increment the
    // reference count (refcount) and allocate no new memory
    Mat mat_clone = m.clone();
    Mat mat_copy_to_mat;
    m.copyTo(mat_copy_to_mat);

    printMat(mat_clone);
    printMat(mat_copy_to_mat);
    // clone() and copyTo() are deep copies: new memory is allocated and the
    // original reference count is not incremented


    Mat scalar_mat(100, 100, CV_8UC3, Scalar(128, 255, 0));
    imshow("scalar", scalar_mat);

    // Get the data on the diagonal
    cout << scalar_mat.diag(0) << endl;

    // create() allocates and initializes the data; any data the object held
    // before create() is released first
    Mat un_create;
    printMat(un_create);

    un_create.create(10, 10, CV_8UC1);
    printMat(un_create);

    //waitKey(0);
    // Purpose of Mat:
    // (1) mathematical matrix operations
    // (2) storing image data and performing the related operations
    return 0;
}

Basic methods of image processing

#include <opencv2/opencv.hpp>
using namespace cv;
#include <iostream>
using namespace std;

int trackbarvalue;
Mat image;


void trackbarcallback(int value, void *data) {
    cout << value << endl;

    // Scale the brightness of the original image by value/255
    Mat adjusted;
    image.convertTo(adjusted, -1, value / 255.0, 0);

    imshow("window_name", adjusted);
}

void mouseEventCallBack(int event, int x, int y, int flag, void *userdata) {
    cout << event << endl;
    cout << x << ":" << y << endl;
}

int main(int argc, char const *argv[])
{
    /*
        Image I/O functions
    */
    // (1) absolute or relative path of the image
    // (2) the mode used to read the image into the Mat container
    Mat srcImage = imread("../spand.jpg", IMREAD_GRAYSCALE);
    image = srcImage;
    // WINDOW_AUTOSIZE may prevent resizing the window in some environments;
    // WINDOW_NORMAL allows it
    namedWindow("window_name", WINDOW_NORMAL);

    // Add a trackbar; note the callback function
    createTrackbar("trackbar", "window_name", &trackbarvalue, 255, trackbarcallback);

    // Mouse events
    setMouseCallback("window_name", mouseEventCallBack, (void *)&srcImage);

    // (1) name of the window to display in (2) the image container
    imshow("window_name", srcImage);

    /*
        1. the file name to save under; the extension is required
        2. the source image container
        3. encoding options used while saving, e.g. compression
    */
    vector<int> compression;
    compression.push_back(IMWRITE_PNG_COMPRESSION);
    compression.push_back(9);
    imwrite("gray_logo1.png", srcImage, compression);

    // Keyboard handling: wait for any key; the parameter is the delay in ms (0 = forever)
    waitKey(0);
    return 0;
}

Drawing methods

#include <opencv2/opencv.hpp>
using namespace cv;
#include <iostream>
using namespace std;

int main(int argc, char const *argv[])
{
    /*
        1. line
        2. rectangle
        3. circle
        4. ellipse
        5. polygon (poly)
    */

    /*
        1. Point: a point, x, y
        2. Size: a size, width, height
        3. Rect: a rectangle, x, y, width, height
        4. Scalar: a color
    */

    // Start from a black 3-channel canvas (rows = 600, cols = 400)
    Mat m = Mat::zeros(600, 400, CV_8UC3);

    /*
        Scalar: the color object, filled with the desired color
        thickness: line width; for closed shapes, -1 fills the interior
        lineType: the type of line (LINE_4, LINE_8, LINE_AA)
    */
    line(m, Point(100, 100), Point(300, 500), Scalar(0, 0, 255), 5, LINE_8);

    rectangle(m, Point(100, 100), Point(300, 500), Scalar(0, 255, 0), -1, LINE_4);

    circle(m, Point(100, 100), 50, Scalar(255, 0, 0), -1, LINE_4);

    /*
        cvtColor: convert between color spaces; here BGR -> BGRA adds an alpha channel
    */
    cvtColor(m, m, COLOR_BGR2BGRA);

    /*
        For a single-channel image, at<> returns a single value, e.g. m.at<uchar>(y, x);
        for a multi-channel image, use a Vec type matching the channel count
    */
    Vec4b four_channel = m.at<Vec4b>(100, 100);
    cout << four_channel << endl;

    imshow("result", m);

    waitKey(0);
    return 0;
}

Image processing techniques

Image Grayscale Transformation Technology

There are two main applications of grayscale transformation technology

(1) Extract key information for image contour processing

(2) Optimize the details to improve the effect of the image

Common Grayscale Transformation Techniques

(1) Thresholding processing

(2) Histogram information processing

(3) Grayscale transformation functions

    Linear transformation

    Logarithmic transformation

    Gamma correction

    Combined use (contrast stretching, gray-level slicing, bit-plane slicing)

(4) Distance transformation

Thresholding

Thresholding is the binarization of an image and the simplest method of image segmentation. Binarization converts a grayscale image into a binary image: pixels whose gray level exceeds a chosen critical value are set to the maximum gray level, and pixels below it are set to the minimum, thereby achieving binarization.

Commonly used thresholding:

OTSU Thresholding

fixed thresholding

adaptive thresholding

double thresholding

half-thresholding

OTSU Thresholding

The OTSU algorithm, also called the maximum between-class variance method, was proposed by Nobuyuki Otsu in 1979 and is considered one of the best threshold-selection algorithms in image segmentation.

Its basic idea is to use a threshold to divide the pixels of an image into two classes: those whose gray level is below the threshold and those whose gray level is greater than or equal to it. The threshold thus splits the image into foreground and background, and extracting the foreground generally yields the outline we want.

#include <opencv2/opencv.hpp>
using namespace cv;

#include <iostream>
using namespace std;

// Compute the OTSU threshold
int my_otsu(Mat inputImg) {
    // Initialization
    int rows = inputImg.rows;
    int cols = inputImg.cols;
    int sumPixel[256] = {0};
    float proDis[256] = {0};
    int result_threshold = 0;
    // Build the gray-level histogram: count how often each gray value occurs.
    // E.g. if the gray value at (2,3) is 100 and the gray value at (20,20)
    // is also 100, then 100 is counted twice.
    for (int i = 0; i < rows; i++) {
        for (int j = 0; j < cols; j++) {
            sumPixel[(int)inputImg.at<uchar>(i, j)]++;
        }
    }
    // Compute the probability distribution: the fraction of all pixels
    // that take each gray value
    for (int i = 0; i < 256; i++) {
        proDis[i] = sumPixel[i] / (float)(rows * cols);
    }
    // Find the threshold that maximizes the between-class variance
    float all_left, all_right, avg_left, avg_right, temp_left, temp_right, temp_delta;
    float max_delta = 0.0;
    for (int i = 0; i < 255; i++) {
        all_left = all_right = avg_left = avg_right = temp_left = temp_right = temp_delta = 0;
        for (int j = 0; j < 256; j++) { // split all gray levels into a left and a right part
            if (j <= i) {
                all_left += proDis[j];
                temp_left += j * proDis[j];
            } else {
                all_right += proDis[j];
                temp_right += j * proDis[j];
            }
        }
        // Compute the mean gray level of each part from the accumulated weights
        avg_left = temp_left / all_left;
        avg_right = temp_right / all_right;
        // Between-class variance
        temp_delta = (float)(all_left * all_right * pow((avg_left - avg_right), 2));
        if (temp_delta > max_delta) {
            max_delta = temp_delta;
            result_threshold = i;
        }
    }

    return result_threshold;
}

int main(int argc, char const *argv[])
{
    // Read the image
    Mat srcImg = imread("A:/OPENC/threshold/R-C.jpg");
    // Convert to grayscale (imread loads images as BGR)
    Mat grayImg;
    cvtColor(srcImg, grayImg, COLOR_BGR2GRAY);
    // Compute the threshold
    int otsu = my_otsu(grayImg);
    cout << otsu << endl;
    // Binarize using the threshold
    Mat result = grayImg.clone();

    // OTSU thresholding
    for (int i = 0; i < grayImg.rows; i++) {
        for (int j = 0; j < grayImg.cols; j++) {
            if (grayImg.at<uchar>(i, j) >= otsu) {
                result.at<uchar>(i, j) = 255;
            } else {
                result.at<uchar>(i, j) = 0;
            }
        }
    }
    imshow("result", result);

    waitKey(0);
    return 0;
}

fixed thresholding

Threshold calculation function and method

#include <opencv2/opencv.hpp>
using namespace cv;

#include <iostream>
using namespace std;
int main(int argc, char const *argv[])
{
    Mat srcImg = imread("A:/OPENC/threshold/R-C.jpg", IMREAD_GRAYSCALE);
    // Apply the given threshold according to the chosen type
    Mat resultImg;
    threshold(srcImg, resultImg, 138, 255, THRESH_BINARY);
    // THRESH_BINARY     = 0  binarize: values above the threshold become white
    // THRESH_BINARY_INV = 1  inverse binarize: the opposite of the above
    // THRESH_TRUNC      = 2  values above the threshold are clipped to the threshold
    // THRESH_TOZERO     = 3  values below the threshold become 0; values above are kept
    // THRESH_TOZERO_INV = 4  values below the threshold are kept; values above become 0
    // THRESH_MASK       = 7  generally used for masking or extracting information
    // THRESH_OTSU       = 8  flag: compute the threshold with the OTSU algorithm
    // THRESH_TRIANGLE   = 16 flag: compute the threshold with the triangle algorithm
    imshow("src", srcImg);
    imshow("result", resultImg);
    waitKey(0);

    return 0;
}

adaptive thresholding
Mat adapImg;
    /*
        ADAPTIVE_THRESH_MEAN_C: threshold is the mean of the neighborhood
        ADAPTIVE_THRESH_GAUSSIAN_C: threshold is the Gaussian-weighted mean of the neighborhood
    */
    adaptiveThreshold(srcImg, adapImg, 255, ADAPTIVE_THRESH_MEAN_C, THRESH_BINARY, 5, 2);
double thresholding
half-thresholding
/*
    Fixed thresholding: global thresholding, suited to images with a clear bright/dark separation
    Adaptive thresholding: for images without a clear foreground/background separation; a way to extract contours
    Double thresholding: two passes with a low and a high threshold to find the key information in an image
    Half thresholding: mainly used to extract information with obvious features, such as text, from an image
*/

Histogram information processing

In statistics, a histogram is a graphical representation of the distribution of data: a two-dimensional chart whose axes are the statistical samples and a measured attribute of those samples. Because the lengths of its bars express quantitative change well, a histogram makes even small differences between values easy to read.

The histogram is a diagram that can be used to understand the grayscale distribution of the entire image as a whole. Through the histogram, we can have an intuitive understanding of the contrast, brightness, and grayscale distribution of the image.

Displaying an image's pixel distribution as a histogram:

#include <opencv2/opencv.hpp>
using namespace cv;
#include <iostream>
using namespace std;

int main(int argc, char const *argv[])
{
    // Grayscale histogram; H-S and RGB histograms are also possible
    Mat srcImg = imread("A:/OPENC/histogram/zhanyangsong.jpg");

    Mat grayImg;
    cvtColor(srcImg, grayImg, COLOR_BGR2GRAY);
    // Compute the histogram
    /*
        CV_EXPORTS void calcHist( const Mat* images, int nimages,
                          const int* channels, InputArray mask,
                          OutputArray hist, int dims, const int* histSize,
                          const float** ranges, bool uniform = true, bool accumulate = false );
        images     input image array: the images must share the same size and depth
        nimages    number of images
        channels   the channels used to compute the histogram
        mask       optional mask, usually empty; must match the input size
        hist       the output histogram
        dims       dimensionality of the histogram
        histSize   number of bins in each dimension
        ranges     the value range counted by the histogram
        uniform    whether the bins are uniform
        accumulate whether to accumulate into an existing histogram (off by default)
    */
    int channels[1] = {0};
    Mat hist;
    int histSize[1] = {256};
    float hrange[2] = {0, 256}; // the upper bound is exclusive
    const float *ranges[1] = {hrange};
    calcHist(&grayImg, 1, channels, Mat(), hist, 1, histSize, ranges);

    // Draw the histogram
    Mat histOutputImg(256, 256, CV_8U, Scalar(255));
    double maxValue;
    double minValue;
    minMaxLoc(hist, &minValue, &maxValue);
    int hpt = 0.9 * 256; // the tallest bar uses 90% of the image height
    for (int i = 0; i < 256; i++) {
        float binVal = hist.at<float>(i);
        int temp = (binVal * hpt / maxValue);
        line(histOutputImg, Point(i, 256), Point(i, 256 - temp), Scalar::all(0));
    }

    imshow("srcImg", srcImg);
    imshow("gray", grayImg);
    imshow("result", histOutputImg);
    waitKey(0);
    return 0;
}

Common histogram operations:

Histogram equalization

Histogram matching

Histogram comparison

Histogram lookup

Histogram accumulation

#include <opencv2/opencv.hpp>
using namespace cv;
#include <iostream>
#include <vector>
using namespace std;

// Render a 1-D histogram into a 256x256 image
Mat histOutputImg(Mat hist) {
    Mat histImg(256, 256, CV_8U, Scalar(255));
    double maxValue;
    double minValue;
    minMaxLoc(hist, &minValue, &maxValue);
    int hpt = 0.9 * 256;
    for (int i = 0; i < 256; i++) {
        float binVal = hist.at<float>(i);
        int temp = (binVal * hpt / maxValue);
        line(histImg, Point(i, 256), Point(i, 256 - temp), Scalar::all(0));
    }
    return histImg;
}



int main(int argc, char const *argv[])
{
    Mat srcImg = imread("A:/OPENC/histogram/1.jpg");

    Mat grayImg;
    cvtColor(srcImg, grayImg, COLOR_BGR2GRAY);
    // Compute the histogram (see the calcHist parameter notes in the previous example)
    int channels[1] = {0};
    Mat hist;
    int histSize[1] = {256};
    float hrange[2] = {0, 256};
    const float *ranges[1] = {hrange};
    calcHist(&grayImg, 1, channels, Mat(), hist, 1, histSize, ranges);


    // Histogram equalization: expose the details of over-bright or over-dark images
    Mat equalizeOutImg;
    equalizeHist(grayImg, equalizeOutImg);
    Mat outHist;
    calcHist(&equalizeOutImg, 1, channels, Mat(), outHist, 1, histSize, ranges);
    // Equalizing a color image: equalize each channel separately, then merge
    Mat colorImg;
    vector<Mat> bgr_channels;
    split(srcImg, bgr_channels);
    for (unsigned long i = 0; i < bgr_channels.size(); i++) {
        equalizeHist(bgr_channels[i], bgr_channels[i]);
    }
    merge(bgr_channels, colorImg);

    // Histogram matching: make one image's gray distribution match another's
    Mat newSrcImg = imread("A:/OPENC/histogram/zhanyangsong.jpg");
    Mat newGrayImg;
    cvtColor(newSrcImg, newGrayImg, COLOR_BGR2GRAY);
    Mat newHist;
    calcHist(&newGrayImg, 1, channels, Mat(), newHist, 1, histSize, ranges);
    // Compute the cumulative probability of each image (normalize by pixel count
    // so images of different sizes are comparable)
    float histOld[256] = {hist.at<float>(0) / (float)grayImg.total()};
    float histNew[256] = {newHist.at<float>(0) / (float)newGrayImg.total()};
    for (int i = 1; i < 256; i++) {
        histOld[i] = histOld[i - 1] + hist.at<float>(i) / (float)grayImg.total();
        histNew[i] = histNew[i - 1] + newHist.at<float>(i) / (float)newGrayImg.total();
    }
    // Build the matrix of absolute differences between cumulative probabilities
    float diff[256][256];
    for (int i = 0; i < 256; i++) {
        for (int j = 0; j < 256; j++) {
            diff[i][j] = fabs(histOld[i] - histNew[j]);
        }
    }
    // Build the LUT (lookup table): map each gray level to its closest match
    Mat Lut(1, 256, CV_8U);
    for (int i = 0; i < 256; i++) {
        float min = diff[i][0];
        int index = 0;
        for (int j = 0; j < 256; j++) {
            if (min > diff[i][j]) {
                min = diff[i][j];
                index = j;
            }
        }

        Lut.at<uchar>(i) = (uchar)index;
    }
    Mat resultOutImg, histOut;
    LUT(grayImg, Lut, resultOutImg);
    calcHist(&resultOutImg, 1, channels, Mat(), histOut, 1, histSize, ranges);

    // Histogram comparison with each of the available methods
    for (int i = 0; i < 6; i++) {
        cout << compareHist(hist, newHist, i) << endl;
    }

    imshow("srcImg", srcImg);
    imshow("equalizedImg", equalizeOutImg);
    imshow("equalizedHist", histOutputImg(outHist));
    imshow("matchedImg", resultOutImg);
    imshow("matchedHist", histOutputImg(histOut));
    waitKey(0);
    return 0;
}

Grayscale transformation functions and their applications

linear transformation

By applying a linear transformation to each pixel value, the brightness and contrast of the whole image can be adjusted. Let r be the gray level before the transformation and s the gray level after it; the linear transformation is:

s = ar + b

where a is the slope of the line and b is its intercept on the y-axis. Different choices of a and b give different effects:

a > 1: increases the contrast of the image

0 < a < 1: decreases the contrast of the image

a = 1, b != 0: brightens or darkens the whole image

a < 0, b = 0: bright areas of the image become dark and dark areas become bright

a = -1, b = 255: inverts the brightness of the image

Hands-on project (the project files are in the attached resources)


Origin blog.csdn.net/qq_58360406/article/details/129386236