This note consolidates the content of Chapter 5 of the PaddlePaddle course slides (PPT).
- Outline
1.1 The goal of image recognition
1.2 Challenges in image recognition
- Semantic gap (Semantic Gap): the gap between an image's low-level visual features and its high-level semantic concepts
1.3 Basic framework of image recognition
- Feature space, measurement space, category space
- Traditional image recognition techniques
2.1 Early image recognition techniques (1990-2003)
2.1.1 Feature Extraction
- Global feature extraction: represent an image by global statistics of its low-level visual features
- Images represented as vectors: the original image is mapped into a vector space to obtain a vector representation
- Examples of global features: color features, texture features, shape features
- Feature transformation: improves the representational power of features
- Manifold learning (Manifold Learning): maps high-dimensional data to vector representations in a low-dimensional space
- Simple feature transforms: centering, normalization, decorrelation, whitening
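The simple feature transforms above can be sketched in NumPy. This is a minimal illustration (the function name and PCA-based whitening recipe are my own, not from the course slides):

```python
import numpy as np

def simple_feature_transforms(X):
    """Apply simple transforms to a feature matrix X of shape (n_samples, n_features)."""
    # Centering: subtract the per-dimension mean.
    Xc = X - X.mean(axis=0)
    # Normalization: scale each sample to unit L2 norm.
    Xn = Xc / (np.linalg.norm(Xc, axis=1, keepdims=True) + 1e-12)
    # Decorrelation + whitening via PCA: rotate onto the eigenbasis of the
    # covariance matrix, then rescale each axis to unit variance.
    cov = np.cov(Xc, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    Xw = (Xc @ eigvecs) / np.sqrt(eigvals + 1e-8)
    return Xn, Xw
```

After whitening, the sample covariance of `Xw` is (approximately) the identity matrix, i.e. the dimensions are decorrelated and have unit variance.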
2.1.2 Indexing
2.1.3 Relevance Feedback
2.1.4 Reranking
2.2 Mid-period image recognition techniques (2003-2012)
2.2.1 Feature Extraction
- Local features (Local Feature): vectors describing image patches (Patch)
- Feature detector (Feature Detector): detects the center positions of image patches (interest points)
- Feature descriptor (Feature Descriptor): describes the visual content of a patch
- Local detectors: Harris, DoG, SURF, Harris-Affine, Hessian-Affine, MSER
- Local descriptors: SIFT, PCA-SIFT, GLOH, Shape Context, ORB, COGE
2.2.2 Vectorization
- Converting local features into visual words (feature quantization, Feature Quantization): find the nearest visual word for each local feature vector, turning the image into a set of visual-word IDs
- Common feature quantization techniques: hierarchical 1-NN, KD-tree
- Representing an image with visual words: local image features → visual words → visual-word histogram (bag of visual words)
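The quantization-and-histogram pipeline above can be sketched as follows. This is a hedged, brute-force 1-NN illustration assuming NumPy (a real system would use the hierarchical or KD-tree methods mentioned above):

```python
import numpy as np

def bag_of_visual_words(descriptors, codebook):
    """Quantize local descriptors (n, d) to their nearest visual word in the
    codebook (k, d), then return the normalized k-bin word histogram."""
    # 1-NN assignment: squared Euclidean distance from each descriptor to each codeword.
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    words = d2.argmin(axis=1)  # visual-word ID for each local feature
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / hist.sum()   # histogram as a distribution over visual words
```

The resulting histogram is the image's vector representation, directly analogous to a bag-of-words document vector in text retrieval.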
2.2.3 Indexing
- Inverted index
- Ranking: TF-IDF weighting (Term Frequency-Inverse Document Frequency)
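A minimal sketch of an inverted index with TF-IDF weights, treating visual-word IDs like text terms (illustrative names and a simplified TF-IDF formula, not the course's code):

```python
import math
from collections import defaultdict

def build_inverted_index(docs):
    """Build an inverted index with TF-IDF weights.
    docs: {doc_id: list of visual-word IDs}.
    Returns {word: {doc_id: tf-idf weight}} (the posting lists)."""
    n_docs = len(docs)
    df = defaultdict(int)              # document frequency of each word
    for words in docs.values():
        for w in set(words):
            df[w] += 1
    index = defaultdict(dict)          # word -> posting list
    for doc_id, words in docs.items():
        for w in set(words):
            tf = words.count(w) / len(words)   # term frequency in this document
            idf = math.log(n_docs / df[w])     # inverse document frequency
            index[w][doc_id] = tf * idf
    return index
```

A query only touches the posting lists of its own words, which is what makes retrieval over large image collections fast; a word that appears in every document gets weight zero.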
2.2.4 Post-processing
- Query expansion (Query Expansion): expand the original local features so the query contains more query terms, then search with the expanded query
- Other post-processing techniques: local geometric verification (Local Geometric Verification), product quantization (Product Quantization)
- Deep learning and image recognition
3.1 The development of deep learning
- Applications of deep learning in the image domain: image search, abnormal tumor detection, image captioning, image colorization
3.2 Why use deep learning
- The human brain's visual mechanism: 1) the sensing stage (information collection); 2) the visual perception stage (information cognition)
- Neurons → visual center → brain: take in the raw signal (pixels) → preliminary processing (edge directions) → abstraction (shapes) → further abstraction (specific objects)
3.3 How to use deep learning
3.3.1 How deep learning solves image recognition
- The goal of machine learning (deep learning): find a suitable function
3.3.2 Steps: define the model (human), define the loss function (human), learn the parameters (machine)
3.3.3 Model
- Common activation functions: Sigmoid, TanH, ArcTan, ReLU, PReLU
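For reference, the activation functions listed above can be written out in NumPy (a minimal sketch; the default PReLU slope of 0.25 is my assumption, since in practice it is a learned parameter):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))   # squashes to (0, 1)

def tanh(x):
    return np.tanh(x)                 # squashes to (-1, 1)

def arctan(x):
    return np.arctan(x)               # squashes to (-pi/2, pi/2)

def relu(x):
    return np.maximum(0.0, x)         # zero for negative inputs

def prelu(x, a=0.25):
    return np.where(x > 0, x, a * x)  # learnable slope `a` for negative inputs
```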
- Feedforward neural network: input layer → hidden layers → output layer
- Example models: AlexNet, VGG, GoogleNet, Residual Net
- Output layer: softmax as the output-layer activation function; its outputs are easy to interpret and cheap to compute
- Choosing a suitable network structure: number of layers, number of nodes per layer, activation functions
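The forward pass of such a feedforward network with a softmax output layer can be sketched like this (illustrative names and a one-hidden-layer ReLU structure chosen by me, not prescribed by the slides):

```python
import numpy as np

def softmax(z):
    z = z - z.max()        # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()     # a valid probability distribution over classes

def forward(x, params):
    """Input layer -> hidden layer (ReLU) -> output layer (softmax)."""
    W1, b1, W2, b2 = params
    h = np.maximum(0.0, W1 @ x + b1)   # hidden layer activations
    return softmax(W2 @ h + b2)        # class probabilities
```

The softmax output sums to one, which is why it is easy to interpret as a distribution over the candidate categories.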
3.3.4 Loss function
- Common loss functions: quadratic loss function, cross-entropy loss function
- Total loss: the sum of the per-sample losses over all training samples
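The two common loss functions and the total loss can be sketched as (function names are mine; the small epsilon in the cross-entropy is a numerical-stability assumption):

```python
import numpy as np

def quadratic_loss(y_pred, y_true):
    """Quadratic (squared-error) loss for one sample."""
    return 0.5 * np.sum((y_pred - y_true) ** 2)

def cross_entropy_loss(y_prob, y_true):
    """Cross-entropy loss for one sample: y_true one-hot, y_prob a distribution."""
    return -np.sum(y_true * np.log(y_prob + 1e-12))

def total_loss(loss_fn, preds, targets):
    """Total loss: sum of the per-sample loss over the whole training set."""
    return sum(loss_fn(p, t) for p, t in zip(preds, targets))
```

Parameter learning then amounts to choosing the parameters that minimize this total loss.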
3.3.5 Parameter learning
- Gradient descent: iteratively update the parameters in the direction of the negative gradient of the loss
- Backpropagation: computes the gradients via the chain rule
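Gradient descent with a chain-rule gradient can be shown on the simplest possible case, a linear model under the quadratic loss (a hedged toy example; real networks apply the same chain rule layer by layer via backpropagation):

```python
import numpy as np

def train_linear(X, y, lr=0.1, steps=200):
    """Gradient descent on the quadratic loss of a linear model y ~ X @ w.
    The gradient follows from the chain rule: dL/dw = X^T (X w - y) / n."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(steps):
        err = X @ w - y        # forward pass: prediction error
        grad = X.T @ err / n   # backward pass: chain rule through the loss
        w -= lr * grad         # step opposite the gradient direction
    return w
```

Each update moves the parameters a small step downhill on the loss surface; with a suitable learning rate the iterates converge toward the minimizer.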
- Course practice
- Face recognition