Flow-based Generative Model

When I recently read a keypoint-detection paper, I found that a flow-based generative network can be used to model the true underlying error distribution, which enriches the information available to regression-based methods and greatly improves their AP. To really understand that paper, I studied the Flow-based Generative Model in detail and recorded my notes here. If anything is wrong, please let me know.

Generator

First of all, we need to know that the flow-based generative model is essentially a generator. A generator $G$ defines a probability density function $P_G$, and all we have to do is make $P_G$ as close as possible to $P_{data}$, the true sample distribution. How do we get close? Here you need to understand the concept of maximum likelihood estimation.

Roughly, maximum likelihood estimation means: we know the form of a probability density function but cannot determine its internal parameters; however, we can run a large number of experiments, substitute the observed results into the function, and find the parameters under which those observations are most probable, thereby pinning down the probability density function.
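Written out (my notation, since the original figure is lost), with $\theta$ the unknown parameters and $x^1, \dots, x^m$ the observed samples:

$$\theta^{*} = \arg\max_{\theta} \prod_{i=1}^{m} P_G(x^{i}; \theta) = \arg\max_{\theta} \sum_{i=1}^{m} \log P_G(x^{i}; \theta)$$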

Here $P_G$ is a probability density function with undetermined parameters. We draw samples $x^i$ from the true sample distribution $P_{data}$, substitute them into $P_G(x^i)$, and maximize $\sum_i \log P_G(x^i)$; through continued iteration this brings $P_G$ closer and closer to $P_{data}$. That is roughly how it works. The following introduces the mathematical background needed for the flow-based generative model.
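As a tiny numerical illustration (my own sketch, not from the original post): for a Gaussian $P_G$, maximizing the summed log-likelihood has a closed-form answer, so we can check that the recovered parameters match the distribution that generated the samples.

```python
# Maximum likelihood estimation for a Gaussian (assumes NumPy).
import numpy as np

rng = np.random.default_rng(0)
samples = rng.normal(loc=2.0, scale=1.5, size=10_000)  # draws from "P_data"

# Maximizing sum_i log P_G(x^i; mu, sigma) for a Gaussian yields the
# sample mean and the (biased) sample standard deviation in closed form:
mu_hat = samples.mean()
sigma_hat = samples.std()
print(mu_hat, sigma_hat)  # ~2.0 and ~1.5, close to the true parameters
```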

Math Background

1. Jacobian matrix
First, you need to understand the Jacobian matrix. Given $x = f(z)$, where $f$ is a function of $z$, the Jacobian matrix of $f$ is $J_f$, the matrix of partial derivatives $\partial x / \partial z$.
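For the two-dimensional case this is the standard definition (the original figure is lost, so this is my rendering), together with the fact that the Jacobian of the inverse function is the matrix inverse:

$$J_f = \frac{\partial x}{\partial z} = \begin{bmatrix} \partial x_1/\partial z_1 & \partial x_1/\partial z_2 \\ \partial x_2/\partial z_1 & \partial x_2/\partial z_2 \end{bmatrix}, \qquad J_{f^{-1}} J_f = I$$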

2. Determinant
The physical meaning of the determinant: its absolute value is the area (in 2D) or volume (in 3D) of the parallelogram or parallelepiped spanned by the matrix's rows as vectors. One identity we will need below: $\det(A^{-1}) = 1/\det(A)$.
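A quick numerical check of this picture and of the inverse identity (a sketch assuming NumPy):

```python
# |det(A)| as the area of the parallelogram spanned by the rows of A.
import numpy as np

A = np.array([[3.0, 0.0],
              [1.0, 2.0]])                 # edge vectors (3,0) and (1,2)
print(np.linalg.det(A))                    # 6.0 = base 3 * height 2
print(np.linalg.det(np.linalg.inv(A)))     # ~0.1667 = 1/det(A)
```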

We know two probability density functions $\pi(z)$ and $p(x)$, linked by the change of variable $x = f(z)$. Since $f$ only relabels points, the probability mass over corresponding regions must be equal: the integral of $\pi$ over a small region $\Delta z$ equals the integral of $p$ over its image $\Delta x$ (in the original figure, the mass over the purple square equals the mass over the rhombus on the right, even though the areas do not look equal). Because $\Delta z$ and $\Delta x$ are small enough, each integral can be replaced by the local density times the region's area (volume, in higher dimensions), and the area of the rhombus equals the area of the square times the absolute value of the determinant of the Jacobian of $f$. This gives the equation that was in the figure:

$$\pi(z) = p(x)\,\left|\det(J_f)\right|$$
Through a series of equivalent manipulations of the determinant (using $\det(A^{-1}) = 1/\det(A)$ from above), we finally obtain the equation that was in the yellow box: given $x = f(z)$, the densities $\pi(z)$ and $p(x)$ map to each other through the absolute value of the determinant of the Jacobian of $f$:

$$p(x) = \pi\!\big(f^{-1}(x)\big)\,\left|\det(J_{f^{-1}})\right| = \pi\!\big(f^{-1}(x)\big)\,\frac{1}{\left|\det(J_f)\right|}$$
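To sanity-check the change-of-variables formula (my own example, assuming SciPy): take $\pi(z) = \mathcal{N}(0,1)$ and $x = f(z) = 2z + 1$, so $f^{-1}(x) = (x-1)/2$ and $J_{f^{-1}} = 1/2$; the exact law of $x$ is $\mathcal{N}(1, 2^2)$.

```python
# Checking p(x) = pi(f^{-1}(x)) * |det(J_{f^{-1}})| for a 1-D affine f.
from scipy.stats import norm

x = 0.7
p_formula = norm.pdf((x - 1.0) / 2.0) * abs(1.0 / 2.0)
p_exact = norm.pdf(x, loc=1.0, scale=2.0)
print(p_formula, p_exact)  # both ~0.1972; the formula matches the exact density
```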

Flow-based Generative Model

In the flow-based model, we use a neural network as $G$, and $x^i$ is a sample from the real distribution. When $G$ is invertible, we can compute the inverse $G^{-1}$ and its Jacobian determinant to evaluate

$$p_G(x^i) = \pi\!\big(G^{-1}(x^i)\big)\,\left|\det\big(J_{G^{-1}}(x^i)\big)\right|$$

and then maximize $\sum_i \log p_G(x^i)$. Therefore, we train the inverse $G^{-1}$ and use $G$ itself for inference (generation). Since a single invertible $G$ is quite restricted in form, in practice many such invertible blocks are composed in sequence, which is where the name "flow" comes from. The figure below is a demo example of a flow-based generative application.
[Figure: demo of a flow-based generative application]
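To make the training direction concrete, here is a minimal sketch (my own, assuming PyTorch; a single learnable 1-D affine bijection $x = G(z) = sz + b$ stands in for a real multi-layer flow): we map data back through $G^{-1}$, score it under the base density $\pi = \mathcal{N}(0,1)$, add the log-Jacobian term, and maximize the likelihood.

```python
# A minimal flow-style training sketch (assumes PyTorch; not from the post).
# G(z) = s*z + b, so G^{-1}(x) = (x - b)/s and |det J_{G^{-1}}| = 1/s.
import torch

torch.manual_seed(0)
data = 3.0 + 0.5 * torch.randn(1000)        # stand-in for samples from P_data

log_s = torch.zeros(1, requires_grad=True)  # parametrize s = exp(log_s) > 0
b = torch.zeros(1, requires_grad=True)
opt = torch.optim.Adam([log_s, b], lr=0.05)
base = torch.distributions.Normal(0.0, 1.0) # the base density pi(z)

for step in range(500):
    z = (data - b) / log_s.exp()            # G^{-1}(x)
    # log p_G(x) = log pi(G^{-1}(x)) + log |det J_{G^{-1}}| = log pi(z) - log s
    log_px = base.log_prob(z) - log_s
    loss = -log_px.mean()                   # maximizing likelihood = minimizing NLL
    opt.zero_grad()
    loss.backward()
    opt.step()

print(log_s.exp().item(), b.item())         # approach 0.5 and 3.0, the data's law
```

After training, generation runs $G$ forward: `x = log_s.exp() * torch.randn(100) + b`, i.e. sample $z$ from the base distribution and push it through $G$.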

Origin blog.csdn.net/litt1e/article/details/126498474