Bulldozer distance (Wasserstein distance) and several other commonly used measures of the difference between distributions (mark)



1. Wasserstein distance

1.1 Method introduction
The Wasserstein distance is also known as the Earth Mover's Distance (EMD). It is defined as the minimum cost (the minimum average moving distance) required to transform distribution P into distribution Q. This is much like "tearing down the east wall to patch the west wall": the minimum work needed to turn one shape into another, like digging soil out of one place and using it to fill another. The W distance is the minimum amount of energy that has to be spent per unit of soil moved in this process, which is why the Wasserstein distance is often called the bulldozer distance.
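To make the soil-moving picture concrete, here is a minimal sketch for the simplest case: two histograms on the same evenly spaced 1-D bins, where the cost of moving soil is the distance between bins. In that case the minimum total work equals the sum of the absolute running surplus carried across each bin boundary. The function name `emd_1d` and the parameter `bin_width` are made up for this illustration and are not from the original post.

```python
from itertools import accumulate


def emd_1d(p, q, bin_width=1.0):
    """Earth mover's distance between two histograms on the same 1-D bins.

    Assumes both histograms hold the same total mass and that moving one
    unit of mass by one bin costs `bin_width`.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of bins")
    if abs(sum(p) - sum(q)) > 1e-9:
        raise ValueError("p and q must have the same total mass")
    # Running surplus of soil that has to be carried past each bin boundary.
    carried = accumulate(pi - qi for pi, qi in zip(p, q))
    return bin_width * sum(abs(c) for c in carried)


# All the soil sits in bin 0 and must end up in bin 3: three bins of travel.
p = [1.0, 0.0, 0.0, 0.0]
q = [0.0, 0.0, 0.0, 1.0]
print(emd_1d(p, q))  # 3.0

# Two half-piles each move two bins: 0.5 * 2 + 0.5 * 2 = 2.0
print(emd_1d([0.5, 0.5, 0.0, 0.0], [0.0, 0.0, 0.5, 0.5]))  # 2.0
```

For real data, a library routine such as `scipy.stats.wasserstein_distance` is probably the better choice. The sketch only shows where the "work" comes from.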

Which bulldozer is stronger?

1.2 Method advantages
Although KL divergence and JS divergence are more widely used, the Wasserstein distance has an advantage over both: even when the supports of the two distributions do not overlap, or overlap very little, it still reflects how far apart the two distributions are. In that case the JS divergence is a constant (log 2), and the KL divergence may be meaningless.
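The values of the KL and JS divergences change abruptly, jumping between their maximum and minimum, whereas the Wasserstein distance is smooth. If we want to optimize with gradient descent, the first two provide no useful gradient in this situation, while the Wasserstein distance still does.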
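The non-overlapping case can be checked on a toy example. The sketch below assumes two point masses on a small 1-D grid: one is fixed at bin 0 and the other is shifted further away. The helper names (`kl`, `js`, `w1`, `point_mass`) are made up for this illustration, and `w1` uses the simple 1-D running-surplus formula, not a general solver.

```python
import math
from itertools import accumulate


def kl(p, q):
    """KL(p || q) for discrete distributions; infinite when q lacks support."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue  # 0 * log(0 / q) is taken as 0
        if qi == 0.0:
            return math.inf  # p puts mass where q has none
        total += pi * math.log(pi / qi)
    return total


def js(p, q):
    """Jensen-Shannon divergence (natural log), built from two KL terms."""
    m = [(pi + qi) / 2.0 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def w1(p, q):
    """1-D Wasserstein distance on unit-spaced bins (running-surplus form)."""
    return sum(abs(c) for c in accumulate(pi - qi for pi, qi in zip(p, q)))


def point_mass(position, n_bins=8):
    """Distribution with all of its mass in a single bin."""
    return [1.0 if i == position else 0.0 for i in range(n_bins)]


p = point_mass(0)
for shift in (1, 3, 6):
    q = point_mass(shift)
    print(f"shift={shift}  KL={kl(p, q)}  JS={js(p, q):.4f}  W={w1(p, q)}")
```

In this setup KL is infinite and JS stays at log 2 (about 0.6931) for every shift, while W equals the shift itself. Only W tells the optimizer whether the two distributions are getting closer.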


Origin blog.csdn.net/u013537270/article/details/125915972