(1) Sample questions
Given two words, word1 and word2, please calculate the minimum number of operations used to convert word1 into word2.
You can perform the following three operations on a word:
• Insert a character
• Delete a character
• Replace a character
[Example]
Input: word1 = "benyam", word2 = "ephrem"
Output: 5
(2) Solving steps
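1. Determine the state
- Find the last step
- Reduce it to sub-problems
- Draw the dynamic programming table
- The last cell of the table represents the original problem
- Any cell of the table represents a sub-problem
- Fill in the dynamic programming table
- Understand the whole dynamic programming process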
[Example]
The last step: find the shortest edit distance from benyam to ephrem
Turn into a sub-problem:
Analysis:
1) The current problem is obtained from the sub-problem after insert, delete, and replace operations.
2) Deleting a character from word1 and inserting a character into word2 are equivalent.
Similarly, deleting a character from word2 and inserting a character into word1 are equivalent;
3) Replacing a character in word1 and replacing a character in word2 are equivalent.
In this way, there are actually only three essentially different operations:
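• Insert: insert a character into word1 (equivalent to deleting a character from word2);
• Delete: delete a character from word1 (equivalent to inserting a character into word2);
• Replace: modify a character of word1.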
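The three operations can be read directly as a recursion over prefixes of the two words. The following is a minimal sketch of that idea, not an efficient solution: it recomputes the same sub-problems many times, which is what the dynamic programming table avoids. The class and method names are hypothetical, chosen only for this illustration.

```java
// Hypothetical names, used only to illustrate the three sub-problems.
class NaiveEditDistance {

    // Edit distance from the first i characters of a to the first j characters of b.
    static int distance(String a, int i, String b, int j) {
        if (i == 0) return j;   // "" -> prefix of b: j insertions
        if (j == 0) return i;   // prefix of a -> "": i deletions
        if (a.charAt(i - 1) == b.charAt(j - 1)) {
            // Last characters already match, so no operation is spent on them.
            return distance(a, i - 1, b, j - 1);
        }
        int insert  = distance(a, i, b, j - 1);      // solve the rest, then insert b[j-1]
        int delete  = distance(a, i - 1, b, j);      // delete a[i-1], then solve the rest
        int replace = distance(a, i - 1, b, j - 1);  // replace a[i-1] with b[j-1]
        return 1 + Math.min(insert, Math.min(delete, replace));
    }

    static int distance(String a, String b) {
        return distance(a, a.length(), b, b.length());
    }

    public static void main(String[] args) {
        System.out.println(distance("benyam", "ephrem"));   // prints 5
    }
}
```

Each of the three recursive calls corresponds to one neighbouring cell of the dynamic programming table.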
Draw the dynamic programming table:
① indicates the original problem: the shortest edit distance of benyam → ephrem
② indicates a sub-problem: the shortest edit distance of b → ep
Fill in the dynamic programming table:
Fill in the initial conditions:
(1) The distances from "", b, be, ben, beny, benya, benyam to "" are 0, 1, 2, 3, 4, 5, 6 respectively (delete every character)
(2) The distances from "" to "", e, ep, eph, ephr, ephre, ephrem are 0, 1, 2, 3, 4, 5, 6 respectively (insert every character)
(3) For ③, the current problem "the shortest edit distance of b → e" can be derived from one of its previous sub-problems through Insert, Delete or Replace:
- When selecting Insert, the sub-problem is b → "", the sub-problem distance is 1, and e must then be inserted, so the current problem distance is 1+1=2
- When selecting Delete, the sub-problem is "" → e, the sub-problem distance is 1, and b must then be deleted, so the current problem distance is 1+1=2
- When selecting Replace, the sub-problem is "" → "", the sub-problem distance is 0, and b must then be replaced with e, so the current problem distance is 0+1=1
The smallest of the three values, 1, is written into cell ③.
(4) For ④, the current problem is "the shortest edit distance of be → e":
- When selecting Insert, the sub-problem is be → "", the sub-problem distance is 2, and e must then be inserted, so the distance is 2+1=3
- When selecting Delete, the sub-problem is b → e, the sub-problem distance is 1, and the trailing e of be must then be deleted, so the distance is 1+1=2
- When selecting Replace, the sub-problem is b → "", the sub-problem distance is 1, and the last characters of be and e are both e, so no replacement is needed and the distance is 1+0=1
The smallest of the three values, 1, is written into cell ④.
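To check the hand-filled cells, the whole table can be computed and printed. This is a small sketch under the assumption that rows correspond to prefixes of word1 (benyam) and columns to prefixes of word2 (ephrem); the class name `EditDistanceTable` and the method `build` are hypothetical.

```java
import java.util.Arrays;

// Hypothetical helper that builds and prints the full DP table.
class EditDistanceTable {

    // D[i][j] = distance from the first i characters of word1
    //           to the first j characters of word2.
    static int[][] build(String word1, String word2) {
        int n = word1.length(), m = word2.length();
        int[][] D = new int[n + 1][m + 1];
        for (int i = 0; i <= n; i++) D[i][0] = i;   // prefix of word1 -> ""
        for (int j = 0; j <= m; j++) D[0][j] = j;   // "" -> prefix of word2
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                if (word1.charAt(i - 1) == word2.charAt(j - 1)) {
                    D[i][j] = D[i - 1][j - 1];       // characters match: no cost
                } else {
                    D[i][j] = 1 + Math.min(D[i - 1][j - 1],
                                  Math.min(D[i - 1][j], D[i][j - 1]));
                }
            }
        }
        return D;
    }

    public static void main(String[] args) {
        for (int[] row : build("benyam", "ephrem")) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

With this layout, cell ③ (b → e) is D[1][1] and cell ④ (be → e) is D[2][1]; both come out as 1, and the bottom-right cell D[6][6] holds the answer 5.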
2. State transition equation
• Create an array to store the dynamic programming table
// DP array
int n = word1.length();
int m = word2.length();
int [][] D = new int[n + 1][m + 1];
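- The array is 1 larger than the input in each dimension, so that row 0 and column 0 can represent the empty string
- Be clear about what each element of the array means
e.g. D[2][1] represents the shortest edit distance of be → e
- Establish the transition equation according to the process of filling in the DP table
// D[i][j] uses 1-based prefix lengths, so the i-th character of word1 is word1[i - 1]
if (word1[i - 1] == word2[j - 1])
D[i][j] = D[i - 1][j - 1];
else
D[i][j] = min{ D[i-1][j-1], D[i-1][j], D[i][j-1] } + 1;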
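Written out in full, including the boundary rows and the cost of 1 for the operation itself, the recurrence can be stated as follows. This is a restatement in LaTeX, assuming i and j are prefix lengths (so the i-th character of word1 is word1[i-1]):

```latex
D[i][j] =
\begin{cases}
j & \text{if } i = 0 \\
i & \text{if } j = 0 \\
D[i-1][j-1] & \text{if } \mathit{word1}[i-1] = \mathit{word2}[j-1] \\
1 + \min\bigl(D[i-1][j-1],\; D[i-1][j],\; D[i][j-1]\bigr) & \text{otherwise}
\end{cases}
```

The three terms inside the minimum correspond to Replace, Delete and Insert respectively.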
3. Initial conditions and boundary cases
// Initialize the boundary states
for (int i = 0; i < n + 1; i++) {
D[i][0] = i;
}
for (int j = 0; j < m + 1; j++) {
D[0][j] = j;
}
4. Calculation order
// Compute all DP values
for (int i = 1; i < n + 1; i++)
{
for (int j = 1; j < m + 1; j++)
{
....
}
}
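The row-by-row, left-to-right order works because each cell only reads the cell above it, the cell to its left, and the cell diagonally above-left, all of which are already filled when the loops reach it. One consequence, shown here as an optional sketch rather than part of the original solution, is that only two rows need to be kept in memory at a time. The class name `EditDistanceTwoRows` is hypothetical.

```java
// Hypothetical variant that keeps only the previous and current rows.
class EditDistanceTwoRows {

    static int distance(String word1, String word2) {
        int n = word1.length(), m = word2.length();
        int[] prev = new int[m + 1];   // row i-1 of the table
        int[] curr = new int[m + 1];   // row i of the table
        for (int j = 0; j <= m; j++) prev[j] = j;   // row 0: "" -> prefixes of word2
        for (int i = 1; i <= n; i++) {
            curr[0] = i;                            // column 0: prefix of word1 -> ""
            for (int j = 1; j <= m; j++) {
                if (word1.charAt(i - 1) == word2.charAt(j - 1)) {
                    curr[j] = prev[j - 1];
                } else {
                    curr[j] = 1 + Math.min(prev[j - 1],
                                  Math.min(prev[j], curr[j - 1]));
                }
            }
            // The row just finished becomes the "previous" row for the next pass.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[m];
    }
}
```

If the loops ran in an order where a needed neighbour had not been computed yet, the same code would read stale values, which is why the calculation order is treated as its own step.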
[Example] Complete code
public static int calStringDistance(String charA, String charB)
{
    char[] A = charA.toCharArray();
    char[] B = charB.toCharArray();
    int n = charA.length();
    int m = charB.length();
    // DP array: DP[i][j] is the distance from the first i characters of A
    // to the first j characters of B
    int[][] DP = new int[n + 1][m + 1];
    // Initialize the boundary states
    for (int i = 0; i < n + 1; i++) DP[i][0] = i;
    for (int j = 0; j < m + 1; j++) DP[0][j] = j;
    // Compute the DP table
    for (int i = 1; i < n + 1; i++)
        for (int j = 1; j < m + 1; j++)
        {
            if (A[i - 1] == B[j - 1]) DP[i][j] = DP[i - 1][j - 1];
            else DP[i][j] = Math.min(Math.min(DP[i - 1][j], DP[i][j - 1]), DP[i - 1][j - 1]) + 1;
        }
    return DP[n][m];
}
Reference:
The Levenshtein Distance