1. .*
. matches any single character except the newline \n (by default), and * means zero or more repetitions. Together, .* matches any character zero or more times. Without a trailing ?, the quantifier is greedy. For example, a.*b matches the longest string that starts with a and ends with b. If you use it to search aabab, it matches the entire string aabab. This is known as greedy matching.
As another example, the pattern src=`.*` matches the longest string that starts with src=` and ends with `. When used to search <img src=``test.jpg` width=`60px` height=`80px`/>, it returns src=``test.jpg` width=`60px` height=`80px` (everything up to the last ` in the tag).
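The greedy behaviour described above can be checked with a minimal sketch like the following. The class name GreedyDemo and the helper findAll are hypothetical names introduced only for illustration; the sketch assumes nothing beyond the standard java.util.regex package.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GreedyDemo {

    // Hypothetical helper: collects every match of regex in input, left to right.
    public static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        // Greedy .* runs to the last b, so the whole string is one match.
        System.out.println(findAll("a.*b", "aabab"));   // prints [aabab]

        // Greedy .* runs to the last backtick in the tag.
        String tag = "<img src=``test.jpg` width=`60px` height=`80px`/>";
        System.out.println(findAll("src=`.*`", tag));
    }
}
```

Because the greedy quantifier consumes as much as it can and only then backtracks, each input above yields a single, maximal match.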
2. .*?
When ? follows * or +, the quantifier becomes lazy, also known as non-greedy: it matches as few characters as possible. That is, it allows any number of repetitions but uses the fewest that still let the overall match succeed. For example, a.*?b matches the shortest string that starts with a and ends with b. Applied to aabab, it matches aab (first to third characters) and ab (fourth to fifth characters).
As another example, the pattern src=`.*?` matches the shortest possible string that starts with src=` and ends with `. There may be no characters at all between the start and the end, because * means zero or more. When used to search <img src=``test.jpg` width=`60px` height=`80px`/>, it returns src=`` (the two backticks with nothing between them).
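As a rough check of the lazy matches and their positions, the sketch below reports each match together with its 1-based start and end positions, in the same counting used in the text. LazyStarDemo and findAllWithPositions are hypothetical names chosen for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LazyStarDemo {

    // Hypothetical helper: returns each match as "text [first-last]",
    // where first and last are 1-based character positions.
    public static List<String> findAllWithPositions(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            // start() is 0-based and inclusive; end() is 0-based and exclusive,
            // so start() + 1 and end() give the 1-based inclusive range.
            matches.add(m.group() + " [" + (m.start() + 1) + "-" + m.end() + "]");
        }
        return matches;
    }

    public static void main(String[] args) {
        // Lazy .*? stops at the first b that completes a match.
        System.out.println(findAllWithPositions("a.*?b", "aabab"));
        // prints [aab [1-3], ab [4-5]]

        // Lazy .*? may match zero characters, so the match is just src=``
        String tag = "<img src=``test.jpg` width=`60px` height=`80px`/>";
        System.out.println(findAllWithPositions("src=`.*?`", tag));
    }
}
```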
3. .+?
As above, ? following * or + makes the quantifier lazy (non-greedy): it allows any number of repetitions but uses the fewest that still let the overall match succeed. a.+?b matches the shortest string that starts with a and ends with b, but there must be at least one character between a and b. Applied to ababccaab, it matches abab (first to fourth characters) and aab (seventh to ninth characters). Note that the result is not ab, ab, and aab, because there must be at least one character between a and b.
As another example, the pattern src=`.+?` matches the shortest possible string that starts with src=` and ends with `, with at least one character in between, because + means one or more. When used to search <img src=``test.jpg` width=`60px` height=`80px`/>, it returns src=``test.jpg` . Note the difference from .*?: this time the match is not src=`` , because there must be at least one character between src=` and the closing `.
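The difference between .*? and .+? can be seen side by side in a small sketch such as this one. LazyPlusDemo and findAll are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LazyPlusDemo {

    // Hypothetical helper: collects every match of regex in input, left to right.
    public static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String text = "ababccaab";

        // .+? needs at least one character between a and b.
        System.out.println(findAll("a.+?b", text));   // prints [abab, aab]

        // .*? accepts zero characters between a and b.
        System.out.println(findAll("a.*?b", text));   // prints [ab, ab, aab]

        // With .+? the second backtick is consumed as the required character,
        // so the match extends to the backtick after test.jpg.
        String tag = "<img src=``test.jpg` width=`60px` height=`80px`/>";
        System.out.println(findAll("src=`.+?`", tag));
    }
}
```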
4. Sample Code
import java.util.regex.Matcher;
import java.util.regex.Pattern;

import org.junit.jupiter.api.Test;

public class TestRegx {

    @Test
    public void testRegx() {
        String str = "<img src=``test.jpg` width=`60px` height=`80px`/>";

        String pattern1 = "src=`.*`";   // greedy
        String pattern2 = "src=`.*?`";  // lazy, zero or more characters
        String pattern3 = "src=`.+?`";  // lazy, one or more characters

        Pattern p1 = Pattern.compile(pattern1);
        Pattern p2 = Pattern.compile(pattern2);
        Pattern p3 = Pattern.compile(pattern3);

        Matcher m1 = p1.matcher(str);
        Matcher m2 = p2.matcher(str);
        Matcher m3 = p3.matcher(str);

        // Print the first match of each pattern; group(0) is the whole match.
        System.out.println("Match result for pattern1:");
        if (m1.find()) {
            for (int i = 0; i <= m1.groupCount(); i++) {
                System.out.println(m1.group(i));
            }
        }
        System.out.println("Match result for pattern2:");
        if (m2.find()) {
            for (int i = 0; i <= m2.groupCount(); i++) {
                System.out.println(m2.group(i));
            }
        }
        System.out.println("Match result for pattern3:");
        if (m3.find()) {
            for (int i = 0; i <= m3.groupCount(); i++) {
                System.out.println(m3.group(i));
            }
        }

        // Split the input around the matches of each pattern.
        String[] str1 = p1.split(str);
        String[] str2 = p2.split(str);
        String[] str3 = p3.split(str);

        System.out.println("Split result for pattern1:");
        for (int i = 0; i < str1.length; i++) {
            System.out.println(str1[i]);
        }
        System.out.println("Split result for pattern2:");
        for (int i = 0; i < str2.length; i++) {
            System.out.println(str2[i]);
        }
        System.out.println("Split result for pattern3:");
        for (int i = 0; i < str3.length; i++) {
            System.out.println(str3[i]);
        }
    }
}
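For readers without JUnit at hand, the split part of the sample can be approximated by a plain main-method sketch. SplitDemo and splitBy are hypothetical names introduced here; the input string and the three patterns are the ones from the sample.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class SplitDemo {

    // Hypothetical helper: splits input around every match of regex.
    public static List<String> splitBy(String regex, String input) {
        return Arrays.asList(Pattern.compile(regex).split(input));
    }

    public static void main(String[] args) {
        String str = "<img src=``test.jpg` width=`60px` height=`80px`/>";

        // Greedy: the match runs to the last backtick, leaving "<img " and "/>".
        System.out.println(splitBy("src=`.*`", str));

        // Lazy .*?: only src=`` is removed.
        System.out.println(splitBy("src=`.*?`", str));

        // Lazy .+?: src=``test.jpg` is removed.
        System.out.println(splitBy("src=`.+?`", str));
    }
}
```

The shorter the match, the more of the original tag survives in the second piece of the split result.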