I. Introduction
Before getting into Regex, let's start with a requirement: count how many times a given character or string appears in a larger string. For example, take the following string:
今天是2023年6月29日,北京,天气晴,与昨天不同的是,今天格外的热,也不知道明天会怎么样,是晴天还是阴天呢,具体得到明天才能知道了。
Count how many times the character "天" (day) appears.
There are many ways to do this, and I'm sure everyone has their own. Here are a few examples:
1. Loop through the characters
This is the most common approach and the first one that comes to mind.
val content = "今天是2023年6月29日,北京,天气晴,与昨天不同的是,今天格外的热,也不知道明天会怎么样,是晴天还是阴天呢,具体得到明天才能知道了。"
var count = 0
// Iterate over every character
content.forEach {
    if ('天' == it) {
        count++ // increment the count
    }
}
print(count)
Output:
8
2. Kotlin's count function
val content = "今天是2023年6月29日,北京,天气晴,与昨天不同的是,今天格外的热,也不知道明天会怎么样,是晴天还是阴天呢,具体得到明天才能知道了。"
// Count the characters that satisfy the predicate
val count = content.count { ch -> ch == '天' }
print(count)
Output:
8
3. Kotlin's filter function
val content = "今天是2023年6月29日,北京,天气晴,与昨天不同的是,今天格外的热,也不知道明天会怎么样,是晴天还是阴天呢,具体得到明天才能知道了。"
// Keep only the matching characters, then take the length of the result
val filterContent = content.filter { ch -> ch == '天' }
print(filterContent.length)
Output:
8
4. Java's regular expression API
import java.util.regex.Pattern

val content = "今天是2023年6月29日,北京,天气晴,与昨天不同的是,今天格外的热,也不知道明天会怎么样,是晴天还是阴天呢,具体得到明天才能知道了。"
val pattern = Pattern.compile("天")
val matcher = pattern.matcher(content)
var count = 0
// Each successful find() advances to the next match
while (matcher.find()) {
    count++
}
print(count)
Output:
8
5. Use the Regex object
val content = "今天是2023年6月29日,北京,天气晴,与昨天不同的是,今天格外的热,也不知道明天会怎么样,是晴天还是阴天呢,具体得到明天才能知道了。"
val matchResult = Regex("天").findAll(content)
print(matchResult.count())
Of course, there are many more ways to do this, but these five will do for now. Looking at the fifth approach, the Regex object, some readers may ask: where is the advantage? Is it any simpler? Next to Kotlin's built-in functions it seems to offer nothing at all. Don't worry. For a plain text search, Regex really has no advantage. But this is only an introduction, the tip of the iceberg of what it can do, so let's take it step by step.
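To hint at where a pattern starts to pay off, here is a minimal sketch that counts several different words in one pass. The helper name countMatches is made up for illustration and is not part of the Kotlin API.

```kotlin
// Hypothetical helper: counts the non-overlapping matches of a pattern in a text.
fun countMatches(pattern: String, text: String): Int =
    Regex(pattern).findAll(text).count()

fun main() {
    val content = "今天是2023年6月29日,北京,天气晴,与昨天不同的是,今天格外的热,也不知道明天会怎么样,是晴天还是阴天呢,具体得到明天才能知道了。"
    // One pattern covers three words: 今天, 明天 and 昨天.
    println(countMatches("[今明昨]天", content)) // 5
    // Runs of digits are just as easy to count: 2023, 6 and 29.
    println(countMatches("\\d+", content)) // 3
}
```

A character-by-character loop would need extra state to do the same thing, which is the kind of case where a pattern is likely to be the shorter option.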
II. Overview of Regex methods
From the introduction we know that Regex can be used for searching, which gives us one more option to choose from. What else can it do?
Constructors
Let's start with the basic constructors:
Method | Parameter types | Description
Regex(pattern: String) | String | Creates a regular expression from the specified pattern string.
Regex(pattern: String, option: RegexOption) | String, RegexOption | Creates a regular expression from the specified pattern string and the single specified option.
Regex(pattern: String, options: Set<RegexOption>) | String, Set<RegexOption> | Creates a regular expression from the specified pattern string and the specified set of options.
The single-parameter constructor needs little explanation: it takes a regular expression pattern and is the one we use most often. The other two are used less often, so here is a brief introduction:
RegexOption is an enum with the following values:
Option | Description
IGNORE_CASE | Enables case-insensitive matching. Case comparison is Unicode-aware.
MULTILINE | Enables multiline mode. In multiline mode, ^ and $ match just after and just before a line terminator, respectively, as well as at the start and end of the input sequence.
LITERAL | Enables literal parsing of the pattern. Metacharacters and escape sequences in the pattern are given no special meaning.
UNIX_LINES | Enables Unix lines mode. In this mode, only '\n' is recognized as a line terminator.
COMMENTS | Permits whitespace and comments in the pattern.
DOT_MATCHES_ALL | Enables the mode in which the expression . matches any character, including line terminators.
CANON_EQ | Enables canonical equivalence, that is, matching based on canonical decomposition.
Choose the options that suit your situation. The Set parameter is simply a way to pass several RegexOption values at once.
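As a small sketch of the two option-taking constructors, the snippet below uses one option in the first helper and a set of options in the second. The helper names containsIgnoringCase and countLineStarts are invented for this example.

```kotlin
// Hypothetical helper: case-insensitive containment check using a single option.
fun containsIgnoringCase(pattern: String, text: String): Boolean =
    Regex(pattern, RegexOption.IGNORE_CASE).containsMatchIn(text)

// Hypothetical helper: counts the lines that start with the pattern, using a set of options.
// MULTILINE lets ^ match at the start of every line, not only at the start of the input.
fun countLineStarts(pattern: String, text: String): Int =
    Regex("^$pattern", setOf(RegexOption.IGNORE_CASE, RegexOption.MULTILINE))
        .findAll(text)
        .count()

fun main() {
    println(containsIgnoringCase("kotlin", "I like KOTLIN"))    // true
    println(countLineStarts("kotlin", "Kotlin\njava\nKOTLIN"))  // 2
}
```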
Common methods
With the constructors covered, let's look at the commonly used methods:
Method | Parameter types | Description
find(input: CharSequence, startIndex: Int = 0) | CharSequence, Int | Returns the first match in the input as a MatchResult, or null if there is none. Searching starts at index 0 by default.
findAll(input: CharSequence, startIndex: Int = 0) | CharSequence, Int | Returns a sequence of all matches (MatchResult objects) in the input. Searching starts at index 0 by default.
containsMatchIn(input: CharSequence) | CharSequence | Returns true if the input contains at least one match of the regular expression.
replace(input: CharSequence, replacement: String) | CharSequence, String | Similar to String's replace: replaces every match in the input with the replacement string.
replaceFirst(input: CharSequence, replacement: String) | CharSequence, String | Replaces only the first match in the input with the replacement string.
matches(input: CharSequence) | CharSequence | Returns true if the entire input matches the regular expression.
matchEntire(input: CharSequence) | CharSequence | Attempts to match the entire input against the pattern. Returns a MatchResult, or null if the input does not match as a whole.
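The startIndex parameter of find is easy to overlook, so here is a minimal sketch of how it shifts the starting point of the search. The helper firstNumberFrom is a made-up name for illustration.

```kotlin
// Hypothetical helper: returns the first run of digits found at or after startIndex, or null.
fun firstNumberFrom(text: String, startIndex: Int = 0): String? =
    Regex("\\d+").find(text, startIndex)?.value

fun main() {
    val text = "a12b345c6"
    println(firstNumberFrom(text))     // 12
    println(firstNumberFrom(text, 3))  // 345
    println(firstNumberFrom(text, 8))  // 6
}
```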
III. Examples of common Regex methods
You may remember the previous article, "Android: This requirement is confused, the product says to achieve rich text echo display". There we used Regex to extract rich text content and briefly introduced a few of its methods. Below is a simple use case for each method listed in Part II.
1. find
find returns the first match. For example, to get the first number that appears in a string:
val content = "有这样一串数字2345,还有6789,以及012,我们如何只获取数字2345呢"
val regex = Regex("\\d+")
val matchResult = regex.find(content)
print(matchResult?.value)
Output:
2345
2. findAll
findAll, as the name suggests, returns all matches. Using the same example as above, we switch to findAll:
val content = "有这样一串数字2345,还有6789,以及012,我们如何只获取数字2345呢"
val regex = Regex("\\d+")
val matchResult = regex.findAll(content)
matchResult.forEach {
    println(it.value)
}
Output:
2345
6789
012
2345
Another typical use case is extracting rich text tags, which was demonstrated in the previous article.
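As a rough illustration of that kind of tag extraction (this is not the code from the previous article), the sketch below pulls the text out of every <b>...</b> pair with a capture group. The helper boldTexts and the sample markup are assumptions, and a regular expression is only suitable for simple, well-formed fragments, not for parsing HTML in general.

```kotlin
// Hypothetical helper: returns the text inside every <b>...</b> pair.
// (.*?) is a lazy capture group, so each match stops at the nearest closing tag.
fun boldTexts(html: String): List<String> =
    Regex("<b>(.*?)</b>")
        .findAll(html)
        .map { it.groupValues[1] } // index 0 is the whole match, index 1 the first group
        .toList()

fun main() {
    println(boldTexts("plain <b>first</b> text <b>second</b>")) // [first, second]
}
```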
3. containsMatchIn
Checks whether the input contains a match, similar to String's contains:
val content = "二流小码农"
val regex = Regex("农")
val regex2 = Regex("中")
val isContains = regex.containsMatchIn(content)
val isContains2 = regex2.containsMatchIn(content)
println(isContains)
println(isContains2)
Output:
true
false
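Unlike String's contains, containsMatchIn can test for a whole class of characters. A minimal sketch, with containsDigit as a made-up helper name:

```kotlin
// Hypothetical helper: true if the text contains at least one digit anywhere.
fun containsDigit(text: String): Boolean =
    Regex("\\d").containsMatchIn(text)

fun main() {
    println(containsDigit("room 42"))  // true
    println(containsDigit("no room"))  // false
}
```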
4. replace
Replaces matching content in a string:
val content = "二流小码农"
val regex = Regex("二")
val replaceContent=regex.replace(content,"一")
println(replaceContent)
Output:
一流小码农
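Two details are worth a quick sketch: replace acts on every match, not just the first, and there is an overload that takes a lambda to compute each replacement. The helper maskDigits is a hypothetical name used only here.

```kotlin
// Hypothetical helper: masks every run of digits with asterisks of the same length.
fun maskDigits(text: String): String =
    Regex("\\d+").replace(text) { match -> "*".repeat(match.value.length) }

fun main() {
    // The String overload replaces all matches.
    println(Regex("\\d+").replace("a1b22c333", "#")) // a#b#c#
    // The lambda overload computes a replacement per match.
    println(maskDigits("a1b22c333"))                 // a*b**c***
}
```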
5. replaceFirst
Replaces only the first match in a string:
val content = "有这样一串数字2345,还有6789,以及012,我们如何只获取数字2345呢"
val regex = Regex("\\d+")
// Replace the first number that appears with the letters abcd
val replaceContent = regex.replaceFirst(content, "abcd")
println(replaceContent)
Output:
有这样一串数字abcd,还有6789,以及012,我们如何只获取数字2345呢
6. matches
Checks whether the whole input matches the pattern, which is useful for email validation, mobile phone number validation, and so on:
// Email validation
val content = "user@example.com" // placeholder address; the original value was obscured
val content2 = "11@qq"
val regex = Regex("[a-zA-Z0-9_-]+@[a-zA-Z0-9_-]+(\\.[a-zA-Z0-9_-]+)+")
val matches=regex.matches(content)
val matches2=regex.matches(content2)
println(matches)
println(matches2)
Output:
true
false
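The mobile phone number check mentioned above could be sketched in the same way. The pattern here is an assumption (11 digits, starting with 1, second digit between 3 and 9) and should be adjusted to whatever format you actually need. The name isMobileNumber is hypothetical.

```kotlin
// Assumed format: 11 digits, first digit 1, second digit 3-9. Adjust to your own rules.
val mobileRegex = Regex("1[3-9]\\d{9}")

// Hypothetical helper: true only if the whole input is a number in the assumed format.
fun isMobileNumber(input: String): Boolean = mobileRegex.matches(input)

fun main() {
    println(isMobileNumber("13812345678"))   // true
    println(isMobileNumber("1381234567"))    // false: only 10 digits
    println(isMobileNumber("138123456789"))  // false: matches() requires the whole input to match
}
```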
7. matchEntire
Attempts to match the entire input against the pattern and returns null if the input does not match as a whole:
// Match digits
val regex = Regex("\\d+")
val matchResult=regex.matchEntire("二流小码农")
val matchResult2=regex.matchEntire("二流小码农666")
val matchResult3=regex.matchEntire("123456")
println(matchResult?.value)
println(matchResult2?.value)
println(matchResult3?.value)
Output:
null
null
123456
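Because matchEntire returns a MatchResult rather than a Boolean, it is handy when you want to validate and extract in one step. A minimal sketch, assuming a yyyy-MM-dd input format and a made-up helper named parseDate:

```kotlin
// Hypothetical helper: parses "yyyy-MM-dd" into (year, month, day),
// or returns null if the whole input does not fit that shape.
fun parseDate(input: String): Triple<Int, Int, Int>? {
    val match = Regex("(\\d{4})-(\\d{2})-(\\d{2})").matchEntire(input) ?: return null
    // destructured exposes the capture groups in order.
    val (year, month, day) = match.destructured
    return Triple(year.toInt(), month.toInt(), day.toInt())
}

fun main() {
    println(parseDate("2023-06-29"))        // (2023, 6, 29)
    println(parseDate("today 2023-06-29"))  // null
}
```

Note that this only checks the shape of the input, not whether the month and day are in a valid range.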
IV. Summary
Kotlin's Regex is easier to use than Java's API. For simple operations that don't need a pattern, such as searching, replacing, or checking containment, the built-in String functions are enough. For more complex tasks such as email validation or mobile phone number validation, Regex is the natural first choice.