Article directory
- First, add the pinyin4j dependency
- 1. Use a regular expression to check whether a string contains letters
- 2. Get the initial letters of Chinese characters (commonly used for address book lookup)
- 3. Get the abbreviation made of pinyin initials
- 4. Convert the Chinese characters in a string to pinyin, leaving English characters unchanged
- 5. Convert Chinese characters to their pinyin initials, leaving English characters unchanged
- 6. Clean up special characters
We often need to process strings that contain Chinese characters, for example to extract pinyin initials:
- Getting the A-Z initial of a customer's name for an address-book style index: "赵" (Zhao) -> "Z"
- Generating an abbreviation from a Chinese name. Common insurance company and bank abbreviations are letter compressions of their Chinese names: "程序人生" ("Program Life") -> "CXRS"
- Converting Chinese characters to Hanyu Pinyin while leaving English characters unchanged: "Qiuyue" written in Chinese characters followed by "java" -> "qiuyuejava"
- Converting Chinese characters to their pinyin initials while leaving English characters unchanged: "程序员a" ("Programmer a") -> "CXYa"
Below are some commonly used methods that you can wrap in your own utility class.
First, add the pinyin4j dependency
<!-- Pinyin -->
<dependency>
    <groupId>com.belerweb</groupId>
    <artifactId>pinyin4j</artifactId>
    <version>2.5.0</version>
</dependency>
1. Use a regular expression to check whether a string contains letters
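/**
 * Uses a regular expression to check whether a string contains at least one ASCII letter.
 *
 * @param str the string to check
 * @return true if the string contains a letter, false otherwise
 */
public static boolean judgeContainsStr(String str) {
    // Requires java.util.regex.Pattern. find() looks for a letter anywhere in the
    // string, including strings with line breaks, which matches() with ".*" would miss.
    return str != null && Pattern.compile("[a-zA-Z]").matcher(str).find();
}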
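As a minimal sketch of why `find()` can be the safer call for this check: `matches()` must consume the whole input, and `.` does not match a line break by default, so a multi-line string with a letter on the second line is missed by the `.*[a-zA-Z]+.*` pattern. The class name `LetterCheck` and the method name `containsLetter` are made up for this illustration.

```java
import java.util.regex.Pattern;

final class LetterCheck {
    // Compiled once and reused; matches a single ASCII letter.
    private static final Pattern LETTER = Pattern.compile("[a-zA-Z]");

    static boolean containsLetter(String str) {
        // find() scans for a match anywhere, so no leading or trailing ".*" is needed.
        return str != null && LETTER.matcher(str).find();
    }

    public static void main(String[] args) {
        System.out.println(containsLetter("abc123")); // true
        System.out.println(containsLetter("12345"));  // false
        // A letter after a line break: matches() with ".*" does not see it, find() does.
        System.out.println("1\na".matches(".*[a-zA-Z]+.*")); // false
        System.out.println(containsLetter("1\na"));          // true
    }
}
```

Keeping the compiled pattern in a static field also avoids recompiling it on every call.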
2. Get the initial letters of Chinese characters (commonly used for address book lookup)
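// Imports needed by the pinyin4j calls in this method
import net.sourceforge.pinyin4j.PinyinHelper;
import net.sourceforge.pinyin4j.format.HanyuPinyinCaseType;
import net.sourceforge.pinyin4j.format.HanyuPinyinOutputFormat;
import net.sourceforge.pinyin4j.format.HanyuPinyinToneType;
import net.sourceforge.pinyin4j.format.exception.BadHanyuPinyinOutputFormatCombination;

/**
 * Returns the uppercase pinyin initial of each Chinese character in the input.
 * ASCII letters are kept as they are; any other ASCII character becomes "#".
 */
public static String getAlpha(String chines) {
    StringBuilder pinyinName = new StringBuilder();
    HanyuPinyinOutputFormat defaultFormat = new HanyuPinyinOutputFormat();
    defaultFormat.setCaseType(HanyuPinyinCaseType.UPPERCASE);
    defaultFormat.setToneType(HanyuPinyinToneType.WITHOUT_TONE);
    for (char c : chines.toCharArray()) {
        if (c > 128) {
            try {
                // pinyin4j returns null when it has no reading for the character
                String[] ac = PinyinHelper.toHanyuPinyinStringArray(c, defaultFormat);
                if (ac != null && ac.length > 0) {
                    // A character may have several readings; use the first one
                    pinyinName.append(ac[0].charAt(0));
                }
            } catch (BadHanyuPinyinOutputFormatCombination e) {
                return "#";
            }
        } else if (judgeContainsStr(String.valueOf(c))) {
            pinyinName.append(c);
        } else {
            pinyinName.append('#');
        }
    }
    return pinyinName.toString();
}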
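To show how such initials are typically used for address book lookup, here is a rough sketch that groups names under "A" to "Z" keys, with "#" for everything else. It is an illustration under assumptions, not part of the original utility class: `ContactIndex`, `group`, and the `initialsOf` function are hypothetical names, and a small hard-coded table stands in for the pinyin lookup. The Chinese names are written as Unicode escapes so the file compiles regardless of source encoding.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;

final class ContactIndex {
    /**
     * Groups names by the first letter of their initials string.
     * initialsOf is a hypothetical lookup (name -> initials); names whose
     * initials do not start with A-Z are filed under "#".
     */
    static Map<String, List<String>> group(List<String> names, Function<String, String> initialsOf) {
        Map<String, List<String>> index = new TreeMap<>(); // keys stay sorted, "#" before "A"
        for (String name : names) {
            String initials = initialsOf.apply(name);
            char first = (initials == null || initials.isEmpty())
                    ? '#' : Character.toUpperCase(initials.charAt(0));
            String key = (first >= 'A' && first <= 'Z') ? String.valueOf(first) : "#";
            index.computeIfAbsent(key, k -> new ArrayList<>()).add(name);
        }
        return index;
    }

    public static void main(String[] args) {
        // Stub table for the demo: Zhao Yun -> ZY, Zhang Fei -> ZF, Cao Cao -> CC
        Map<String, String> stub = Map.of(
                "\u8d75\u4e91", "ZY", "\u5f20\u98de", "ZF", "\u66f9\u64cd", "CC", "007", "###");
        List<String> names = List.of("\u8d75\u4e91", "\u5f20\u98de", "\u66f9\u64cd", "007");
        System.out.println(group(names, stub::get).keySet()); // [#, C, Z]
    }
}
```

In a real project the stub table would presumably be replaced by a method reference to the initials method of your own utility class.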
3. Get the abbreviation made of pinyin initials
/**
 * Same conversion as getAlpha, but returns "#" when the result is blank.
 * StringUtils is assumed to be org.apache.commons.lang3.StringUtils.
 */
public static String getAlphaDefaultCode(String chines) {
    StringBuilder pinyinName = new StringBuilder();
    HanyuPinyinOutputFormat defaultFormat = new HanyuPinyinOutputFormat();
    defaultFormat.setCaseType(HanyuPinyinCaseType.UPPERCASE);
    defaultFormat.setToneType(HanyuPinyinToneType.WITHOUT_TONE);
    for (char c : chines.toCharArray()) {
        if (c > 128) {
            try {
                // pinyin4j returns null when it has no reading for the character
                String[] ac = PinyinHelper.toHanyuPinyinStringArray(c, defaultFormat);
                if (ac != null && ac.length > 0) {
                    pinyinName.append(ac[0].charAt(0));
                }
            } catch (BadHanyuPinyinOutputFormatCombination e) {
                return "#";
            }
        } else if (judgeContainsStr(String.valueOf(c))) {
            pinyinName.append(c);
        } else {
            pinyinName.append('#');
        }
    }
    // Fall back to "#" when no initial could be produced
    String result = pinyinName.toString();
    return StringUtils.isBlank(result) ? "#" : result;
}
4. Convert the Chinese characters in a string to pinyin, leaving English characters unchanged
/**
 * Converts the Chinese characters in a string to pinyin and leaves English characters unchanged.
 *
 * @param inputString text that may contain Chinese characters
 * @return lowercase pinyin without tones, or "*" if the input is null, empty, or the literal "null"
 */
public static String getPingYin(String inputString) {
    // Check for null before cleaning; cleanChar would throw on a null argument
    if (inputString == null) {
        return "*";
    }
    inputString = cleanChar(inputString).trim();
    if (inputString.isEmpty() || "null".equals(inputString)) {
        return "*";
    }
    HanyuPinyinOutputFormat format = new HanyuPinyinOutputFormat();
    format.setCaseType(HanyuPinyinCaseType.LOWERCASE);
    format.setToneType(HanyuPinyinToneType.WITHOUT_TONE);
    format.setVCharType(HanyuPinyinVCharType.WITH_V);
    StringBuilder output = new StringBuilder();
    try {
        for (char c : inputString.toCharArray()) {
            String[] temp = null;
            if (Character.toString(c).matches("[\\u4E00-\\u9FA5]")) {
                temp = PinyinHelper.toHanyuPinyinStringArray(c, format);
            }
            // Keep the character itself when it is not Chinese or has no known reading
            output.append(temp != null && temp.length > 0 ? temp[0] : String.valueOf(c));
        }
    } catch (BadHanyuPinyinOutputFormatCombination e) {
        e.printStackTrace();
    }
    return output.toString();
}
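The range `[\u4E00-\u9FA5]` used above covers the most common block of Chinese characters but not every Han character. As an alternative sketch (not what the method above does), the JDK's `Character.UnicodeScript` can classify a code point as Han directly. The class name `HanCheck` and method name `isHan` are made up for this illustration.

```java
final class HanCheck {
    /** Returns true when the code point belongs to the Unicode Han script. */
    static boolean isHan(int codePoint) {
        return Character.UnicodeScript.of(codePoint) == Character.UnicodeScript.HAN;
    }

    public static void main(String[] args) {
        System.out.println(isHan('\u4e2d')); // true, U+4E2D is a common Han character
        System.out.println(isHan('a'));      // false
        System.out.println(isHan('\u3002')); // false, U+3002 is the ideographic full stop
        // U+3400 (CJK Extension A) is Han but lies outside the U+4E00..U+9FA5 range.
        System.out.println(isHan(0x3400));                          // true
        System.out.println("\u3400".matches("[\\u4E00-\\u9FA5]"));  // false
    }
}
```

Even with a broader check, pinyin4j can still return null for a character it has no reading for, so the null check on the returned array remains necessary.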
5. Convert Chinese characters to their pinyin initials, leaving English characters unchanged
/**
 * Converts Chinese characters to their uppercase pinyin initials and leaves English characters unchanged.
 *
 * @param chines text that may contain Chinese characters
 * @return the text with each Chinese character replaced by its pinyin initial
 */
public static String converterToFirstSpell(String chines) {
    chines = cleanChar(chines);
    StringBuilder pinyinName = new StringBuilder();
    HanyuPinyinOutputFormat defaultFormat = new HanyuPinyinOutputFormat();
    defaultFormat.setCaseType(HanyuPinyinCaseType.UPPERCASE);
    defaultFormat.setToneType(HanyuPinyinToneType.WITHOUT_TONE);
    for (char c : chines.toCharArray()) {
        if (c > 128) {
            try {
                String[] pinyin = PinyinHelper.toHanyuPinyinStringArray(c, defaultFormat);
                if (pinyin != null && pinyin.length > 0) {
                    pinyinName.append(pinyin[0].charAt(0));
                } else {
                    // No reading available: keep the character instead of throwing a NullPointerException
                    pinyinName.append(c);
                }
            } catch (BadHanyuPinyinOutputFormatCombination e) {
                e.printStackTrace();
            }
        } else {
            pinyinName.append(c);
        }
    }
    return pinyinName.toString();
}
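The per-character logic can be tried without the library. The sketch below follows the same loop but swaps the pinyin4j call for a hypothetical `readingOf` function backed by a tiny hard-coded table, so it only illustrates the control flow, including the case where no reading is available. `FirstSpellSketch` and `firstSpell` are illustrative names, and the Chinese characters are written as Unicode escapes.

```java
import java.util.Map;
import java.util.function.Function;

final class FirstSpellSketch {
    /**
     * Replaces each non-ASCII character that has a reading with the uppercase
     * first letter of that reading and copies every other character unchanged.
     * readingOf is a hypothetical stand-in for the pinyin lookup and returns
     * null when no reading is known.
     */
    static String firstSpell(String text, Function<Character, String> readingOf) {
        StringBuilder out = new StringBuilder(text.length());
        for (char c : text.toCharArray()) {
            String reading = c > 128 ? readingOf.apply(c) : null;
            if (reading != null && !reading.isEmpty()) {
                out.append(Character.toUpperCase(reading.charAt(0)));
            } else {
                out.append(c); // ASCII and characters without a reading pass through
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Demo table only: the three characters of "chengxuyuan" (programmer)
        Map<Character, String> stub = Map.of('\u7a0b', "cheng", '\u5e8f', "xu", '\u5458', "yuan");
        System.out.println(firstSpell("\u7a0b\u5e8f\u5458a", stub::get)); // CXYa
    }
}
```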
6. Clean up special characters
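/**
 * Removes punctuation, whitespace, and special characters.
 *
 * @param chines the text to clean
 * @return the text without punctuation, whitespace, or the special characters listed below
 */
public static String cleanChar(String chines) {
    // Remove all ASCII punctuation and whitespace
    chines = chines.replaceAll("[\\p{Punct}\\p{Space}]+", "");
    // Remove the Chinese special symbols listed in the character class
    String regEx = "[`~!@#$%^&*()+=|{}':;',\\[\\].<>/?~!@#¥%……&*()——+|{}<>《》【】‘;:”“’。,、?]";
    Pattern pattern = Pattern.compile(regEx);
    Matcher matcher = pattern.matcher(chines);
    return matcher.replaceAll("").trim();
}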
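A hand-written list of symbols is easy to leave incomplete. As an alternative sketch, Java regular expressions can match whole Unicode categories, which covers full-width and Chinese punctuation without listing each mark. This is a different technique from the method above and strips more than its list does; `CleanSketch` and `stripPunctuation` are illustrative names.

```java
final class CleanSketch {
    /**
     * Removes every Unicode punctuation mark (\p{P}), symbol (\p{S}),
     * separator (\p{Z}), and ASCII whitespace character (\s).
     */
    static String stripPunctuation(String text) {
        return text.replaceAll("[\\p{P}\\p{S}\\p{Z}\\s]+", "");
    }

    public static void main(String[] args) {
        // U+300A and U+300B are the Chinese book-title marks, U+FF0C is the
        // full-width comma, and U+3000 is the ideographic space.
        System.out.println(stripPunctuation("\u300aJava\u300b\uff0c v1.8\u3000!")); // Javav18
    }
}
```

Letters and digits in any script are left alone, so Chinese characters pass through unchanged and can then be converted to pinyin.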