Programming details: a function that abbreviates overly long strings

I. Background

Here is the requirement: if a string exceeds a certain length, the part beyond that length is replaced with an ellipsis.

Many people will find this trivial; any student with a little Java background can write it.

So let's analyze this "simple" problem.

II. Coding

 

2.1 Idea

The idea is simple: check whether the string's length exceeds the maximum size. If it does not, return the string unchanged; if it does, replace the excess with "...".

 

2.2 Implementation

When writing the code there is more to consider:

  1. For robustness, we should validate the parameters;
  2. For a complete utility, we can also support a custom abbreviation marker;

 

Here is the utility class:

import com.google.common.base.Preconditions;
import org.apache.commons.lang3.StringUtils;

public class StringUtil {

    /**
     * Replaces the part beyond maxSize with an ellipsis.
     *
     * @param originStr the original string
     * @param maxSize   the maximum length
     */
    public static String abbreviate(String originStr, int maxSize) {

        return abbreviate(originStr, maxSize, null);
    }

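    /**
     * Replaces the part beyond maxSize with an abbreviation marker.
     *
     * @param originStr    the original string
     * @param maxSize      the maximum length
     * @param abbrevMarker the abbreviation marker; "..." is used when null or empty
     */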
    public static String abbreviate(String originStr, int maxSize, String abbrevMarker) {

        Preconditions.checkArgument(maxSize > 0, "maxSize must be greater than 0");

        if (StringUtils.isEmpty(originStr)) {
            return StringUtils.EMPTY;
        }

        String defaultAbbrevMarker = "...";

        // Nothing to omit when the string already fits (a length equal to maxSize included)
        if (originStr.length() <= maxSize) {
            return originStr;
        }

        return originStr.substring(0, maxSize) + StringUtils.defaultIfEmpty(abbrevMarker, defaultAbbrevMarker);
    }
}

Here we use StringUtils from commons-lang3 and Preconditions from Guava. If your project does not include these libraries, the equivalent checks are easy to write by hand.

Now that it is written, how do we verify that it is correct?

As a competent programmer, be sure to write unit tests!
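For projects without those two libraries, hand-written replacements might look like the minimal sketch below. The class name PlainChecks and its method names are hypothetical; they only imitate the behavior this article relies on and are not the libraries' own code.

```java
/** Hypothetical JDK-only stand-ins for the Guava and commons-lang3 helpers used in this article. */
final class PlainChecks {

    private PlainChecks() {
    }

    /** Plays the role of Preconditions.checkArgument: reject bad arguments early. */
    static void checkArgument(boolean expression, String message) {
        if (!expression) {
            throw new IllegalArgumentException(message);
        }
    }

    /** Plays the role of StringUtils.isEmpty: true for null or "". */
    static boolean isEmpty(CharSequence cs) {
        return cs == null || cs.length() == 0;
    }

    /** Plays the role of StringUtils.defaultIfEmpty, restricted to String for simplicity. */
    static String defaultIfEmpty(String value, String defaultValue) {
        return isEmpty(value) ? defaultValue : value;
    }
}
```

With these in place, the calls in abbreviate could be switched to PlainChecks.checkArgument(...), PlainChecks.isEmpty(...) and PlainChecks.defaultIfEmpty(...), and StringUtils.EMPTY replaced by the literal "".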


import org.junit.Assert;
import org.junit.Test;

public class StringUtilTest {

    @Test
    public void abbreviateLess() {
        String input = "123456789";
        String abbreviate = StringUtil.abbreviate(input, 11);
        Assert.assertEquals(input, abbreviate);
    } 

    @Test
    public void abbreviateCommon() {
        String input = "123456789";
        String abbreviate = StringUtil.abbreviate(input, 3);
        Assert.assertEquals("123...", abbreviate);
    }

    @Test
    public void abbreviateWithMarker() {
        String input = "123456789";
        String abbreviate = StringUtil.abbreviate(input, 3, "***");
        Assert.assertEquals("123***", abbreviate);
    }

    @Test(expected = IllegalArgumentException.class)
    public void abbreviateWithNegativeSize() {
        String input = "123456789";
        // The call is expected to throw, so the return value is not needed
        StringUtil.abbreviate(input, -3);
    }

}

The tests pass.

 

2.3 Further thoughts

If we stopped here, the exercise would not be worth much, would it?

2.3.1 An emoji occupies two chars. What happens if the cut lands right after the first one?

 

Write a unit test to try it, and there really is a problem.
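One way to reproduce the problem is sketched below. The class SurrogateSplitDemo and its helpers are hypothetical, and the emoji U+1F600 is chosen only for illustration; the naive cut is the same substring call used in the first version of the utility.

```java
final class SurrogateSplitDemo {

    // U+1F600 (grinning face) written as its UTF-16 surrogate pair.
    static final String EMOJI = "\uD83D\uDE00";

    /** Naive cut: keeps the first maxSize chars without looking at surrogate pairs. */
    static String naiveHead(String s, int maxSize) {
        return s.substring(0, maxSize);
    }

    /** True when the string ends in a high surrogate that has lost its low surrogate. */
    static boolean endsWithLoneHighSurrogate(String s) {
        return !s.isEmpty() && Character.isHighSurrogate(s.charAt(s.length() - 1));
    }

    public static void main(String[] args) {
        String input = "123456789" + EMOJI;   // 9 digits + 1 emoji = 11 chars
        String head = naiveHead(input, 10);   // the cut lands inside the emoji

        System.out.println(input.length());                    // prints 11
        System.out.println(endsWithLoneHighSurrogate(head));   // prints true
    }
}
```

The truncated text now ends in half an emoji, which typically renders as a replacement glyph or "?" in front of the ellipsis.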

 

As good programmers, shouldn't we ask the product manager how this case should be handled?

Suppose the product manager says: in this situation, drop the whole emoji.

We change the utility function accordingly:

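    /**
     * Replaces the part beyond maxSize with an abbreviation marker.
     *
     * @param originStr    the original string
     * @param maxSize      the maximum length
     * @param abbrevMarker the abbreviation marker; "..." is used when null or empty
     */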
    public static String abbreviate(String originStr, int maxSize, String abbrevMarker) {

        Preconditions.checkArgument(maxSize > 0, "maxSize must be greater than 0");

        if (StringUtils.isEmpty(originStr)) {
            return StringUtils.EMPTY;
        }

        String defaultAbbrevMarker = "...";

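        // Nothing to omit when the string already fits (a length equal to maxSize included)
        if (originStr.length() <= maxSize) {
            return originStr;
        }

        // Take the first maxSize chars
        String head = originStr.substring(0, maxSize);

        // If the last char is a high surrogate, the cut split an emoji: remove that char
        char lastChar = head.charAt(head.length() - 1);
        if (Character.isHighSurrogate(lastChar)) {
            head = head.substring(0, head.length() - 1);
        }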


        return head + StringUtils.defaultIfEmpty(abbrevMarker, defaultAbbrevMarker);
    }

Re-running the unit tests shows the behavior we want.

For details on Unicode, see Wikipedia; for the relevant Character APIs, see the following article:

https://www.ibm.com/developerworks/cn/java/j-unicode/
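As a quick illustration of why the surrogate check is needed, the sketch below compares the number of chars with the number of code points in a string containing one emoji. The class name CodePointBasics is hypothetical; the String and Character methods it calls are standard JDK APIs.

```java
final class CodePointBasics {

    /** Number of UTF-16 chars, which is what String.length() counts. */
    static int charCount(String s) {
        return s.length();
    }

    /** Number of Unicode code points; an emoji outside the BMP counts once here. */
    static int codePointCount(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        // Build the emoji from its code point instead of pasting it into the source.
        String emoji = new String(Character.toChars(0x1F600));
        String s = "ab" + emoji;

        System.out.println(charCount(s));                             // prints 4
        System.out.println(codePointCount(s));                        // prints 3
        System.out.println(Character.isHighSurrogate(s.charAt(2)));   // prints true
        System.out.println(Character.isLowSurrogate(s.charAt(3)));    // prints true
    }
}
```

Because substring works on char indexes, a cut at index 3 would separate the two halves of the emoji; checking for a trailing high surrogate detects exactly that case.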

 

2.3.2 Can we write it better?

The approach above looks nearly perfect, but can it be written better? What, it is still not good enough?!

Let's look at the source code of a utility class such as StringUtils in commons-lang3:

    /**
     * <p>Returns either the passed in CharSequence, or if the CharSequence is
     * empty or {@code null}, the value of {@code defaultStr}.</p>
     *
     * <pre>
     * StringUtils.defaultIfEmpty(null, "NULL")  = "NULL"
     * StringUtils.defaultIfEmpty("", "NULL")    = "NULL"
     * StringUtils.defaultIfEmpty(" ", "NULL")   = " "
     * StringUtils.defaultIfEmpty("bat", "NULL") = "bat"
     * StringUtils.defaultIfEmpty("", null)      = null
     * </pre>
     * @param <T> the specific kind of CharSequence
     * @param str  the CharSequence to check, may be null
     * @param defaultStr  the default CharSequence to return
     *  if the input is empty ("") or {@code null}, may be null
     * @return the passed in CharSequence, or the default
     * @see StringUtils#defaultString(String, String)
     */
    public static <T extends CharSequence> T defaultIfEmpty(final T str, final T defaultStr) {
        return isEmpty(str) ? defaultStr : str;
    }

As you can see, the Javadoc lists the return values for common arguments, which makes the method particularly easy to use.

So we make the following changes:

import com.google.common.base.Preconditions;
import org.apache.commons.lang3.StringUtils;

public class StringUtil {

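    /**
     * Replaces the part beyond maxSize with an ellipsis.
     * <p>
     * Examples:
     * 1. A string within the limit is returned unchanged:
     * StringUtil.abbreviate("123456789", 11) = "123456789"
     * <p>
     * 2. A string over the limit is truncated and an ellipsis is appended:
     * StringUtil.abbreviate("123456789", 3) = "123..."
     * <p>
     * 3. If the cut would split an emoji, the whole emoji is dropped:
     * StringUtil.abbreviate("123456789😀", 10) = "123456789..."
     *
     * @param originStr the original string
     * @param maxSize   the maximum length
     */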
    public static String abbreviate(String originStr, int maxSize) {

        return abbreviate(originStr, maxSize, null);
    }

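    /**
     * Replaces the part beyond maxSize with an abbreviation marker.
     * <p>
     * Example:
     * <p>
     * StringUtil.abbreviate("123456789", 3, "***") = "123***"
     *
     * @param originStr    the original string
     * @param maxSize      the maximum length
     * @param abbrevMarker the abbreviation marker; "..." is used when null or empty
     */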
    public static String abbreviate(String originStr, int maxSize, String abbrevMarker) {

        Preconditions.checkArgument(maxSize > 0, "maxSize must be greater than 0");

        if (StringUtils.isEmpty(originStr)) {
            return StringUtils.EMPTY;
        }

        String defaultAbbrevMarker = "...";

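        // Nothing to omit when the string already fits (a length equal to maxSize included)
        if (originStr.length() <= maxSize) {
            return originStr;
        }

        // Take the first maxSize chars
        String head = originStr.substring(0, maxSize);

        // If the last char is a high surrogate, the cut split an emoji: remove that char
        char lastChar = head.charAt(head.length() - 1);
        if (Character.isHighSurrogate(lastChar)) {
            head = head.substring(0, head.length() - 1);
        }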


        return head + StringUtils.defaultIfEmpty(abbrevMarker, defaultAbbrevMarker);
    }
}

As good programmers, when we write utility methods we can list common inputs and outputs in the comments, which makes them easier for callers to use.
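Documented examples are only helpful while they stay true, so it can be worth running them as checks. The sketch below is one possible way to do that; the class DocumentedExamplesCheck and its check helper are hypothetical, and abbreviate is a JDK-only restatement of the utility (assuming that a string whose length equals maxSize counts as fitting) so that the example runs without Guava or commons-lang3.

```java
final class DocumentedExamplesCheck {

    /** JDK-only restatement of the abbreviate utility, for illustration only. */
    static String abbreviate(String originStr, int maxSize, String abbrevMarker) {
        if (maxSize <= 0) {
            throw new IllegalArgumentException("maxSize must be greater than 0");
        }
        if (originStr == null || originStr.isEmpty()) {
            return "";
        }
        if (originStr.length() <= maxSize) {
            return originStr;
        }
        String head = originStr.substring(0, maxSize);
        // Drop a trailing high surrogate so no half emoji is left behind.
        if (Character.isHighSurrogate(head.charAt(head.length() - 1))) {
            head = head.substring(0, head.length() - 1);
        }
        String marker = (abbrevMarker == null || abbrevMarker.isEmpty()) ? "..." : abbrevMarker;
        return head + marker;
    }

    /** Fails loudly when a documented example no longer matches the code. */
    static void check(String expected, String actual) {
        if (!expected.equals(actual)) {
            throw new AssertionError("expected " + expected + " but got " + actual);
        }
    }

    public static void main(String[] args) {
        // Each line is one example taken from the Javadoc or the unit tests.
        check("123456789", abbreviate("123456789", 11, null));
        check("123...", abbreviate("123456789", 3, null));
        check("123456789...", abbreviate("123456789\uD83D\uDE00", 10, null));
        check("123***", abbreviate("123456789", 3, "***"));
        System.out.println("all documented examples hold");
    }
}
```

In a real project the same lines would more naturally live in the existing JUnit test class, one assertion per documented example.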

 

III. Summary

A simple function like this is easy to implement but not so easy to write well.

You can add parameter validation, unit tests, comments, and handling for emoji.

 

Many beginners think most problems are simple, but whether you can write even a simple function rigorously is a question worth thinking about.

I also hope you absorb the best of the source code you read rather than taking it for granted: its comments, its design patterns, and the thinking behind how it is written are all valuable.

Attention to detail matters in programming. I hope you build good habits in everyday coding and strive to be a programmer with high standards.

 

 

 

 


Origin blog.csdn.net/w605283073/article/details/102734268