Java's String class in detail

Apart from the primitive types, String is the most frequently used type in Java, and in practice it is often used even more than the primitives. Because it is so central, the JDK applies many optimizations to it.

Class definition

public final class String
    implements java.io.Serializable, Comparable<String>, CharSequence{
   /** The value is used for character storage. */
    private final char value[];

    /** Cache the hash code for the string */
    private int hash; // Default to 0

    /** use serialVersionUID from JDK 1.0.2 for interoperability */
    private static final long serialVersionUID = -6849794470754667710L;

    /**
     * Class String is special cased within the Serialization Stream Protocol.
     *
     * A String instance is written into an ObjectOutputStream according to
     * <a href="{@docRoot}/../platform/serialization/spec/output.html">
     * Object Serialization Specification, Section 6.2, "Stream Elements"</a>
     */
    private static final ObjectStreamField[] serialPersistentFields =
        new ObjectStreamField[0];

    /**
     * Initializes a newly created {@code String} object so that it represents
     * an empty character sequence.  Note that use of this constructor is
     * unnecessary since Strings are immutable.
     */
    public String() {
        this.value = "".value;
    }

    /**
     * Initializes a newly created {@code String} object so that it represents
     * the same sequence of characters as the argument; in other words, the
     * newly created string is a copy of the argument string. Unless an
     * explicit copy of {@code original} is needed, use of this constructor is
     * unnecessary since Strings are immutable.
     *
     * @param  original
     *         A {@code String}
     */
    public String(String original) {
        this.value = original.value;
        this.hash = original.hash;
    }
  • The class is declared final, so it cannot be subclassed and its methods cannot be overridden. Many important JDK classes are final for the same reason: it prevents application code from inheriting and overriding behavior in ways that would compromise the safety of the JDK.

  • It implements the Serializable interface, so String instances can be serialized safely.

  • It implements the Comparable interface, so strings can be sorted by their natural order.

  • It implements CharSequence, the central interface for string-like types.

  • The characters are stored in the char array value, which is declared final.

  • The int field hash caches the hashCode of the string, so the hash value is not recomputed on every call.
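
As a rough illustration of what these declarations mean for calling code, the sketch below passes a String where a CharSequence is expected and sorts strings by their natural order. The class name StringContractDemo and the sample words are made up for this example.

```java
import java.util.Arrays;

class StringContractDemo {
    // Any String can be passed where a CharSequence is expected.
    static int countUpperCase(CharSequence text) {
        int count = 0;
        for (int i = 0; i < text.length(); i++) {
            if (Character.isUpperCase(text.charAt(i))) {
                count++;
            }
        }
        return count;
    }

    // Comparable<String> gives Arrays.sort a natural order to work with.
    static String[] sortedCopy(String... words) {
        String[] copy = words.clone();
        Arrays.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        System.out.println(countUpperCase("Hello World"));
        System.out.println(Arrays.toString(sortedCopy("pear", "Apple", "fig")));
    }
}
```

Upper-case letters sort before lower-case ones here, because the natural order compares char values.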

The compareTo method (implementation of the Comparable interface)

public int compareTo(String anotherString) {
    int len1 = value.length;
    int len2 = anotherString.value.length;
    int lim = Math.min(len1, len2); 
    char v1[] = value;
    char v2[] = anotherString.value;

    int k = 0;
    while (k < lim) {  // the loop only runs up to the length of the shorter string
        char c1 = v1[k];
        char c2 = v2[k];
        if (c1 != c2) {
            return c1 - c2;
        }
        k++;
    }
    return len1 - len2;  // if the common prefix is identical, the longer string is greater
}
  • Characters are compared one by one from left to right by their char value. As the code shows, "S" > "ASSSSSSSSSSSSSSS", because 'S' is greater than 'A'.

  • The loop only runs up to the length of the shorter string.

  • If the shorter string is a prefix of the longer one, the longer string is the greater one.
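
The three rules above can be checked with a few calls. This is only a small sketch; the class name CompareToDemo and the helper sign are hypothetical.

```java
class CompareToDemo {
    // Reduces a compareTo result to -1, 0 or 1 so it is easier to read.
    static int sign(String a, String b) {
        return Integer.signum(a.compareTo(b));
    }

    public static void main(String[] args) {
        // The first differing char decides: 'S' (83) - 'A' (65) = 18.
        System.out.println("S".compareTo("ASSSSSSSSSSSSSSS"));
        // The common prefix is identical, so the length difference is returned.
        System.out.println("abc".compareTo("abcde"));
        System.out.println(sign("abc", "abc"));
    }
}
```

Note that the result is a difference, not just -1, 0 or 1, so callers should only rely on its sign.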

Constructors

/**
 * Initializes a newly created {@code String} object so that it represents
 * an empty character sequence.  Note that use of this constructor is
 * unnecessary since Strings are immutable.
 */
public String() {
    this.value = "".value;
}

/**
 * Initializes a newly created {@code String} object so that it represents
 * the same sequence of characters as the argument; in other words, the
 * newly created string is a copy of the argument string. Unless an
 * explicit copy of {@code original} is needed, use of this constructor is
 * unnecessary since Strings are immutable.
 *
 * @param  original
 *         A {@code String}
 */
public String(String original) {
    this.value = original.value;
    this.hash = original.hash;
}
/**
 * Constructs a new String by decoding the given range of bytes
 * with the given charset.
 */
public String(byte bytes[], int offset, int length, Charset charset) {
    if (charset == null)
        throw new NullPointerException("charset");
    checkBounds(bytes, offset, length);
    this.value = StringCoding.decode(charset, bytes, offset, length);
}
  • The no-argument constructor simply produces the empty string "".

  • The copy constructor just takes over the value array and the hash of the other string. There is no need to worry about the two strings interfering with each other: no String method modifies either field, and value is declared private final, so it cannot be reassigned.

  • The no-argument constructor does not set hash, so the field keeps its default value (see the // Default to 0 comment on the field).

  • The byte-array constructor delegates the translation of bytes into characters to StringCoding.decode(charset, bytes, offset, length).

    The StringCoding class has default (package-private) access, and so do its static methods, so unfortunately application code cannot call them directly.
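
A short sketch of these constructors seen from the outside. ConstructorDemo and decodeSlice are names invented for this example; only the String constructors themselves come from the JDK.

```java
import java.nio.charset.StandardCharsets;

class ConstructorDemo {
    // Decodes a slice of a byte array with the four-argument constructor.
    static String decodeSlice(byte[] bytes, int offset, int length) {
        return new String(bytes, offset, length, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The accented e takes two bytes in UTF-8, so "h?llo" is 6 bytes long.
        byte[] raw = "h\u00e9llo world".getBytes(StandardCharsets.UTF_8);
        System.out.println(decodeSlice(raw, 0, 6));
        System.out.println(decodeSlice(raw, 7, 5));

        String original = "abc";
        String copy = new String(original);
        System.out.println(copy.equals(original)); // same characters
        System.out.println(copy == original);      // but a distinct object
        System.out.println(new String().isEmpty());
    }
}
```

Offsets and lengths are counted in bytes, not characters, which matters as soon as the input contains multi-byte sequences.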

The StringCoding.decode method

static char[] decode(Charset cs, byte[] ba, int off, int len) {
    // (1)We never cache the "external" cs, the only benefit of creating
    // an additional StringDe/Encoder object to wrap it is to share the
    // de/encode() method. These SD/E objects are short-lifed, the young-gen
    // gc should be able to take care of them well. But the best approash
    // is still not to generate them if not really necessary.
    // (2)The defensive copy of the input byte/char[] has a big performance
    // impact, as well as the outgoing result byte/char[]. Need to do the
    // optimization check of (sm==null && classLoader0==null) for both.
    // (3)getClass().getClassLoader0() is expensive
    // (4)There might be a timing gap in isTrusted setting. getClassLoader0()
    // is only chcked (and then isTrusted gets set) when (SM==null). It is
    // possible that the SM==null for now but then SM is NOT null later
    // when safeTrim() is invoked...the "safe" way to do is to redundant
    // check (... && (isTrusted || SM == null || getClassLoader0())) in trim
    // but it then can be argued that the SM is null when the opertaion
    // is started...
    CharsetDecoder cd = cs.newDecoder();
    int en = scale(len, cd.maxCharsPerByte());
    char[] ca = new char[en];
    if (len == 0)
        return ca;
    boolean isTrusted = false;
    if (System.getSecurityManager() != null) {
        if (!(isTrusted = (cs.getClass().getClassLoader0() == null))) {
            ba =  Arrays.copyOfRange(ba, off, off + len);
            off = 0;
        }
    }
    cd.onMalformedInput(CodingErrorAction.REPLACE)
      .onUnmappableCharacter(CodingErrorAction.REPLACE)
      .reset();
    if (cd instanceof ArrayDecoder) {
        int clen = ((ArrayDecoder)cd).decode(ba, off, len, ca);
        return safeTrim(ca, clen, cs, isTrusted);
    } else {
        ByteBuffer bb = ByteBuffer.wrap(ba, off, len);
        CharBuffer cb = CharBuffer.wrap(ca);
        try {
            CoderResult cr = cd.decode(bb, cb, true);
            if (!cr.isUnderflow())
                cr.throwException();
            cr = cd.flush(cb);
            if (!cr.isUnderflow())
                cr.throwException();
        } catch (CharacterCodingException x) {
            // Substitution is always enabled,
            // so this shouldn't happen
            throw new Error(x);
        }
        return safeTrim(ca, cb.position(), cs, isTrusted);
    }
}
  • The actual conversion from byte[] to char[] is done by the abstract class CharsetDecoder. The decoder object is created by the Charset that you pass in.

    As an example, look at the CharsetDecoder implementation for UTF-8.

    It is a private static nested class of the UTF-8 charset class. It extends CharsetDecoder and implements the ArrayDecoder interface. The decode method is very long and consists of the rules for turning bytes into characters; understanding it requires some knowledge of the encoding. The walkthrough stops here.

    private static class Decoder extends CharsetDecoder implements ArrayDecoder {
        private Decoder(Charset var1) {
            super(var1, 1.0F, 1.0F);
        }
         // unrelated methods omitted .......
          /**
          * The method that actually converts bytes to characters
          */
          public int decode(byte[] var1, int var2, int var3, char[] var4) {
                int var5 = var2 + var3;
                int var6 = 0;
                int var7 = Math.min(var3, var4.length);
    
                ByteBuffer var8;
                for(var8 = null; var6 < var7 && var1[var2] >= 0; var4[var6++] = (char)var1[var2++]) {
                }
    
                while(true) {
                    while(true) {
                        while(var2 < var5) {
                            byte var9 = var1[var2++];
                            if (var9 < 0) {
                                byte var10;
                                if (var9 >> 5 != -2 || (var9 & 30) == 0) {
                                    byte var11;
                                    if (var9 >> 4 == -2) {
                                        if (var2 + 1 < var5) {
                                            var10 = var1[var2++];
                                            var11 = var1[var2++];
                                            if (isMalformed3(var9, var10, var11)) {
                                                if (this.malformedInputAction() != CodingErrorAction.REPLACE) {
                                                    return -1;
                                                }
    
                                                var4[var6++] = this.replacement().charAt(0);
                                                var2 -= 3;
                                                var8 = getByteBuffer(var8, var1, var2);
                                                var2 += malformedN(var8, 3).length();
                                            } else {
                                                char var15 = (char)(var9 << 12 ^ var10 << 6 ^ var11 ^ -123008);
                                                if (Character.isSurrogate(var15)) {
                                                    if (this.malformedInputAction() != CodingErrorAction.REPLACE) {
                                                        return -1;
                                                    }
    
                                                    var4[var6++] = this.replacement().charAt(0);
                                                } else {
                                                    var4[var6++] = var15;
                                                }
                                            }
                                        } else {
                                            if (this.malformedInputAction() != CodingErrorAction.REPLACE) {
                                                return -1;
                                            }
    
                                            if (var2 >= var5 || !isMalformed3_2(var9, var1[var2])) {
                                                var4[var6++] = this.replacement().charAt(0);
                                                return var6;
                                            }
    
                                            var4[var6++] = this.replacement().charAt(0);
                                        }
                                    } else if (var9 >> 3 != -2) {
                                        if (this.malformedInputAction() != CodingErrorAction.REPLACE) {
                                            return -1;
                                        }
    
                                        var4[var6++] = this.replacement().charAt(0);
                                    } else if (var2 + 2 < var5) {
                                        var10 = var1[var2++];
                                        var11 = var1[var2++];
                                        byte var12 = var1[var2++];
                                        int var13 = var9 << 18 ^ var10 << 12 ^ var11 << 6 ^ var12 ^ 3678080;
                                        if (!isMalformed4(var10, var11, var12) && Character.isSupplementaryCodePoint(var13)) {
                                            var4[var6++] = Character.highSurrogate(var13);
                                            var4[var6++] = Character.lowSurrogate(var13);
                                        } else {
                                            if (this.malformedInputAction() != CodingErrorAction.REPLACE) {
                                                return -1;
                                            }
    
                                            var4[var6++] = this.replacement().charAt(0);
                                            var2 -= 4;
                                            var8 = getByteBuffer(var8, var1, var2);
                                            var2 += malformedN(var8, 4).length();
                                        }
                                    } else {
                                        if (this.malformedInputAction() != CodingErrorAction.REPLACE) {
                                            return -1;
                                        }
    
                                        int var14 = var9 & 255;
                                        if (var14 <= 244 && (var2 >= var5 || !isMalformed4_2(var14, var1[var2] & 255))) {
                                            ++var2;
                                            if (var2 >= var5 || !isMalformed4_3(var1[var2])) {
                                                var4[var6++] = this.replacement().charAt(0);
                                                return var6;
                                            }
    
                                            var4[var6++] = this.replacement().charAt(0);
                                        } else {
                                            var4[var6++] = this.replacement().charAt(0);
                                        }
                                    }
                                } else {
                                    if (var2 >= var5) {
                                        if (this.malformedInputAction() != CodingErrorAction.REPLACE) {
                                            return -1;
                                        }
    
                                        var4[var6++] = this.replacement().charAt(0);
                                        return var6;
                                    }
    
                                    var10 = var1[var2++];
                                    if (isNotContinuation(var10)) {
                                        if (this.malformedInputAction() != CodingErrorAction.REPLACE) {
                                            return -1;
                                        }
    
                                        var4[var6++] = this.replacement().charAt(0);
                                        --var2;
                                    } else {
                                        var4[var6++] = (char)(var9 << 6 ^ var10 ^ 3968);
                                    }
                                }
                            } else {
                                var4[var6++] = (char)var9;
                            }
                        }
    
                        return var6;
                    }
                }
            }

Conclusion: converting bytes into a string goes through the decode method of the StringCoding utility class. That method relies on the decoder supplied by the Charset that was passed in to do the real work of turning bytes into Java chars. Each concrete encoding is defined by its own implementation of that abstraction, and String only programs against the interface, which also makes it easy to add further encodings.

String's getBytes method works the same way: the main work is handed over to the encoder of the specific Charset.
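
StringCoding itself is not accessible, but the same idea can be sketched with the public CharsetDecoder API. The example below is an approximation under that assumption, not the JDK's internal code path; DecodeDemo is a made-up class name. It sets the same REPLACE actions that StringCoding.decode sets, so malformed input becomes a replacement character instead of an exception.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class DecodeDemo {
    // Decodes bytes with a decoder obtained from the charset,
    // replacing malformed or unmappable input instead of failing.
    static String decode(Charset cs, byte[] bytes) {
        CharsetDecoder decoder = cs.newDecoder()
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE);
        try {
            CharBuffer chars = decoder.decode(ByteBuffer.wrap(bytes));
            return chars.toString();
        } catch (CharacterCodingException e) {
            // Not expected, because replacement is enabled for both error kinds.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] valid = "abc".getBytes(StandardCharsets.UTF_8);
        byte[] broken = {'a', (byte) 0xFF, 'b'}; // 0xFF never occurs in valid UTF-8
        System.out.println(decode(StandardCharsets.UTF_8, valid));
        System.out.println(decode(StandardCharsets.UTF_8, broken)
                .equals(new String(broken, StandardCharsets.UTF_8)));
    }
}
```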

hashCode method

String overrides this method; the result depends on the value of every character.

public int hashCode() {
    int h = hash;
    if (h == 0 && value.length > 0) {
        char val[] = value;
        for (int i = 0; i < value.length; i++) {
            h = 31 * h + val[i];   // why is the previous value multiplied by 31?
        }
        hash = h;
    }
    return h;
}
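
The loop evaluates the polynomial s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1] in int arithmetic. The sketch below recomputes it outside the class; HashDemo and polynomialHash are hypothetical names.

```java
class HashDemo {
    // Recomputes the polynomial hash with the same 31 * h + c step,
    // relying on int overflow just like String.hashCode does.
    static int polynomialHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        // (97 * 31 + 98) * 31 + 99 = 96354
        System.out.println(polynomialHash("abc"));
        System.out.println(polynomialHash("abc") == "abc".hashCode());
        System.out.println(polynomialHash(""));
    }
}
```

One commonly cited reason for the multiplier is that 31 is an odd prime and 31 * h can be computed as (h << 5) - h. Because 0 doubles as the "not yet computed" marker, a non-empty string whose hash happens to be 0 is recomputed on every call in this version of the code.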

Concatenation: the concat method and the static join method

concat method

public String concat(String str) {
    int otherLen = str.length();
    if (otherLen == 0) {
        return this;
    }
    int len = value.length;
    char buf[] = Arrays.copyOf(value, len + otherLen);
    str.getChars(buf, len);
    return new String(buf, true);
}
  • The characters are copied directly in memory into a new array, and a new String object is created from it. This is thread-safe, but performance is comparatively low.

  • Strings can also be concatenated directly with +.

    Reference: https://blog.csdn.net/youanyyou/article/details/78992978. According to that article, + is compiled into bytecode that uses StringBuilder to do the concatenation, while concat copies an array and creates a new object every time. The article's overall conclusion is that + is the better choice and performs better.
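
For comparison, here are the three ways of concatenating side by side. This is only a sketch with invented names (ConcatDemo and its helpers); it shows that the results are equal, not how fast each variant is.

```java
class ConcatDemo {
    static String viaConcat(String a, String b) {
        return a.concat(b);
    }

    static String viaPlus(String a, String b) {
        return a + b;
    }

    // An explicit builder avoids intermediate strings when joining many parts.
    static String viaBuilder(String... parts) {
        StringBuilder sb = new StringBuilder();
        for (String part : parts) {
            sb.append(part);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(viaConcat("foo", "bar"));
        System.out.println(viaPlus("foo", "bar"));
        System.out.println(viaBuilder("foo", "bar", "baz"));
        String s = "foo";
        // concat returns the receiver itself when the argument is empty.
        System.out.println(s.concat("") == s);
    }
}
```

One behavioral difference: concat throws a NullPointerException for a null argument, while + appends the text "null".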

join static method

public static String join(CharSequence delimiter, CharSequence... elements) {
    Objects.requireNonNull(delimiter);
    Objects.requireNonNull(elements);
    // Number of elements not likely worth Arrays.stream overhead.
    StringJoiner joiner = new StringJoiner(delimiter);
    for (CharSequence cs: elements) {
        joiner.add(cs);
    }
    return joiner.toString();
}

The actual work is done in the StringJoiner class

public final class StringJoiner {
    private final String prefix;
    private final String delimiter;
    private final String suffix;

    /*
     * StringBuilder value -- at any time, the characters constructed from the
     * prefix, the added element separated by the delimiter, but without the
     * suffix, so that we can more easily add elements without having to jigger
     * the suffix each time.
     */
    private StringBuilder value;
  
  /**
     * Adds a copy of the given {@code CharSequence} value as the next
     * element of the {@code StringJoiner} value. If {@code newElement} is
     * {@code null}, then {@code "null"} is added.
     *
     * @param  newElement The element to add
     * @return a reference to this {@code StringJoiner}
     */
    public StringJoiner add(CharSequence newElement) {
        prepareBuilder().append(newElement);
        return this;
    }

    private StringBuilder prepareBuilder() {
        if (value != null) {
            value.append(delimiter);
        } else {
            value = new StringBuilder().append(prefix);
        }
        return value;
    }

  • Internally it is still implemented with StringBuilder; join is purely a convenience method that is easy to use.
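
A brief usage sketch of join and of StringJoiner with a prefix and suffix. JoinDemo, csv and bracketed are made-up names for this example.

```java
import java.util.StringJoiner;

class JoinDemo {
    // String.join only takes a delimiter.
    static String csv(String... items) {
        return String.join(",", items);
    }

    // StringJoiner additionally supports a prefix and a suffix.
    static String bracketed(String... items) {
        StringJoiner joiner = new StringJoiner(", ", "[", "]");
        for (String item : items) {
            joiner.add(item);
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        System.out.println(csv("a", "b", "c"));
        System.out.println(bracketed("a", "b"));
        System.out.println(bracketed());
        // A null element is appended as the text "null".
        System.out.println(csv("x", null));
    }
}
```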

replace method

public String replace(char oldChar, char newChar)
  • Replaces by traversing the char array.
public String replace(CharSequence target, CharSequence replacement)
  • In this JDK version it is implemented with a regular expression that matches target literally; the regex source code is analyzed in the next article.
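
Whatever the implementation, replace always treats its target as literal text, while replaceAll interprets its first argument as a pattern. A minimal sketch of the difference (ReplaceDemo and its helpers are hypothetical names):

```java
class ReplaceDemo {
    // The target is matched literally, even if it looks like a regex.
    static String literal(String s, String target, String replacement) {
        return s.replace(target, replacement);
    }

    // The first argument is compiled as a regular expression.
    static String regex(String s, String pattern, String replacement) {
        return s.replaceAll(pattern, replacement);
    }

    public static void main(String[] args) {
        System.out.println("a.b.c".replace('.', '-'));
        System.out.println(literal("a.b.c", ".", "-"));
        // As a regex, "." matches every character.
        System.out.println(regex("a.b.c", ".", "-"));
        System.out.println(regex("a.b.c", "\\.", "-"));
    }
}
```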

The static format method builds a string from a format string and its arguments; it is mainly useful for internationalization of strings.

Internally it uses the Formatter class, and Formatter in turn uses regular expressions.
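
A small sketch of how the locale changes the output of format. FormatDemo and amount are names invented for this example.

```java
import java.util.Locale;

class FormatDemo {
    // Formats a number with grouping and two decimals for the given locale.
    static String amount(Locale locale, double value) {
        return String.format(locale, "%,.2f", value);
    }

    public static void main(String[] args) {
        System.out.println(amount(Locale.US, 1234567.891));
        System.out.println(amount(Locale.GERMANY, 1234567.891));
        System.out.println(String.format("%s has %d items", "cart", 3));
    }
}
```

When no Locale is passed, format uses the default format locale of the JVM, so output containing numbers or dates can differ between machines.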

toLowerCase method

public String toLowerCase(Locale locale)
  • Traverses the char array and converts each character to lowercase using Character.toLowerCase.
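
The Locale parameter matters because a few languages have their own case rules, which the method handles in addition to the per-character conversion. The sketch below uses the Turkish example from the String documentation; LowerCaseDemo and its helpers are hypothetical names.

```java
import java.util.Locale;

class LowerCaseDemo {
    // Locale-independent lowercasing, suitable for identifiers and keys.
    static String lowerRoot(String s) {
        return s.toLowerCase(Locale.ROOT);
    }

    // In Turkish, capital I lowercases to the dotless i (U+0131).
    static String lowerTurkish(String s) {
        return s.toLowerCase(Locale.forLanguageTag("tr"));
    }

    public static void main(String[] args) {
        System.out.println(lowerRoot("TITLE"));
        System.out.println(lowerTurkish("TITLE").equals(lowerRoot("TITLE")));
    }
}
```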

trim method

The method scans for whitespace from both ends of the string. A character counts as whitespace if char <= ' ' (a trick worth remembering). The positions of the first and last non-whitespace characters are then passed to substring to cut out the result.
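
The scan can be re-created in a few lines. This is a sketch of the approach described above, not the JDK source; TrimDemo and trimLikeString are made-up names.

```java
class TrimDemo {
    // Skips leading and trailing chars that are <= ' ', then cuts with substring.
    static String trimLikeString(String s) {
        int start = 0;
        int end = s.length();
        while (start < end && s.charAt(start) <= ' ') {
            start++;
        }
        while (start < end && s.charAt(end - 1) <= ' ') {
            end--;
        }
        return s.substring(start, end);
    }

    public static void main(String[] args) {
        System.out.println("[" + trimLikeString(" \t hi there\n ") + "]");
        // An em space (U+2003) is greater than ' ', so trim leaves it alone.
        System.out.println("\u2003x\u2003".trim().length());
    }
}
```

A consequence of the char <= ' ' test is that control characters below the space are stripped, while Unicode spaces above it are kept.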

substring method

Internally it calls the constructor public String(char value[], int offset, int count) to create the new string; the constructor allocates a new array and copies the requested range into it.
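
The relationship between substring and the offset/count constructor can be sketched from the outside like this. SubstringDemo and viaConstructor are hypothetical names, and going through toCharArray is only for illustration.

```java
class SubstringDemo {
    // Builds the same result with the public (char[], offset, count) constructor.
    static String viaConstructor(String s, int begin, int end) {
        char[] chars = s.toCharArray();
        return new String(chars, begin, end - begin);
    }

    public static void main(String[] args) {
        String text = "hello world";
        System.out.println(text.substring(0, 5));
        System.out.println(text.substring(6));
        System.out.println(viaConstructor(text, 0, 5).equals(text.substring(0, 5)));
    }
}
```

Note that substring takes a begin and an end index, while the constructor takes an offset and a count.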

valueOf method

public static String valueOf(Object obj) {
    return (obj == null) ? "null" : obj.toString();
}
// Uses the passed object's own toString method; if the object does not override toString, the default Object.toString is used.
public static String valueOf(char data[]) {
    return new String(data);
}
// Picks the matching constructor for the passed array to create the String object

public static String valueOf(boolean b) {
    return b ? "true" : "false";
}
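// Returns "true" or "false" depending on the passed boolean

The static copyValueOf method

public static String copyValueOf(char data[], int offset, int count) {
    return new String(data, offset, count);
}
// Static utility method; uses the matching constructor to cut out the range and create a new string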
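
A short sketch of these overloads in use. ValueOfDemo and its nested classes Plain and Named are invented for the example.

```java
class ValueOfDemo {
    // Does not override toString, so Object.toString is used.
    static class Plain { }

    // Overrides toString, so valueOf returns this text.
    static class Named {
        @Override
        public String toString() {
            return "named";
        }
    }

    // Taking Object here selects the valueOf(Object) overload, which is null-safe.
    static String describe(Object obj) {
        return String.valueOf(obj);
    }

    public static void main(String[] args) {
        char[] letters = {'a', 'b', 'c', 'd'};
        System.out.println(describe(null));
        System.out.println(describe(new Named()));
        System.out.println(describe(new Plain()));
        System.out.println(String.valueOf(letters));
        System.out.println(String.copyValueOf(letters, 1, 2));
        System.out.println(String.valueOf(true));
    }
}
```

The string built from the char array is a copy, so later changes to the array do not affect it.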

native intern method

This method involves the string constant pool and memory layout; it will be explained in detail in other articles.

public native String intern();
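
Until then, the observable effect of intern can be sketched with reference comparisons. InternDemo and sameObject are hypothetical names.

```java
class InternDemo {
    // Compares references, not contents.
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String literal = "abc";            // literals come from the string pool
        String built = new String("abc");  // new always creates a separate object
        System.out.println(sameObject(literal, built));
        System.out.println(sameObject(literal, built.intern()));
        System.out.println(literal.equals(built));
    }
}
```

intern returns the pooled instance with the same contents, which is why the second comparison succeeds while the first does not.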


Origin www.cnblogs.com/xiezc/p/11913818.html