Source Code Analysis of the getBytes() Method in Java's String Class

I. Background

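1. Today we look at the source of the getBytes() method in the String class. Its main job is to convert a String into a byte array. When that array is printed, each element is the ASCII code of the corresponding character, and the goal here is to find the exact step where those values are produced.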

II. Walking Through the Code

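1. First, define a string such as String string = "java"; and step into the source to follow what happens when string.getBytes() is called. The first stop is the method below, which simply returns the resulting byte array.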

public byte[] getBytes() {
   return StringCoding.encode(value, 0, value.length);
}

2. Step two

static byte[] encode(char[] ca, int off, int len) {
        String csn = Charset.defaultCharset().name();
        try {
            // use charset name encode() variant which provides caching.
            return encode(csn, ca, off, len);
        } catch (UnsupportedEncodingException x) {
            warnUnsupportedCharset(csn);
        }
        try {
            return encode("ISO-8859-1", ca, off, len);
        } catch (UnsupportedEncodingException x) {
            // If this code is hit during VM initialization, MessageUtils is
            // the only way we will be able to get any kind of error message.
            MessageUtils.err("ISO-8859-1 charset not available: "
                             + x.toString());
            // If we can not find ISO-8859-1 (a required encoding) then things
            // are seriously wrong with the installation.
            System.exit(1);
            return null;
        }
    }

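Analysis: 1. Charset.defaultCharset() returns the JVM default charset. It is determined when the JVM starts, from the file.encoding system property, which is in turn derived from the operating system's locale and charset settings.
2. If encoding with that charset name throws UnsupportedEncodingException, the method reports it through warnUnsupportedCharset(csn) and retries with ISO-8859-1. The encoding of the source file that contains the string plays no part here.
3. If even ISO-8859-1 is unavailable, the installation is treated as broken: an error message is printed and the JVM exits with System.exit(1). This method has no UTF-8 fallback. (A UTF-8 fallback does exist in JDK 7/8, but it lives inside Charset.defaultCharset() itself, for the case where file.encoding names an unsupported charset.)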
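To see which charset the no-argument getBytes() will pick up on a given machine, the same public API that the code above calls can be queried directly. This is only a small sketch: the class name DefaultCharsetDemo and the helper defaultCharsetName() are made up for illustration, and the printed values depend on the JVM and operating system.

```java
import java.nio.charset.Charset;

public class DefaultCharsetDemo {
    // Hypothetical helper: the name of the charset the JVM treats as its default.
    static String defaultCharsetName() {
        return Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        // The value differs between machines and JVM versions.
        System.out.println("default charset: " + defaultCharsetName());
        // The system property the default is derived from (it may be overridden with -Dfile.encoding=...).
        System.out.println("file.encoding: " + System.getProperty("file.encoding"));
        // ISO-8859-1 is one of the charsets every Java platform must support.
        System.out.println("ISO-8859-1 supported: " + Charset.isSupported("ISO-8859-1"));
    }
}
```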

3. Step three

static byte[] encode(String charsetName, char[] ca, int off, int len)
        throws UnsupportedEncodingException
    {
        StringEncoder se = deref(encoder);
        String csn = (charsetName == null) ? "ISO-8859-1" : charsetName;
        if ((se == null) || !(csn.equals(se.requestedCharsetName())
                              || csn.equals(se.charsetName()))) {
            se = null;
            try {
                Charset cs = lookupCharset(csn);
                if (cs != null)
                    se = new StringEncoder(cs, csn);
            } catch (IllegalCharsetNameException x) {}
            if (se == null)
                throw new UnsupportedEncodingException (csn);
            set(encoder, se);
        }
        return se.encode(ca, off, len);
    }
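The failure path in this method (no charset found for the name, so UnsupportedEncodingException is thrown) can be observed from outside through the public getBytes(String charsetName) overload. The sketch below assumes that "no-such-charset" is not registered on the running JVM. The class CharsetNameDemo and the helper canEncodeWith() are illustrative names, not part of the JDK.

```java
import java.io.UnsupportedEncodingException;

public class CharsetNameDemo {
    // Hypothetical helper: true if the JVM can encode with the given charset name.
    static boolean canEncodeWith(String charsetName) {
        try {
            "java".getBytes(charsetName);
            return true;
        } catch (UnsupportedEncodingException e) {
            // Thrown when no charset is found for the name.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(canEncodeWith("UTF-8"));
        // Assumption: no charset with this name is installed.
        System.out.println(canEncodeWith("no-such-charset"));
    }
}
```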

Analysis: this step ends by calling se.encode(ca, off, len), so we step into that method next.

byte[] encode(char[] ca, int off, int len) {
            int en = scale(len, ce.maxBytesPerChar());
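            // en is 12 for "java" (4 chars x 3 bytes max per char, assuming the default charset is UTF-8), so ba has length 12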
            byte[] ba = new byte[en];
            if (len == 0)
                return ba;
            if (ce instanceof ArrayEncoder) {
                // the ASCII values are produced here: 106, 97, 118, 97
                int blen = ((ArrayEncoder)ce).encode(ca, off, len, ba);
                // return the encoded bytes (the ASCII values), trimmed to the blen bytes actually written
                return safeTrim(ba, blen, cs, isTrusted);
            } else {
                ce.reset();
                ByteBuffer bb = ByteBuffer.wrap(ba);
                CharBuffer cb = CharBuffer.wrap(ca, off, len);
                try {
                    CoderResult cr = ce.encode(cb, bb, true);
                    if (!cr.isUnderflow())
                        cr.throwException();
                    cr = ce.flush(bb);
                    if (!cr.isUnderflow())
                        cr.throwException();
                } catch (CharacterCodingException x) {
                    // Substitution is always enabled,
                    // so this shouldn't happen
                    throw new Error(x);
                }
                return safeTrim(ba, bb.position(), cs, isTrusted);
            }
        }
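StringEncoder is internal to the JDK, but the else branch above can be approximated with the public java.nio.charset API. The following is a rough sketch, not the JDK's implementation: it hard-codes UTF-8, computes the worst-case buffer size inline instead of calling scale(), and uses Arrays.copyOf in place of safeTrim(). The class name EncodeSketch is made up for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class EncodeSketch {
    static byte[] encode(char[] ca, int off, int len) {
        // Assumption: UTF-8 stands in for whatever charset the StringEncoder holds.
        CharsetEncoder ce = StandardCharsets.UTF_8.newEncoder()
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE);
        // Worst-case size: every char needs the maximum number of bytes.
        int en = (int) (len * (double) ce.maxBytesPerChar());
        byte[] ba = new byte[en];
        if (len == 0)
            return ba;
        ByteBuffer bb = ByteBuffer.wrap(ba);
        CharBuffer cb = CharBuffer.wrap(ca, off, len);
        try {
            CoderResult cr = ce.encode(cb, bb, true);
            if (!cr.isUnderflow())
                cr.throwException();
            cr = ce.flush(bb);
            if (!cr.isUnderflow())
                cr.throwException();
        } catch (CharacterCodingException x) {
            // Replacement is enabled above, so this should not happen.
            throw new Error(x);
        }
        // Keep only the bytes that were actually written.
        return Arrays.copyOf(ba, bb.position());
    }

    public static void main(String[] args) {
        char[] ca = "java".toCharArray();
        System.out.println(Arrays.toString(encode(ca, 0, ca.length)));
    }
}
```

The buffer is oversized on purpose and trimmed afterwards, which avoids a second pass over the characters just to count the bytes needed.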

III. Printing the Result

This is how the string is converted into a byte array. For "java" the values are 106, 97, 118, and 97.
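A minimal way to reproduce these values is shown below. It passes StandardCharsets.UTF_8 explicitly so that the output does not depend on the machine's default charset. For pure ASCII text such as "java", the no-argument getBytes() should give the same bytes under any ASCII-compatible default. The class name PrintBytesDemo and the helper bytesOf() are illustrative.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class PrintBytesDemo {
    // Hypothetical helper: the encoded bytes of s, formatted as a readable list.
    static String bytesOf(String s) {
        return Arrays.toString(s.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        String string = "java";
        System.out.println(bytesOf(string)); // prints [106, 97, 118, 97]
    }
}
```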

IV. Conclusion

1. The above is my understanding of this method. If anything is missing or wrong, corrections are welcome.

mingxu.chen
