The transient Keyword and How ArrayList Implements Serialization

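The transient keyword

The transient keyword applies to instance fields and marks a field as not being part of the object's persistent state. Java's default serialization skips transient fields when it writes the byte stream, and on deserialization those fields are left at their default values (null, 0, or false).

Using the transient keyword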

The User class (implements Serializable)

import java.io.Serializable;

public class User implements Serializable {
    private int userId;
    private String userName;
    private transient String account;

    public User(int userId, String userName, String account) {
        this.userId = userId;
        this.userName = userName;
        this.account = account;
    }

    public int getUserId() {
        return userId;
    }

    public void setUserId(int userId) {
        this.userId = userId;
    }

    public String getUserName() {
        return userName;
    }

    public void setUserName(String userName) {
        this.userName = userName;
    }

    public String getAccount() {
        return account;
    }

    public void setAccount(String account) {
        this.account = account;
    }

    @Override
    public String toString() {
        return "User{" +
                "userId=" + userId +
                ", userName='" + userName + '\'' +
                ", account='" + account + '\'' +
                '}';
    }
}

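Test class

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

public class TransientTest {
    public static void main(String[] args) {
        String path = "/home/ac/workspace/tmp/user.txt";
        User user = new User(1, "lzp", "account");
        // The label "序列化之前" means "before serialization"
        System.out.println("序列化之前" + user);

        // try-with-resources closes both streams, even when an exception is thrown
        try (ObjectOutputStream oos = new ObjectOutputStream(new FileOutputStream(path))) {
            oos.writeObject(user);
        } catch (IOException e) {
            e.printStackTrace();
            return;
        }

        try (ObjectInputStream ois = new ObjectInputStream(new FileInputStream(path))) {
            User readUser = (User) ois.readObject();
            // The label "序列化之后" means "after serialization"
            System.out.println("序列化之后" + readUser);
        } catch (IOException | ClassNotFoundException e) {
            // FileNotFoundException is a subclass of IOException
            e.printStackTrace();
        }
    }
}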

Output (the labels 序列化之前 and 序列化之后 mean "before serialization" and "after serialization")

序列化之前User{userId=1, userName='lzp', account='account'}
序列化之后User{userId=1, userName='lzp', account='null'}

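Why use the transient keyword

Typical reasons for marking a field transient:

Transient or derived state: the value changes over time or is computed from other fields, so a stored copy would be stale.
Sensitive data: the field holds security-related information that should not leave the JVM in any form (for example into a database or a byte stream).
Performance: either serializing the field is pointless (a logger reference, for example), or serializing it would waste too many resources.
Non-serializable field types: common types such as Integer and String implement Serializable, but a class that implements Serializable can only be serialized if every non-transient, non-static field it holds is itself serializable. A non-null field of a non-serializable type makes writeObject throw NotSerializableException at runtime, so such a field must be marked transient.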
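
The last point can be checked with a small in-memory experiment. The sketch below is only an illustration: Connection, BrokenSession, SafeSession and canSerialize are hypothetical names invented for this example, and the stream writes to a throwaway byte buffer.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class NonSerializableFieldDemo {
    // Hypothetical helper type that deliberately does NOT implement Serializable
    static class Connection {
        final String url;
        Connection(String url) { this.url = url; }
    }

    // "conn" is a non-transient field of a non-serializable type
    static class BrokenSession implements Serializable {
        private static final long serialVersionUID = 1L;
        String name = "s1";
        Connection conn = new Connection("db://example");
    }

    // Same layout, but "conn" is transient and therefore skipped
    static class SafeSession implements Serializable {
        private static final long serialVersionUID = 1L;
        String name = "s1";
        transient Connection conn = new Connection("db://example");
    }

    // Returns true if default serialization can write the object
    static boolean canSerialize(Object o) {
        try (ObjectOutputStream oos = new ObjectOutputStream(new ByteArrayOutputStream())) {
            oos.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(canSerialize(new BrokenSession())); // false
        System.out.println(canSerialize(new SafeSession()));   // true
    }
}
```

Note that the failure only shows up at runtime, when writeObject reaches the offending field. The compiler accepts both classes.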

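Custom serialization of a transient field

A field that default serialization skips can still be serialized by hand.
To do so, add the following two methods, writeObject and readObject, to the User class.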

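// Requires java.io.IOException, java.io.ObjectInputStream and
// java.io.ObjectOutputStream to be imported in User.

/**
 * Custom serialization hook, invoked by ObjectOutputStream.
 *
 * @param oos the stream the object is written to
 * @throws IOException if writing fails
 */
private void writeObject(ObjectOutputStream oos) throws IOException {
    // Write the fields that default serialization handles
    oos.defaultWriteObject();
    // Write the transient field by hand
    oos.writeObject(account);
}

/**
 * Custom deserialization hook, invoked by ObjectInputStream.
 *
 * @param ois the stream the object is read from
 * @throws IOException if reading fails
 * @throws ClassNotFoundException if the class of a stored object cannot be found
 */
private void readObject(ObjectInputStream ois) throws IOException, ClassNotFoundException {
    // Read the fields that default deserialization handles
    ois.defaultReadObject();
    // Read the transient field by hand, in the same order it was written
    this.account = (String) ois.readObject();
}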

Output

序列化之前User{userId=1, userName='lzp', account='account'}
序列化之后User{userId=1, userName='lzp', account='account'}

So even though account is declared transient, it has now been serialized and deserialized.
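
Both behaviours can also be reproduced without touching the file system. The following is a minimal sketch, not the article's original test: PlainUser, CustomUser and roundTrip are hypothetical names, and the object is serialized to a byte array instead of a file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class CustomTransientDemo {
    // Relies on default serialization only: "account" is lost
    static class PlainUser implements Serializable {
        private static final long serialVersionUID = 1L;
        String userName;
        transient String account;
        PlainUser(String userName, String account) {
            this.userName = userName;
            this.account = account;
        }
    }

    // Adds the two private hooks: "account" survives the round trip
    static class CustomUser implements Serializable {
        private static final long serialVersionUID = 1L;
        String userName;
        transient String account;
        CustomUser(String userName, String account) {
            this.userName = userName;
            this.account = account;
        }
        private void writeObject(ObjectOutputStream oos) throws IOException {
            oos.defaultWriteObject();   // writes userName
            oos.writeObject(account);   // writes the transient field by hand
        }
        private void readObject(ObjectInputStream ois) throws IOException, ClassNotFoundException {
            ois.defaultReadObject();                  // restores userName
            this.account = (String) ois.readObject(); // restores account
        }
    }

    // Serializes an object to memory and reads it back
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(T obj) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
                oos.writeObject(obj);
            }
            try (ObjectInputStream ois =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) ois.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(new PlainUser("lzp", "account")).account);  // null
        System.out.println(roundTrip(new CustomUser("lzp", "account")).account); // account
    }
}
```

The order of reads in readObject has to mirror the order of writes in writeObject, because the extra data is stored positionally rather than by field name.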

ArrayList serialization and deserialization

Why the backing array is marked transient

The ArrayList source shows that the list is backed by an Object array, and that this array is declared transient.

transient Object[] elementData;

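Because the field is transient, default serialization does not write the array. This is a deliberate performance choice: the array's length is the list's capacity, which is usually larger than the number of stored elements, so serializing the whole array would also write the unused slots and waste space.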
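
The saving can be made visible by comparing serialized sizes. This is a rough sketch under stated assumptions: serializedSize, listWithCapacity and rawArray are hypothetical helpers, and the exact byte counts depend on the JDK, so only the relationships between them are compared.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class ArrayListFootprintDemo {
    // Number of bytes the object occupies in serialized form
    static int serializedSize(Object o) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            oos.writeObject(o);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    // An ArrayList with three elements and the requested initial capacity
    static List<String> listWithCapacity(int capacity) {
        List<String> list = new ArrayList<>(capacity);
        list.add("a");
        list.add("b");
        list.add("c");
        return list;
    }

    // A plain array of the given length with the same three elements
    static Object[] rawArray(int length) {
        Object[] raw = new Object[length];
        raw[0] = "a";
        raw[1] = "b";
        raw[2] = "c";
        return raw;
    }

    public static void main(String[] args) {
        int big = serializedSize(listWithCapacity(1000));
        int small = serializedSize(listWithCapacity(3));
        int raw = serializedSize(rawArray(1000));
        // Capacity does not influence the serialized form of the list
        System.out.println("same size for both capacities: " + (big == small));
        // A raw 1000-slot array also stores a marker for every empty slot
        System.out.println("list is smaller than raw array: " + (big < raw));
    }
}
```

A plain Object[1000] writes a marker for each of its empty slots, while the list writes only the three elements it actually holds.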

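Analysis of the ArrayList serialization source

Although default serialization skips the array, ArrayList defines its own writeObject and readObject methods to serialize and deserialize the list elements.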

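writeObject method

// Save the state of the ArrayList instance to a stream, i.e. serialize it
private void writeObject(java.io.ObjectOutputStream s)
    throws java.io.IOException{
    // Remember modCount so that a structural modification during serialization can be detected
    int expectedModCount = modCount;
    // Default serialization: writes the non-transient fields, i.e. size
    // (modCount is a transient field of AbstractList and is not written)
    s.defaultWriteObject();

    // Write size again in the slot that used to hold the capacity,
    // for behavioural compatibility with clone()
    s.writeInt(size);

    // Write all elements in index order
    for (int i=0; i<size; i++) {
        s.writeObject(elementData[i]);
    }
    // fail-fast check
    if (modCount != expectedModCount) {
        throw new ConcurrentModificationException();
    }
}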

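Two things stand out here:

size is already written by default serialization, yet the source writes it a second time explicitly: first the element count, then each element in turn.
Serialization is fail-fast. After the elements have been written, the method compares modCount with the saved value and throws ConcurrentModificationException if the list was structurally modified in the meantime.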
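
The fail-fast check can be triggered deterministically, without threads, by an element that modifies its own list while it is being written. The sketch below assumes the modCount check shown in the JDK 8 source above; Meddler and throwsCme are hypothetical names made up for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.List;

class FailFastSerializationDemo {
    // Hypothetical element that structurally modifies its owning list
    // while the list is in the middle of serializing it
    static class Meddler implements Serializable {
        private static final long serialVersionUID = 1L;
        private transient List<Object> owner;
        Meddler(List<Object> owner) { this.owner = owner; }

        private void writeObject(ObjectOutputStream out) throws IOException {
            out.defaultWriteObject();
            owner.add("added during serialization"); // increments the list's modCount
        }
    }

    // Returns true if serializing the list throws ConcurrentModificationException
    static boolean throwsCme(List<Object> list) {
        try (ObjectOutputStream oos = new ObjectOutputStream(new ByteArrayOutputStream())) {
            oos.writeObject(list);
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        List<Object> list = new ArrayList<>();
        list.add(new Meddler(list));
        System.out.println(throwsCme(list)); // true

        List<Object> plain = new ArrayList<>();
        plain.add("x");
        System.out.println(throwsCme(plain)); // false
    }
}
```

Because the check runs only after the loop, the exception is raised once all elements have been written, not at the moment of the modification.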

readObject method

// Rebuild the ArrayList instance from a stream, i.e. deserialize it
private void readObject(java.io.ObjectInputStream s)
    throws java.io.IOException, ClassNotFoundException {
    // Start from the shared empty array
    elementData = EMPTY_ELEMENTDATA;

    // Read the default-serialized fields, including size
    s.defaultReadObject();

    // Read the value written in the capacity slot; JDK 1.8 does not use it
    s.readInt(); // ignored

    if (size > 0) {
        // Like clone(), allocate the array based on size, not on capacity
        int capacity = calculateCapacity(elementData, size);
        SharedSecrets.getJavaOISAccess().checkArray(s, Object[].class, capacity);
        ensureCapacityInternal(size);

        // elementData now refers to the newly allocated, still empty array
        Object[] a = elementData;
        // Read all elements in index order
        for (int i=0; i<size; i++) {
            a[i] = s.readObject();
        }
    }
}
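
The same pattern can be applied to any array-backed container. The following is a simplified sketch, not the JDK implementation: CompactList and roundTrip are hypothetical, and the class omits the modCount check and the extra compatibility int that ArrayList writes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Arrays;

class CompactListDemo {
    // Hypothetical container that follows the ArrayList pattern
    static class CompactList implements Serializable {
        private static final long serialVersionUID = 1L;
        private transient Object[] elements = new Object[16]; // skipped by default serialization
        private int size;                                     // written by default serialization

        void add(Object o) {
            if (size == elements.length) {
                elements = Arrays.copyOf(elements, size * 2);
            }
            elements[size++] = o;
        }

        Object get(int i) {
            if (i < 0 || i >= size) throw new IndexOutOfBoundsException("index " + i);
            return elements[i];
        }

        int size() { return size; }
        int capacity() { return elements.length; }

        private void writeObject(ObjectOutputStream out) throws IOException {
            out.defaultWriteObject();               // writes size
            for (int i = 0; i < size; i++) {
                out.writeObject(elements[i]);       // only the used slots
            }
        }

        private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
            in.defaultReadObject();                 // restores size
            if (size < 0) throw new InvalidObjectException("negative size");
            // The field initializer does not run during deserialization,
            // so the array has to be allocated here, based on size
            elements = new Object[Math.max(size, 1)];
            for (int i = 0; i < size; i++) {
                elements[i] = in.readObject();
            }
        }
    }

    static CompactList roundTrip(CompactList list) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
                oos.writeObject(list);
            }
            try (ObjectInputStream ois =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (CompactList) ois.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        CompactList list = new CompactList();
        list.add("a");
        list.add("b");
        list.add("c");
        CompactList copy = roundTrip(list);
        System.out.println(list.capacity()); // 16
        System.out.println(copy.capacity()); // 3
        System.out.println(copy.get(1));     // b
    }
}
```

As with ArrayList, the copy is rebuilt with a backing array sized to the element count rather than to the original capacity.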

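Why size is serialized twice

[Analysis] s.defaultWriteObject() serializes everything in ArrayList that is not transient, which includes size. s.writeInt(size) then writes size a second time, after which the elements actually stored in the array are written one by one. For reading in JDK 1.8 this second value has no effect, as the "ignored" comment in readObject indicates: readObject consumes it with s.readInt() and discards it, and the size used to rebuild the list is the one restored by defaultReadObject().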

defaultReadObject() and defaultWriteObject() should be the first method call inside 
readObject(ObjectInputStream o) and writeObject(ObjectOutputStream o). 
It reads and writes all the non transient fields of the class respectively. 
These methods also helps in backward and future compatibility. 
If in future you add some non-transient field to the class and you are trying to deserialize it 
by the older version of class then the defaultReadObject() method will neglect the newly added field, 
similarly if you deserialize the old serialized object by the new version 
then the new non transient field will take default value from JVM i.e. 
if its object then null else if primitive then boolean to false, int to 0
[In short]
defaultWriteObject() and defaultReadObject() should be the first call inside writeObject and readObject.
They write and read all non-transient fields of the class, and they keep different versions of a class compatible.
If a non-transient field is added later, an older version of the class ignores it when deserializing,
and a newer version that reads an old stream leaves the new field at its default value:
null for object references, false for boolean, 0 for int.

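The real reason is compatibility. Earlier JDK versions of ArrayList serialized elementData.length explicitly, while size was still written by default serialization.
Newer JDK versions optimized the implementation and no longer serialize the array length. But if the s.writeInt(size) call were simply deleted, objects serialized by a newer JDK could no longer be deserialized correctly by an older JDK, which still expects that int in the stream.

The write looks pointless, but it is what keeps the serialized form compatible across versions.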

大唐雨夜
