Android APK File Integrity Verification

APK file integrity as described in the official documentation

Integrity-protected contents
  To protect its contents, an APK consists of the following 4 sections:

  1. Contents of ZIP entries (from offset 0 until the start of the APK Signing Block)
  2. APK Signing Block
  3. ZIP Central Directory
  4. ZIP End of Central Directory

[Figure: the sections of a signed APK]
  APK Signature Scheme v2 protects the integrity of sections 1, 3, and 4, as well as the signed data blocks of the APK Signature Scheme v2 Block contained inside section 2.

  The integrity of sections 1, 3, and 4 is protected by one or more digests of their contents. These digests are stored in signed data blocks, which are in turn protected by one or more signatures.

  The digests of sections 1, 3, and 4 are computed as follows, similar to a two-level Merkle tree. Each section is split into consecutive 1 MB (2^20 bytes) chunks; the last chunk of each section may be shorter. The digest of each chunk is computed over the concatenation of the byte 0xa5, the chunk's length in bytes (uint32, little-endian), and the chunk's contents. The top-level digest is computed over the concatenation of the byte 0x5a, the number of chunks (uint32, little-endian), and the chunk digests concatenated in the order the chunks appear in the APK. The digest is computed in this chunked fashion so the computation can be sped up by parallel processing.
[Figure: APK digests]

  Protecting section 4 (ZIP End of Central Directory) is more involved because that section contains the offset of the ZIP Central Directory. The offset changes whenever the size of the APK Signing Block changes, for example when a new signature is added. Therefore, when computing the digest over the ZIP End of Central Directory, the field containing the Central Directory offset must be treated as though it contained the offset of the APK Signing Block.
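  To make the scheme concrete, here is a minimal, self-contained sketch (not the framework code, which is walked through below) that applies the rules above to a single in-memory section with SHA-256; the real implementation streams all three sections from disk and hashes every chunk of every section under one top-level digest.

    import java.nio.ByteBuffer;
    import java.nio.ByteOrder;
    import java.security.MessageDigest;
    import java.security.NoSuchAlgorithmException;
    import java.util.ArrayList;
    import java.util.List;

    public final class ChunkedDigestSketch {
        private static final int CHUNK_SIZE = 1024 * 1024; // 1 MB = 2^20 bytes

        // Per-chunk digests are prefixed with 0xa5 + chunk length; the top-level
        // digest is prefixed with 0x5a + chunk count.
        static byte[] chunkedSha256(byte[] section) throws NoSuchAlgorithmException {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            List<byte[]> chunkDigests = new ArrayList<>();
            for (int offset = 0; offset < section.length; offset += CHUNK_SIZE) {
                int chunkSize = Math.min(CHUNK_SIZE, section.length - offset);
                md.update((byte) 0xa5);                          // chunk prefix
                md.update(uint32LittleEndian(chunkSize));        // chunk length in bytes
                md.update(section, offset, chunkSize);           // chunk contents
                chunkDigests.add(md.digest());                   // digest() also resets md
            }
            md.update((byte) 0x5a);                              // top-level prefix
            md.update(uint32LittleEndian(chunkDigests.size()));  // number of chunks
            for (byte[] d : chunkDigests) {
                md.update(d);                                    // chunk digests, in order
            }
            return md.digest();
        }

        private static byte[] uint32LittleEndian(int value) {
            return ByteBuffer.allocate(4).order(ByteOrder.LITTLE_ENDIAN).putInt(value).array();
        }
    }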

The code implementation

  The implementation is verifyIntegrity(contentDigests, apk, signatureInfo) in the ApkSigningBlockUtils class:

    static void verifyIntegrity(
            Map<Integer, byte[]> expectedDigests,
            RandomAccessFile apk,
            SignatureInfo signatureInfo) throws SecurityException {
        if (expectedDigests.isEmpty()) {
            throw new SecurityException("No digests provided");
        }

        boolean neverVerified = true;

        Map<Integer, byte[]> expected1MbChunkDigests = new ArrayMap<>();
        if (expectedDigests.containsKey(CONTENT_DIGEST_CHUNKED_SHA256)) {
            expected1MbChunkDigests.put(CONTENT_DIGEST_CHUNKED_SHA256,
                    expectedDigests.get(CONTENT_DIGEST_CHUNKED_SHA256));
        }
        if (expectedDigests.containsKey(CONTENT_DIGEST_CHUNKED_SHA512)) {
            expected1MbChunkDigests.put(CONTENT_DIGEST_CHUNKED_SHA512,
                    expectedDigests.get(CONTENT_DIGEST_CHUNKED_SHA512));
        }
        if (!expected1MbChunkDigests.isEmpty()) {
            try {
                verifyIntegrityFor1MbChunkBasedAlgorithm(expected1MbChunkDigests, apk.getFD(),
                        signatureInfo);
                neverVerified = false;
            } catch (IOException e) {
                throw new SecurityException("Cannot get FD", e);
            }
        }

        if (expectedDigests.containsKey(CONTENT_DIGEST_VERITY_CHUNKED_SHA256)) {
            verifyIntegrityForVerityBasedAlgorithm(
                    expectedDigests.get(CONTENT_DIGEST_VERITY_CHUNKED_SHA256), apk, signatureInfo);
            neverVerified = false;
        }

        if (neverVerified) {
            throw new SecurityException("No known digest exists for integrity check");
        }
    }

  The expectedDigests parameter holds the digests corresponding to the strongest digest algorithm of each signer in the v2 block. It is returned by the signature verification step; see the earlier article on locating and verifying the APK v2 signature block.
  If expectedDigests contains CONTENT_DIGEST_CHUNKED_SHA256, CONTENT_DIGEST_CHUNKED_SHA512, or both, those algorithms and their digest values are copied into expected1MbChunkDigests, and the file's integrity is then checked against them.
  If expectedDigests also contains CONTENT_DIGEST_VERITY_CHUNKED_SHA256, a Merkle tree is built, its root is hashed, and the result is compared with the digest taken from the v2 block. If they match, the file is considered intact; otherwise a SecurityException is thrown.

Verifying file integrity over 1 MB chunks

  The code for this lives in verifyIntegrityFor1MbChunkBasedAlgorithm():

    private static void verifyIntegrityFor1MbChunkBasedAlgorithm(
            Map<Integer, byte[]> expectedDigests,
            FileDescriptor apkFileDescriptor,
            SignatureInfo signatureInfo) throws SecurityException {
        int[] digestAlgorithms = new int[expectedDigests.size()];
        int digestAlgorithmCount = 0;
        for (int digestAlgorithm : expectedDigests.keySet()) {
            digestAlgorithms[digestAlgorithmCount] = digestAlgorithm;
            digestAlgorithmCount++;
        }
        byte[][] actualDigests;
        try {
            actualDigests = computeContentDigestsPer1MbChunk(digestAlgorithms, apkFileDescriptor,
                    signatureInfo);
        } catch (DigestException e) {
            throw new SecurityException("Failed to compute digest(s) of contents", e);
        }
        for (int i = 0; i < digestAlgorithms.length; i++) {
            int digestAlgorithm = digestAlgorithms[i];
            byte[] expectedDigest = expectedDigests.get(digestAlgorithm);
            byte[] actualDigest = actualDigests[i];
            if (!MessageDigest.isEqual(expectedDigest, actualDigest)) {
                throw new SecurityException(
                        getContentDigestAlgorithmJcaDigestAlgorithm(digestAlgorithm)
                                + " digest of contents did not verify");
            }
        }
    }

  As you can see, the digestAlgorithms array holds the digest algorithms, while actualDigests holds the digest computed for each algorithm by computeContentDigestsPer1MbChunk(). Finally, each computed digest is compared with the one previously taken from the v2 block; if any pair differs, a SecurityException is thrown.
  So the key to this part is understanding computeContentDigestsPer1MbChunk(), which performs exactly the computation described in the official documentation above.

computeContentDigestsPer1MbChunk()

Let's look at the computeContentDigestsPer1MbChunk() method:

    public static byte[][] computeContentDigestsPer1MbChunk(int[] digestAlgorithms,
            FileDescriptor apkFileDescriptor, SignatureInfo signatureInfo) throws DigestException {
        // We need to verify the integrity of the following three sections of the file:
        // 1. Everything up to the start of the APK Signing Block.
        // 2. ZIP Central Directory.
        // 3. ZIP End of Central Directory (EoCD).
        // Each of these sections is represented as a separate DataSource instance below.

        // To handle large APKs, these sections are read in 1 MB chunks using memory-mapped I/O to
        // avoid wasting physical memory. In most APK verification scenarios, the contents of the
        // APK are already there in the OS's page cache and thus mmap does not use additional
        // physical memory.

        DataSource beforeApkSigningBlock =
                DataSource.create(apkFileDescriptor, 0, signatureInfo.apkSigningBlockOffset);
        DataSource centralDir =
                DataSource.create(
                        apkFileDescriptor, signatureInfo.centralDirOffset,
                        signatureInfo.eocdOffset - signatureInfo.centralDirOffset);

        // For the purposes of integrity verification, ZIP End of Central Directory's field Start of
        // Central Directory must be considered to point to the offset of the APK Signing Block.
        ByteBuffer eocdBuf = signatureInfo.eocd.duplicate();
        eocdBuf.order(ByteOrder.LITTLE_ENDIAN);
        ZipUtils.setZipEocdCentralDirectoryOffset(eocdBuf, signatureInfo.apkSigningBlockOffset);
        DataSource eocd = new ByteBufferDataSource(eocdBuf);

        return computeContentDigestsPer1MbChunk(digestAlgorithms,
                new DataSource[]{beforeApkSigningBlock, centralDir, eocd});
    }

  Three sections take part in verification: the ZIP entry contents before the signing block, the Central Directory, and the End of Central Directory.
  The signatureInfo parameter carries the offsets of these sections, so three data sources are created: beforeApkSigningBlock, centralDir, and eocd. Note that the Central Directory offset stored in the End of Central Directory currently accounts for the inserted signing block, so it is rewritten to the value it would have without the signing block, i.e. the offset of the APK Signing Block. This is done by ZipUtils.setZipEocdCentralDirectoryOffset(eocdBuf, signatureInfo.apkSigningBlockOffset).
  Each of the three sections is wrapped in a DataSource. DataSource has a feedIntoDataDigester(DataDigester md, long offset, int size) method: the first parameter is a DataDigester used to produce the digest, and because each section is consumed in 1 MB chunks, the last two parameters select which portion of the DataSource is fed to it.
  The concrete DataSource is either a ReadFileDataSource or a MemoryMappedFileDataSource, depending on whether the file lives on an incremental file system. MemoryMappedFileDataSource reads data via memory mapping; ReadFileDataSource reads at a file offset with the pread system call, which is slower than mmap but safer.
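  As a rough illustration of that contract (a toy in-memory stand-in, not the framework classes), a DataSource only needs to report its size and hand the requested window of bytes to a DataDigester:

    import java.nio.ByteBuffer;
    import java.security.DigestException;

    // Toy types that mirror the shapes used in this walkthrough; the real sources are
    // MemoryMappedFileDataSource and ReadFileDataSource, which read from the APK file.
    interface DataDigester {
        void consume(ByteBuffer buffer) throws DigestException;
    }

    final class ByteArrayDataSource {
        private final byte[] mData;

        ByteArrayDataSource(byte[] data) {
            mData = data;
        }

        public long size() {
            return mData.length;
        }

        // Feed bytes [offset, offset + size) to the digester.
        public void feedIntoDataDigester(DataDigester digester, long offset, int size)
                throws DigestException {
            digester.consume(ByteBuffer.wrap(mData, (int) offset, size));
        }
    }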
  Now let's look at the overload of computeContentDigestsPer1MbChunk. The code is a bit long, so we'll take it in pieces; here is the first part:

    private static byte[][] computeContentDigestsPer1MbChunk(
            int[] digestAlgorithms,
            DataSource[] contents) throws DigestException {
        // For each digest algorithm the result is computed as follows:
        // 1. Each segment of contents is split into consecutive chunks of 1 MB in size.
        //    The final chunk will be shorter iff the length of segment is not a multiple of 1 MB.
        //    No chunks are produced for empty (zero length) segments.
        // 2. The digest of each chunk is computed over the concatenation of byte 0xa5, the chunk's
        //    length in bytes (uint32 little-endian) and the chunk's contents.
        // 3. The output digest is computed over the concatenation of the byte 0x5a, the number of
        //    chunks (uint32 little-endian) and the concatenation of digests of chunks of all
        //    segments in-order.

        long totalChunkCountLong = 0;
        for (DataSource input : contents) {
            totalChunkCountLong += getChunkCount(input.size());
        }
        if (totalChunkCountLong >= Integer.MAX_VALUE / 1024) {
            throw new DigestException("Too many chunks: " + totalChunkCountLong);
        }
        int totalChunkCount = (int) totalChunkCountLong;

        byte[][] digestsOfChunks = new byte[digestAlgorithms.length][];
        for (int i = 0; i < digestAlgorithms.length; i++) {
            int digestAlgorithm = digestAlgorithms[i];
            int digestOutputSizeBytes = getContentDigestAlgorithmOutputSizeBytes(digestAlgorithm);
            byte[] concatenationOfChunkCountAndChunkDigests =
                    new byte[5 + totalChunkCount * digestOutputSizeBytes];
            concatenationOfChunkCountAndChunkDigests[0] = 0x5a;
            setUnsignedInt32LittleEndian(
                    totalChunkCount,
                    concatenationOfChunkCountAndChunkDigests,
                    1);
            digestsOfChunks[i] = concatenationOfChunkCountAndChunkDigests;
        }

  getChunkCount(input.size()) splits each data source into 1 MB chunks and returns the number of chunks. As we saw, there are three data sources, so totalChunkCountLong is the total number of 1 MB chunks across all of them. The last chunk of each source may be smaller than 1 MB but still counts as a chunk.
  Next, a concatenation buffer for the first-level chunk digests is prepared for each algorithm.
  Inside the loop, getContentDigestAlgorithmOutputSizeBytes(digestAlgorithm) returns the digest length for the algorithm. Multiplying it by the number of chunks gives the space needed for all concatenated chunk digests. Because the concatenation starts with the byte 0x5a followed by 4 bytes holding the chunk count, 5 is added to that length, and those first five bytes are filled in. The resulting buffer for each algorithm is stored in the digestsOfChunks array.
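  For example, with CONTENT_DIGEST_CHUNKED_SHA256 (32-byte digests) and a total of 300 chunks, the buffer is 5 + 300 * 32 = 9,605 bytes long: byte 0 is 0x5a, bytes 1-4 hold 300 as a little-endian uint32, and the 300 chunk digests are written back to back starting at byte 5.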
  To see the digest lengths, look at getContentDigestAlgorithmOutputSizeBytes(digestAlgorithm):

    private static int getContentDigestAlgorithmOutputSizeBytes(int digestAlgorithm) {
        switch (digestAlgorithm) {
            case CONTENT_DIGEST_CHUNKED_SHA256:
            case CONTENT_DIGEST_VERITY_CHUNKED_SHA256:
                return 256 / 8;
            case CONTENT_DIGEST_CHUNKED_SHA512:
                return 512 / 8;
            default:
                throw new IllegalArgumentException(
                        "Unknown content digest algorthm: " + digestAlgorithm);
        }
    }

  So for SHA-256 the digest is 32 bytes long, and for SHA-512 it is 64 bytes.

  Continuing with the computeContentDigestsPer1MbChunk overload, here is its second part:

        byte[] chunkContentPrefix = new byte[5];
        chunkContentPrefix[0] = (byte) 0xa5;
        int chunkIndex = 0;
        MessageDigest[] mds = new MessageDigest[digestAlgorithms.length];
        for (int i = 0; i < digestAlgorithms.length; i++) {
            String jcaAlgorithmName =
                    getContentDigestAlgorithmJcaDigestAlgorithm(digestAlgorithms[i]);
            try {
                mds[i] = MessageDigest.getInstance(jcaAlgorithmName);
            } catch (NoSuchAlgorithmException e) {
                throw new RuntimeException(jcaAlgorithmName + " digest not supported", e);
            }
        }
        // TODO: Compute digests of chunks in parallel when beneficial. This requires some research
        // into how to parallelize (if at all) based on the capabilities of the hardware on which
        // this code is running and based on the size of input.
        DataDigester digester = new MultipleDigestDataDigester(mds);
        int dataSourceIndex = 0;
        for (DataSource input : contents) {
            long inputOffset = 0;
            long inputRemaining = input.size();
            while (inputRemaining > 0) {
                int chunkSize = (int) Math.min(inputRemaining, CHUNK_SIZE_BYTES);
                setUnsignedInt32LittleEndian(chunkSize, chunkContentPrefix, 1);
                for (int i = 0; i < mds.length; i++) {
                    mds[i].update(chunkContentPrefix);
                }
                try {
                    input.feedIntoDataDigester(digester, inputOffset, chunkSize);
                } catch (IOException e) {
                    throw new DigestException(
                            "Failed to digest chunk #" + chunkIndex + " of section #"
                                    + dataSourceIndex,
                            e);
                }
                for (int i = 0; i < digestAlgorithms.length; i++) {
                    int digestAlgorithm = digestAlgorithms[i];
                    byte[] concatenationOfChunkCountAndChunkDigests = digestsOfChunks[i];
                    int expectedDigestSizeBytes =
                            getContentDigestAlgorithmOutputSizeBytes(digestAlgorithm);
                    MessageDigest md = mds[i];
                    int actualDigestSizeBytes =
                            md.digest(
                                    concatenationOfChunkCountAndChunkDigests,
                                    5 + chunkIndex * expectedDigestSizeBytes,
                                    expectedDigestSizeBytes);
                    if (actualDigestSizeBytes != expectedDigestSizeBytes) {
                        throw new RuntimeException(
                                "Unexpected output size of " + md.getAlgorithm() + " digest: "
                                        + actualDigestSizeBytes);
                    }
                }
                inputOffset += chunkSize;
                inputRemaining -= chunkSize;
                chunkIndex++;
            }
            dataSourceIndex++;
        }

  This part computes the actual digest values.
  When digesting each chunk, the byte 0xa5 and the 4-byte chunk length are prepended, so a 5-byte prefix chunkContentPrefix is created with its first byte set to 0xa5.
  A loop then resolves each algorithm to its JCA name and obtains a MessageDigest instance for it.
  The code then loops over the data sources. The chunk size chunkSize is computed first; inputRemaining is the data left in the source, which may be less than 1 MB, so Math.min(inputRemaining, CHUNK_SIZE_BYTES) takes the smaller of the two. The chunk size is then written into bytes 1-4 of chunkContentPrefix.
  The per-chunk prefix is fed into each MessageDigest, and then feedIntoDataDigester(digester, inputOffset, chunkSize) feeds the chunk contents into them.
  Next, for each algorithm, MessageDigest.digest() writes the chunk digest into digestsOfChunks[i] at the offset corresponding to this chunk.
  Finally the loop variables are advanced and the next chunk is digested, until all three data sources have been processed.
  Let's see how a data source feeds its data into the digest. This is done by input.feedIntoDataDigester(digester, inputOffset, chunkSize); as mentioned above, input is either a ReadFileDataSource or a MemoryMappedFileDataSource. Taking MemoryMappedFileDataSource as the example, here is its code:

    @Override
    public void feedIntoDataDigester(DataDigester md, long offset, int size)
            throws IOException, DigestException {
        // IMPLEMENTATION NOTE: After a lot of experimentation, the implementation of this
        // method was settled on a straightforward mmap with prefaulting.
        //
        // This method is not using FileChannel.map API because that API does not offset a way
        // to "prefault" the resulting memory pages. Without prefaulting, performance is about
        // 10% slower on small to medium APKs, but is significantly worse for APKs in 500+ MB
        // range. FileChannel.load (which currently uses madvise) doesn't help. Finally,
        // invoking madvise (MADV_SEQUENTIAL) after mmap with prefaulting wastes quite a bit of
        // time, which is not compensated for by faster reads.

        // We mmap the smallest region of the file containing the requested data. mmap requires
        // that the start offset in the file must be a multiple of memory page size. We thus may
        // need to mmap from an offset less than the requested offset.
        long filePosition = mFilePosition + offset;
        long mmapFilePosition =
                (filePosition / MEMORY_PAGE_SIZE_BYTES) * MEMORY_PAGE_SIZE_BYTES;
        int dataStartOffsetInMmapRegion = (int) (filePosition - mmapFilePosition);
        long mmapRegionSize = size + dataStartOffsetInMmapRegion;
        long mmapPtr = 0;
        try {
            mmapPtr = Os.mmap(
                    0, // let the OS choose the start address of the region in memory
                    mmapRegionSize,
                    OsConstants.PROT_READ,
                    OsConstants.MAP_SHARED | OsConstants.MAP_POPULATE, // "prefault" all pages
                    mFd,
                    mmapFilePosition);
            ByteBuffer buf = new DirectByteBuffer(
                    size,
                    mmapPtr + dataStartOffsetInMmapRegion,
                    mFd,  // not really needed, but just in case
                    null, // no need to clean up -- it's taken care of by the finally block
                    true  // read only buffer
                    );
            md.consume(buf);
        } catch (ErrnoException e) {
            throw new IOException("Failed to mmap " + mmapRegionSize + " bytes", e);
        } finally {
            if (mmapPtr != 0) {
                try {
                    Os.munmap(mmapPtr, mmapRegionSize);
                } catch (ErrnoException ignored) {
                }
            }
        }
    }

  This uses Os.mmap() to map the file into memory and obtain the address mmapPtr, wraps it in a DirectByteBuffer, and calls md.consume(buf) to feed the data into the digest. Because mmap requires page alignment, the mapped file position is rounded down to a page boundary and the offset of the requested data inside the mapping is computed, so that offset must be added back when reading the data.
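  For example, if mFilePosition + offset is 10,000 bytes and the page size is 4,096 bytes, the mapping starts at file position 8,192, dataStartOffsetInMmapRegion is 1,808, and mmapRegionSize is size + 1,808; the DirectByteBuffer is then created at mmapPtr + 1,808, so it exposes exactly the requested bytes.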
  Here md is actually a MultipleDigestDataDigester, which wraps several MessageDigest objects. Its consume() method:

    private static class MultipleDigestDataDigester implements DataDigester {
        private final MessageDigest[] mMds;

        MultipleDigestDataDigester(MessageDigest[] mds) {
            mMds = mds;
        }

        @Override
        public void consume(ByteBuffer buffer) {
            buffer = buffer.slice();
            for (MessageDigest md : mMds) {
                buffer.position(0);
                md.update(buffer);
            }
        }
    }

  As shown, consume() feeds the buffer into every wrapped MessageDigest via update().

  Finally, here is the last part of the computeContentDigestsPer1MbChunk overload:

        byte[][] result = new byte[digestAlgorithms.length][];
        for (int i = 0; i < digestAlgorithms.length; i++) {
            int digestAlgorithm = digestAlgorithms[i];
            byte[] input = digestsOfChunks[i];
            String jcaAlgorithmName = getContentDigestAlgorithmJcaDigestAlgorithm(digestAlgorithm);
            MessageDigest md;
            try {
                md = MessageDigest.getInstance(jcaAlgorithmName);
            } catch (NoSuchAlgorithmException e) {
                throw new RuntimeException(jcaAlgorithmName + " digest not supported", e);
            }
            byte[] output = md.digest(input);
            result[i] = output;
        }
        return result;
    }

  At this point digestsOfChunks holds, per algorithm, the concatenation of all chunk digests, and one more digest pass over it produces the final content digest.
  So a fresh MessageDigest is created for each algorithm, digest(input) produces the top-level digest, and the results are stored in result in the same order as the algorithms and returned.

Comparing against the Merkle tree root digest

  This is implemented by verifyIntegrityForVerityBasedAlgorithm():

    private static void verifyIntegrityForVerityBasedAlgorithm(
            byte[] expectedDigest,
            RandomAccessFile apk,
            SignatureInfo signatureInfo) throws SecurityException {
        try {
            byte[] expectedRootHash = parseVerityDigestAndVerifySourceLength(expectedDigest,
                    apk.length(), signatureInfo);
            VerityBuilder.VerityResult verity = VerityBuilder.generateApkVerityTree(apk,
                    signatureInfo, new ByteBufferFactory() {
                        @Override
                        public ByteBuffer create(int capacity) {
                            return ByteBuffer.allocate(capacity);
                        }
                    });
            if (!Arrays.equals(expectedRootHash, verity.rootHash)) {
                throw new SecurityException("APK verity digest of contents did not verify");
            }
        } catch (DigestException | IOException | NoSuchAlgorithmException e) {
            throw new SecurityException("Error during verification", e);
        }
    }

  The expectedDigest parameter is the digest value taken from the v2 block; parseVerityDigestAndVerifySourceLength() extracts the expected root hash from it and checks the recorded source length.
  VerityBuilder.generateApkVerityTree() builds the Merkle tree and digests its root; the resulting digest is in the rootHash member of VerityBuilder.VerityResult.
  If the two digests differ, verification fails.

Extracting the CONTENT_DIGEST_VERITY_CHUNKED_SHA256 digest from the v2 block

    /**
     * Return the verity digest only if the length of digest content looks correct.
     * When verity digest is generated, the last incomplete 4k chunk is padded with 0s before
     * hashing. This means two almost identical APKs with different number of 0 at the end will have
     * the same verity digest. To avoid this problem, the length of the source content (excluding
     * Signing Block) is appended to the verity digest, and the digest is returned only if the
     * length is consistent to the current APK.
     */
    static byte[] parseVerityDigestAndVerifySourceLength(
            byte[] data, long fileSize, SignatureInfo signatureInfo) throws SecurityException {
        // FORMAT:
        // OFFSET       DATA TYPE  DESCRIPTION
        // * @+0  bytes uint8[32]  Merkle tree root hash of SHA-256
        // * @+32 bytes int64      Length of source data
        int kRootHashSize = 32;
        int kSourceLengthSize = 8;

        if (data.length != kRootHashSize + kSourceLengthSize) {
            throw new SecurityException("Verity digest size is wrong: " + data.length);
        }
        ByteBuffer buffer = ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN);
        buffer.position(kRootHashSize);
        long expectedSourceLength = buffer.getLong();

        long signingBlockSize = signatureInfo.centralDirOffset
                - signatureInfo.apkSigningBlockOffset;
        if (expectedSourceLength != fileSize - signingBlockSize) {
            throw new SecurityException("APK content size did not verify");
        }

        return Arrays.copyOfRange(data, 0, kRootHashSize);
    }

  The value taken from the v2 block is 40 bytes: the first 32 bytes are the digest and the last 8 bytes are the length of the content the digest was computed over.
  The content that produced this digest excludes the APK Signing Block, so the length check subtracts the signing block's size from the file size.
  The method above is then straightforward: if the lengths don't match, it throws; otherwise it returns the first 32 bytes.
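  For example, if the APK file is 25,000,000 bytes long and the signing block (centralDirOffset - apkSigningBlockOffset) is 4,096 bytes, then the 8-byte length field must decode to 24,995,904; otherwise the check fails.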

Building the Merkle tree and obtaining the root digest

  This is implemented in VerityBuilder.generateApkVerityTree(), which ultimately delegates to generateVerityTreeInternal():

    @NonNull
    private static VerityResult generateVerityTreeInternal(@NonNull RandomAccessFile apk,
            @NonNull ByteBufferFactory bufferFactory, @Nullable SignatureInfo signatureInfo)
            throws IOException, SecurityException, NoSuchAlgorithmException, DigestException {
        long signingBlockSize =
                signatureInfo.centralDirOffset - signatureInfo.apkSigningBlockOffset;
        long dataSize = apk.length() - signingBlockSize;
        int[] levelOffset = calculateVerityLevelOffset(dataSize);
        int merkleTreeSize = levelOffset[levelOffset.length - 1];

        ByteBuffer output = bufferFactory.create(
                merkleTreeSize
                + CHUNK_SIZE_BYTES);  // maximum size of apk-verity metadata
        output.order(ByteOrder.LITTLE_ENDIAN);
        ByteBuffer tree = slice(output, 0, merkleTreeSize);
        byte[] apkRootHash = generateVerityTreeInternal(apk, signatureInfo, DEFAULT_SALT,
                levelOffset, tree);
        return new VerityResult(output, merkleTreeSize, apkRootHash);
    }

  First the length of the data that goes into the Merkle tree, dataSize, is computed by removing the APK Signing Block's size from the APK length.
  Then calculateVerityLevelOffset(dataSize) computes the cumulative size of the tree from the root down to each level, so the last element of that array, merkleTreeSize, is the size of the whole tree.
  The Merkle tree itself is held in the ByteBuffer output. generateVerityTreeInternal() is then called to fill output with the node data and to compute the root digest.
  Finally the tree data, the tree size, and the root digest are wrapped in a VerityResult and returned.

Computing the cumulative Merkle tree level sizes

    private static int[] calculateVerityLevelOffset(long fileSize) {
        ArrayList<Long> levelSize = new ArrayList<>();
        while (true) {
            long levelDigestSize = divideRoundup(fileSize, CHUNK_SIZE_BYTES) * DIGEST_SIZE_BYTES;
            long chunksSize = CHUNK_SIZE_BYTES * divideRoundup(levelDigestSize, CHUNK_SIZE_BYTES);
            levelSize.add(chunksSize);
            if (levelDigestSize <= CHUNK_SIZE_BYTES) {
                break;
            }
            fileSize = levelDigestSize;
        }

        // Reverse and convert to summed area table.
        int[] levelOffset = new int[levelSize.size() + 1];
        levelOffset[0] = 0;
        for (int i = 0; i < levelSize.size(); i++) {
            // We don't support verity tree if it is larger then Integer.MAX_VALUE.
            levelOffset[i + 1] = levelOffset[i]
                    + Math.toIntExact(levelSize.get(levelSize.size() - i - 1));
        }
        return levelOffset;
    }

  To understand this code, you need to know how the Merkle tree is laid out.
  The APK, with the signing block removed, is split into 4096-byte blocks, the last block being zero-padded. Each block is hashed into a 32-byte digest; these digests form the bottom level of the Merkle tree. What is stored in levelSize is the size of that level: the total size of those 32-byte digests, rounded up to whole 4096-byte blocks.
  The next level up is produced by splitting the previous level's digest data into 4096-byte blocks and hashing each block into a 32-byte digest; the size of those digests, again rounded up to whole 4096-byte blocks, is appended to levelSize.
  This continues until a level's digest data fits in 4096 bytes or less, at which point the loop stops; the last entry added to levelSize is therefore a single 4096-byte block.
  The levelOffset array, built by reversing levelSize and accumulating the sizes, stores for each level the combined size of all levels above it, i.e. the offset at which that level starts in the tree buffer. The last element of levelOffset is therefore the total size of all levels, which is the size of the Merkle tree.
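  For example, for a 100 MB (104,857,600-byte) dataSize: the leaf level hashes 25,600 blocks into 819,200 bytes of digests (exactly 200 pages); the level above hashes those 200 pages into 6,400 bytes, padded to 8,192 bytes (2 pages); the level above that hashes 2 pages into 64 bytes, padded to one 4,096-byte page, and the loop stops. levelSize is then [819200, 8192, 4096], levelOffset becomes [0, 4096, 12288, 831488], and the whole tree occupies 831,488 bytes.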

Building the Merkle tree and digesting the root

  This is done by generateVerityTreeInternal():

    @NonNull
    private static byte[] generateVerityTreeInternal(@NonNull RandomAccessFile apk,
            @Nullable SignatureInfo signatureInfo, @Nullable byte[] salt,
            @NonNull int[] levelOffset, @NonNull ByteBuffer output)
            throws IOException, NoSuchAlgorithmException, DigestException {
        // 1. Digest the apk to generate the leaf level hashes.
        assertSigningBlockAlignedAndHasFullPages(signatureInfo);
        generateApkVerityDigestAtLeafLevel(apk, signatureInfo, salt, slice(output,
                    levelOffset[levelOffset.length - 2], levelOffset[levelOffset.length - 1]));

        // 2. Digest the lower level hashes bottom up.
        for (int level = levelOffset.length - 3; level >= 0; level--) {
            ByteBuffer inputBuffer = slice(output, levelOffset[level + 1], levelOffset[level + 2]);
            ByteBuffer outputBuffer = slice(output, levelOffset[level], levelOffset[level + 1]);

            DataSource source = new ByteBufferDataSource(inputBuffer);
            BufferedDigester digester = new BufferedDigester(salt, outputBuffer);
            consumeByChunk(digester, source, CHUNK_SIZE_BYTES);
            digester.assertEmptyBuffer();
            digester.fillUpLastOutputChunk();
        }

        // 3. Digest the first block (i.e. first level) to generate the root hash.
        byte[] rootHash = new byte[DIGEST_SIZE_BYTES];
        BufferedDigester digester = new BufferedDigester(salt, ByteBuffer.wrap(rootHash));
        digester.consume(slice(output, 0, CHUNK_SIZE_BYTES));
        digester.assertEmptyBuffer();
        return rootHash;
    }

  First generateApkVerityDigestAtLeafLevel() is called to generate the leaf level. output holds every node of the tree, and the leaves occupy the range from levelOffset[levelOffset.length - 2] to levelOffset[levelOffset.length - 1]; the meaning of levelOffset was explained above.
  The loop then works bottom up: each level's digest data is split into 4096-byte blocks and hashed to produce the level above it, until every node in output has been filled in and the Merkle tree is complete.
  Finally the first block (the top level) is hashed to produce the root digest, and rootHash is returned.

Building the leaf level

  It is implemented by generateApkVerityDigestAtLeafLevel():


    private static void generateApkVerityDigestAtLeafLevel(@NonNull RandomAccessFile apk,
            @NonNull SignatureInfo signatureInfo, @Nullable byte[] salt,
            @NonNull ByteBuffer output) throws IOException, NoSuchAlgorithmException,
            DigestException {
        BufferedDigester digester = new BufferedDigester(salt, output);

        // 1. Digest from the beginning of the file, until APK Signing Block is reached.
        consumeByChunk(digester,
                DataSource.create(apk.getFD(), 0, signatureInfo.apkSigningBlockOffset),
                MMAP_REGION_SIZE_BYTES);

        // 2. Skip APK Signing Block and continue digesting, until the Central Directory offset
        // field in EoCD is reached.
        long eocdCdOffsetFieldPosition =
                signatureInfo.eocdOffset + ZIP_EOCD_CENTRAL_DIR_OFFSET_FIELD_OFFSET;
        consumeByChunk(digester,
                DataSource.create(apk.getFD(), signatureInfo.centralDirOffset,
                    eocdCdOffsetFieldPosition - signatureInfo.centralDirOffset),
                MMAP_REGION_SIZE_BYTES);

        // 3. Consume offset of Signing Block as an alternative EoCD.
        ByteBuffer alternativeCentralDirOffset = ByteBuffer.allocate(
                ZIP_EOCD_CENTRAL_DIR_OFFSET_FIELD_SIZE).order(ByteOrder.LITTLE_ENDIAN);
        alternativeCentralDirOffset.putInt(Math.toIntExact(signatureInfo.apkSigningBlockOffset));
        alternativeCentralDirOffset.flip();
        digester.consume(alternativeCentralDirOffset);

        // 4. Read from end of the Central Directory offset field in EoCD to the end of the file.
        long offsetAfterEocdCdOffsetField =
                eocdCdOffsetFieldPosition + ZIP_EOCD_CENTRAL_DIR_OFFSET_FIELD_SIZE;
        consumeByChunk(digester,
                DataSource.create(apk.getFD(), offsetAfterEocdCdOffsetField,
                    apk.length() - offsetAfterEocdCdOffsetField),
                MMAP_REGION_SIZE_BYTES);

        // 5. Pad 0s up to the nearest 4096-byte block before hashing.
        int lastIncompleteChunkSize = (int) (apk.length() % CHUNK_SIZE_BYTES);
        if (lastIncompleteChunkSize != 0) {
            digester.consume(ByteBuffer.allocate(CHUNK_SIZE_BYTES - lastIncompleteChunkSize));
        }
        digester.assertEmptyBuffer();

        // 6. Fill up the rest of buffer with 0s.
        digester.fillUpLastOutputChunk();
    }

  First a BufferedDigester is created from a salt and a ByteBuffer; it is a buffering digest helper. Once it has accumulated a full 4096-byte block, it computes a digest and writes it to the corresponding position in the ByteBuffer, effectively producing one node. As noted above, that ByteBuffer starts at levelOffset[levelOffset.length - 2] in output, which is where the leaf level begins.
  The APK Signing Block does not take part in the digest, so it must be skipped: the second consumeByChunk() call starts its data source at signatureInfo.centralDirOffset, the beginning of the Central Directory.
  Within the End of Central Directory, note that the field at offset ZIP_EOCD_CENTRAL_DIR_OFFSET_FIELD_OFFSET holds the Central Directory's offset as it is after the signing block was inserted, but the value that must be digested is the offset as it would be without the signing block. Step 3 in the code handles this by digesting the APK Signing Block's offset in place of that field.
  After consumeByChunk() has processed the ZIP entries, the Central Directory, and the End of Central Directory, the last block is zero-padded up to 4096 bytes if it is not full. The padding amount is computed from apk.length() % CHUNK_SIZE_BYTES, even though the digested data excludes the signing block; this works because assertSigningBlockAlignedAndHasFullPages(), called at the start of generateVerityTreeInternal(), guarantees the signing block starts on a 4096-byte boundary and occupies whole 4096-byte pages, so removing it does not change the length modulo 4096.
  Finally, digester.fillUpLastOutputChunk() zero-fills the remainder of the leaf level's last 4096-byte output block if it is not full.
  All of the node data is fed into output through consumeByChunk(), so it is worth reading its code:

    private static void consumeByChunk(DataDigester digester, DataSource source, int chunkSize)
            throws IOException, DigestException {
        long inputRemaining = source.size();
        long inputOffset = 0;
        while (inputRemaining > 0) {
            int size = (int) Math.min(inputRemaining, chunkSize);
            source.feedIntoDataDigester(digester, inputOffset, size);
            inputOffset += size;
            inputRemaining -= size;
        }
    }

  The code is simple: it walks the data source chunk by chunk and calls its feedIntoDataDigester() method, which in turn calls the digester's consume(). Here the digester is a BufferedDigester, so let's look at its implementation:

        @Override
        public void consume(ByteBuffer buffer) throws DigestException {
            int offset = buffer.position();
            int remaining = buffer.remaining();
            while (remaining > 0) {
                int allowance = (int) Math.min(remaining, BUFFER_SIZE - mBytesDigestedSinceReset);
                // Optimization: set the buffer limit to avoid allocating a new ByteBuffer object.
                buffer.limit(buffer.position() + allowance);
                mMd.update(buffer);
                offset += allowance;
                remaining -= allowance;
                mBytesDigestedSinceReset += allowance;

                if (mBytesDigestedSinceReset == BUFFER_SIZE) {
                    mMd.digest(mDigestBuffer, 0, mDigestBuffer.length);
                    mOutput.put(mDigestBuffer);
                    // After digest, MessageDigest resets automatically, so no need to reset again.
                    if (mSalt != null) {
                        mMd.update(mSalt);
                    }
                    mBytesDigestedSinceReset = 0;
                }
            }
        }

  BUFFER_SIZE is 4096, and mBytesDigestedSinceReset tracks how many bytes have been fed into mMd since the last digest. No digest is computed until BUFFER_SIZE bytes have accumulated; once they have, mMd computes the digest, the result is written into mOutput, the salt (if any) is fed in again, and mBytesDigestedSinceReset is reset to 0 to start accumulating the next block. The mOutput member of BufferedDigester is the ByteBuffer that stores the Merkle tree nodes.
  Once generateApkVerityDigestAtLeafLevel() finishes, all of the leaf nodes have been filled in.

Summary

  We now know how APK integrity is verified.
  One scheme covers CONTENT_DIGEST_CHUNKED_SHA256 (digest algorithm "SHA-256") and CONTENT_DIGEST_CHUNKED_SHA512 (digest algorithm "SHA-512"): the protected data is split into 1 MB chunks, each chunk is digested over the concatenation of the byte 0xa5, the chunk's length in bytes (uint32, little-endian), and the chunk's contents, and the top-level digest is computed over the concatenation of the byte 0x5a, the number of chunks (uint32, little-endian), and the chunk digests in the order the chunks appear in the APK. The result is compared with the digest from the v2 block; if they match, verification passes.
  The other scheme covers CONTENT_DIGEST_VERITY_CHUNKED_SHA256 (digest algorithm "SHA-256"): a Merkle tree is built over 4096-byte blocks, the digest of its root is computed, and it is compared with the digest from the v2 block; if they match, verification passes.

Reposted from blog.csdn.net/q1165328963/article/details/132544902