BitSet in Java 8

While reading the ArrayList source code, I noticed that the removeIf method uses the BitSet class.

BitSet (bitmap)

The official Javadoc describes it as follows:

This class implements a vector of bits that grows as needed. Each component of the bit set has a boolean value. The bits of a BitSet are indexed by nonnegative integers. Individual indexed bits can be examined, set, or cleared. One BitSet may be used to modify the contents of another BitSet through logical AND, logical inclusive OR, and logical exclusive OR operations. 

By default, all bits in the set initially have the value false. 

Every bit set has a current size, which is the number of bits of space currently in use by the bit set. Note that the size is related to the implementation of a bit set, so it may change with implementation. The length of a bit set relates to logical length of a bit set and is defined independently of implementation. 

Unless otherwise noted, passing a null parameter to any of the methods in a BitSet will result in a NullPointerException. 

A BitSet is not safe for multithreaded use without external synchronization.
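
To make the description above concrete, here is a minimal sketch that sets bits, combines two sets with a logical AND, and reads size and length. The class and method names (BitSetBasics, evensAndSmall) are made up for illustration; only the java.util.BitSet calls are real API.

```java
import java.util.BitSet;

public class BitSetBasics {
    // Builds a small set, then narrows it with another set using logical AND.
    static BitSet evensAndSmall() {
        BitSet evens = new BitSet();
        for (int i = 0; i < 10; i += 2) {
            evens.set(i);            // bits 0, 2, 4, 6, 8
        }
        BitSet small = new BitSet();
        small.set(0, 5);             // bits 0..4 (the upper bound is exclusive)
        evens.and(small);            // keep only bits present in both: 0, 2, 4
        return evens;
    }

    public static void main(String[] args) {
        BitSet result = evensAndSmall();
        System.out.println(result);            // {0, 2, 4}
        System.out.println(result.get(2));     // true
        System.out.println(result.get(3));     // false, because bit 3 was never set
        System.out.println(result.length());   // 5, the highest set bit plus one
        System.out.println(result.size());     // capacity in bits, implementation-dependent
    }
}
```

Note that length() is defined by the contents, while size() only reflects how much space the implementation has allocated.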

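The more I read this explanation, the more confusing it gets: the data is clearly stored as long values, so why does it talk about boolean?
Let's keep reading.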

    /*
     * BitSets are packed into arrays of "words."  Currently a word is
     * a long, which consists of 64 bits, requiring 6 address bits.
     * The choice of word size is determined purely by performance concerns.
     */
    private final static int ADDRESS_BITS_PER_WORD = 6;
    private final static int BITS_PER_WORD = 1 << ADDRESS_BITS_PER_WORD;
    private final static int BIT_INDEX_MASK = BITS_PER_WORD - 1;

The basic unit is a word, and one long (64 bits) is one word.
The number of address bits per word is 6 (2^6 = 64).
The total number of bits per word is 64.
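
As a rough illustration of what these constants are for, the sketch below splits a bit index into the word that holds it and the position inside that word. The constants are re-declared so the example stands alone, and bitOffset is a hypothetical helper that does not exist in BitSet.

```java
public class WordAddressing {
    // Same values as the constants quoted above, re-declared for a stand-alone sketch.
    static final int ADDRESS_BITS_PER_WORD = 6;
    static final int BITS_PER_WORD = 1 << ADDRESS_BITS_PER_WORD;  // 64
    static final int BIT_INDEX_MASK = BITS_PER_WORD - 1;          // 63, binary 111111

    // Which long holds the bit: the index with its low 6 bits dropped.
    static int wordIndex(int bitIndex) {
        return bitIndex >> ADDRESS_BITS_PER_WORD;
    }

    // Hypothetical helper (not part of BitSet): position of the bit inside its word.
    static int bitOffset(int bitIndex) {
        return bitIndex & BIT_INDEX_MASK;
    }

    public static void main(String[] args) {
        for (int i : new int[] {0, 63, 64, 66, 130}) {
            System.out.println(i + " -> word " + wordIndex(i) + ", offset " + bitOffset(i));
        }
    }
}
```

For 66 this prints word 1, offset 2: the low 6 bits select the position inside a word, and the remaining high bits select the word.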

    /**
     * The internal field corresponding to the serialField "bits".
     */
    private long[] words;

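What actually stores the information is an array of long.
Each word in the array records whether each of 64 numbers is present: 1 (boolean true) if present, 0 (false) if absent.
Sixty-four 1s or 0s make up one long.
In other words, the long is only borrowed as a container; what is really stored is a binary sequence in which each position says whether the corresponding nonnegative integer is present.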

BitSet only works for nonnegative integers (0 is a valid index).
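
A small check of both points, assuming nothing beyond the public BitSet API: a negative index is rejected, and the backing long is just the set bits read as a binary number. The names IndexRange, rejects, and firstWord are invented for this sketch.

```java
import java.util.BitSet;

public class IndexRange {
    // Returns true if set(index) is rejected with IndexOutOfBoundsException.
    static boolean rejects(int index) {
        try {
            new BitSet().set(index);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    // The first backing word after marking 0, 1 and 5: binary 100011, decimal 35.
    static long firstWord() {
        BitSet bits = new BitSet();
        bits.set(0);
        bits.set(1);
        bits.set(5);
        return bits.toLongArray()[0];
    }

    public static void main(String[] args) {
        System.out.println(rejects(0));                        // false
        System.out.println(rejects(-1));                       // true
        System.out.println(Long.toBinaryString(firstWord()));  // 100011
    }
}
```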

    /**
     * Creates a new bit set. All bits are initially {@code false}.
     */
    public BitSet() {
        initWords(BITS_PER_WORD);
        sizeIsSticky = false;
    }
    private void initWords(int nbits) {
        words = new long[wordIndex(nbits-1) + 1];
    }
    /**
     * Given a bit index, return word index containing it.
     */
    private static int wordIndex(int bitIndex) {
        return bitIndex >> ADDRESS_BITS_PER_WORD;
    }

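This is the no-argument constructor.

By default it passes 64. Bit indices start at 0, so it subtracts 1 to get 63, the index of the last bit.
Shifting 63 right by 6 bits gives the wordIndex (0), and adding 1 gives the default length of the long array (1).
(Because Java array indices start at 0, largest index + 1 = length.)

One detail to note: shifting right by 6 bits is the same as dividing by 64 (2^6 = 64) and discarding the remainder.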
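
The arithmetic in initWords can be replayed on its own. The sketch below is not the JDK code; wordsNeeded is a hypothetical helper that copies the expression wordIndex(nbits-1) + 1, and the size() values in the comments assume the implementation quoted in this post, where size() is the array length times 64.

```java
import java.util.BitSet;

public class DefaultCapacity {
    // Mirrors the arithmetic in initWords: number of longs needed for nbits bits.
    static int wordsNeeded(int nbits) {
        return ((nbits - 1) >> 6) + 1;
    }

    public static void main(String[] args) {
        System.out.println(wordsNeeded(64));         // 1
        System.out.println(wordsNeeded(65));         // 2
        System.out.println(63 >> 6);                 // 0, same as 63 / 64
        System.out.println(new BitSet().size());     // 64 with the implementation shown here
        System.out.println(new BitSet(65).size());   // 128 with the implementation shown here
    }
}
```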

word | long[0]
wordIndex | 0
bitIndex | 63,62,61…4,3,2,1,0
bitValue | 0,0,0…0,0,0,0,0

    /**
     * Sets the bit at the specified index to {@code true}.
     *
     * @param  bitIndex a bit index
     * @throws IndexOutOfBoundsException if the specified index is negative
     * @since  JDK1.0
     */
    public void set(int bitIndex) {
        if (bitIndex < 0)
            throw new IndexOutOfBoundsException("bitIndex < 0: " + bitIndex);

        int wordIndex = wordIndex(bitIndex);
        expandTo(wordIndex);

        words[wordIndex] |= (1L << bitIndex); // Restores invariants

        checkInvariants();
    }

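Suppose I now want to mark the number 66, so I pass in 66.
1. Call wordIndex, which returns 1 (66 >> 6 = 1).
2. Call expandTo to grow the array. (The largest index of the initial array is 0.)
3. Compute 1L << 66. For a long, Java uses only the low 6 bits of the shift distance, so this equals 1L << (66 % 64) = 1L << 2, which is 4 in decimal (100 in binary).
4. OR the binary value 100 into the word at index 1. (Bitwise OR: the result bit is 1 if either side is 1, and 0 only if both are 0.)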
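
The shift in step 3 is the least obvious part, so here is a small sketch that checks it and then inspects the backing words of a real BitSet through toLongArray(). The names SetWalkthrough, mask, and wordsAfterSet are invented for illustration.

```java
import java.util.Arrays;
import java.util.BitSet;

public class SetWalkthrough {
    // Step 3: for a long, only the low 6 bits of the shift distance are used.
    static long mask(int bitIndex) {
        return 1L << bitIndex;       // 1L << 66 behaves like 1L << 2
    }

    // Steps 1 to 4 on a real BitSet, returning a copy of the words in use.
    static long[] wordsAfterSet(int bitIndex) {
        BitSet bits = new BitSet();
        bits.set(bitIndex);
        return bits.toLongArray();
    }

    public static void main(String[] args) {
        System.out.println(mask(66));                              // 4
        System.out.println(Long.toBinaryString(mask(66)));         // 100
        System.out.println(Arrays.toString(wordsAfterSet(66)));    // [0, 4]
    }
}
```

This is why set never has to compute bitIndex % 64 explicitly: the shift operator already discards the high bits of the distance.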

After the operation, the result is:

word | long[0] | long[1]
wordIndex | 0 | 1
bitIndex | 63,62,61…4,3,2,1,0 | 127,126,125…68,67,66,65,64
bitValue | 0,0,0…0,0,0,0,0 | 0,0,0…0,0,1,0,0

    /**
     * Returns the value of the bit with the specified index. The value
     * is {@code true} if the bit with the index {@code bitIndex}
     * is currently set in this {@code BitSet}; otherwise, the result
     * is {@code false}.
     *
     * @param  bitIndex   the bit index
     * @return the value of the bit with the specified index
     * @throws IndexOutOfBoundsException if the specified index is negative
     */
    public boolean get(int bitIndex) {
        if (bitIndex < 0)
            throw new IndexOutOfBoundsException("bitIndex < 0: " + bitIndex);

        checkInvariants();

        int wordIndex = wordIndex(bitIndex);
        return (wordIndex < wordsInUse)
            && ((words[wordIndex] & (1L << bitIndex)) != 0);
    }

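Using get to check whether 66 is in the BitSet:
1. wordIndex = 66 >> 6 = 1
2. Take the word at index 1 of the long array and the value 1L << 66, and AND them bitwise. The bit is set exactly when the result is non-zero, and that boolean is the return value.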
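
The two steps can be sketched against a plain long array. This is a stand-alone approximation, not the JDK method: it compares against words.length, whereas BitSet compares against its private wordsInUse field, and the class name GetSketch is hypothetical.

```java
public class GetSketch {
    // Stand-alone version of the lookup on a raw long[] (assumption: no wordsInUse tracking).
    static boolean get(long[] words, int bitIndex) {
        if (bitIndex < 0) {
            throw new IndexOutOfBoundsException("bitIndex < 0: " + bitIndex);
        }
        int wordIndex = bitIndex >> 6;                        // step 1
        return wordIndex < words.length
            && (words[wordIndex] & (1L << bitIndex)) != 0;    // step 2
    }

    public static void main(String[] args) {
        long[] words = {0L, 4L};                 // the state after set(66)
        System.out.println(get(words, 66));      // true
        System.out.println(get(words, 65));      // false, neighbouring bit in the same word
        System.out.println(get(words, 2));       // false, same offset but word 0 is empty
        System.out.println(get(words, 500));     // false, beyond the array
    }
}
```

The bounds check on the word index is what lets get answer false for any index past the allocated words instead of throwing.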

Summary

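For example, if you have 50 numbers distributed over [0, 63], a BitSet needs only a single long (8 bytes) to record them; doing the same with a map, set, or list and an ordinary marking scheme consumes far more space.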
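
A quick way to see the single-long claim, as a sketch: mark 50 distinct values in [0, 63] and count the words in use. The values 14 through 63 are an arbitrary choice, and OneWordIsEnough and markFifty are made-up names.

```java
import java.util.BitSet;

public class OneWordIsEnough {
    // Marks 50 distinct values in [0, 63]; the range 14..63 is arbitrary.
    static BitSet markFifty() {
        BitSet seen = new BitSet();
        for (int value = 14; value <= 63; value++) {
            seen.set(value);
        }
        return seen;
    }

    public static void main(String[] args) {
        BitSet seen = markFifty();
        System.out.println(seen.cardinality());          // 50
        System.out.println(seen.toLongArray().length);   // 1
    }
}
```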

Reposted from blog.csdn.net/weixin_42498646/article/details/87371722