ConcurrentHashMap Analysis - Segment

HashMap in depth: http://donald-draper.iteye.com/blog/2361702
ConcurrentMap introduction: http://donald-draper.iteye.com/blog/2361719
HashMap is not thread-safe; Hashtable is thread-safe but supports concurrent access poorly and is effectively obsolete. Today we look at
ConcurrentHashMap from the concurrency package, which is thread-safe and supports concurrent access.
package java.util.concurrent;
import java.util.concurrent.locks.*;
import java.util.*;
import java.io.Serializable;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

/**
 * A hash table supporting full concurrency of retrievals and
 * adjustable expected concurrency for updates. This class obeys the
 * same functional specification as {@link java.util.Hashtable}, and
 * includes versions of methods corresponding to each method of
 * <tt>Hashtable</tt>. However, even though all operations are
 * thread-safe, retrieval operations do <em>not</em> entail locking,
 * and there is <em>not</em> any support for locking the entire table
 * in a way that prevents all access.  This class is fully
 * interoperable with <tt>Hashtable</tt> in programs that rely on its
 * thread safety but not on its synchronization details.
 *
ConcurrentHashMap supports fully concurrent retrievals and thread-safe updates,
and obeys essentially the same functional specification as Hashtable. Even
though all operations are thread-safe, retrieval operations do not entail
locking, and there is no support for locking the entire table in a way that
would block all access. ConcurrentHashMap is fully interoperable with
Hashtable in programs that rely on thread safety but not on the details of its
synchronization.

 * <p> Retrieval operations (including <tt>get</tt>) generally do not
 * block, so may overlap with update operations (including
 * <tt>put</tt> and <tt>remove</tt>). Retrievals reflect the results
 * of the most recently <em>completed</em> update operations holding
 * upon their onset.  For aggregate operations such as <tt>putAll</tt>
 * and <tt>clear</tt>, concurrent retrievals may reflect insertion or
 * removal of only some entries.  Similarly, Iterators and
 * Enumerations return elements reflecting the state of the hash table
 * at some point at or since the creation of the iterator/enumeration.
 * They do <em>not</em> throw {@link ConcurrentModificationException}.
 * However, iterators are designed to be used by only one thread at a time.
 *
Retrieval operations (such as get) generally do not block, so they may overlap
with update operations (such as put and remove). A retrieval reflects the
results of the most recently completed update operations holding at its onset.
For aggregate operations such as putAll and clear, a concurrent retrieval may
reflect the insertion or removal of only some entries. Similarly, Iterators
and Enumerations return elements reflecting the state of the hash table at
some point at or since their creation; they do not throw
ConcurrentModificationException, but an iterator is designed to be used by
only one thread at a time.
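To make the weakly consistent behavior concrete, here is a minimal demo of my own (the class name and keys are illustrative, not from the JDK): mutating the map while an iterator is live is legal here, whereas a plain HashMap iterator would fail fast.

import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class WeaklyConsistentDemo {
    public static void main(String[] args) {
        Map<String, Integer> map = new ConcurrentHashMap<String, Integer>();
        map.put("a", 1);
        map.put("b", 2);
        Iterator<Map.Entry<String, Integer>> it = map.entrySet().iterator();
        // Legal here: the iterator never throws ConcurrentModificationException.
        // A HashMap iterator would fail fast on the next call to next().
        map.put("c", 3);
        map.remove("a");
        while (it.hasNext())
            // Reflects some state at or since the iterator's creation:
            // "c" may or may not appear, and "a" may still appear.
            System.out.println(it.next());
    }
}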
 * <p> The allowed concurrency among update operations is guided by
 * the optional <tt>concurrencyLevel</tt> constructor argument
 * (default <tt>16</tt>), which is used as a hint for internal sizing.  The
 * table is internally partitioned to try to permit the indicated
 * number of concurrent updates without contention. Because placement
 * in hash tables is essentially random, the actual concurrency will
 * vary.  Ideally, you should choose a value to accommodate as many
 * threads as will ever concurrently modify the table. Using a
 * significantly higher value than you need can waste space and time,
 * and a significantly lower value can lead to thread contention. But
 * overestimates and underestimates within an order of magnitude do
 * not usually have much noticeable impact. A value of one is
 * appropriate when it is known that only one thread will modify and
 * all others will only read. Also, resizing this or any other kind of
 * hash table is a relatively slow operation, so, when possible, it is
 * a good idea to provide estimates of expected table sizes in
 * constructors.
 *
The allowed concurrency among updates is guided by the optional
concurrencyLevel constructor argument (default 16), which is used as a hint
for internal sizing. The table is internally partitioned into segments, and
the number of segments is the number of updates that can proceed concurrently
without contention. Because placement in the hash table is essentially random,
the actual concurrency will vary. Ideally you should choose a value that
accommodates as many threads as will ever concurrently modify the table: a
value significantly higher than needed wastes space and time, and a
significantly lower one leads to thread contention, but over- and
underestimates within an order of magnitude usually have little noticeable
impact. A value of one is appropriate when it is known that only one thread
will modify the table and all others will only read. Also, resizing this or
any other kind of hash table is a relatively slow operation, so when possible
it is a good idea to give the constructor an estimate of the expected table
size.
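A quick sketch of acting on that advice (the workload numbers and class name are my own assumptions, not from the source): if we expect roughly 10000 entries and up to 32 concurrent writers, passing both estimates up front avoids later resizing and contention.

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class SizingHintDemo {
    public static void main(String[] args) {
        // Assumed workload: ~10000 entries, up to 32 writer threads.
        ConcurrentMap<String, Long> counters =
                new ConcurrentHashMap<String, Long>(10000, 0.75f, 32);
        counters.put("requests", 1L);
        System.out.println(counters);
    }
}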
 * <p>This class and its views and iterators implement all of the
 * <em>optional</em> methods of the {@link Map} and {@link Iterator}
 * interfaces.
 *
 ConcurrentHashMap and its views and iterators implement all of the optional
 methods of the Map and Iterator interfaces. Like Hashtable but unlike
 HashMap, ConcurrentHashMap does not allow null keys or values.
 * <p> Like {@link Hashtable} but unlike {@link HashMap}, this class
 * does <em>not</em> allow <tt>null</tt> to be used as a key or value.
 *
 * <p>This class is a member of the
 * <a href="{@docRoot}/../technotes/guides/collections/index.html">
 * Java Collections Framework</a>.
 *
 * @since 1.5
 * @author Doug Lea
 * @param <K> the type of keys maintained by this map
 * @param <V> the type of mapped values
 */
public class ConcurrentHashMap<K, V> extends AbstractMap<K, V>
        implements ConcurrentMap<K, V>, Serializable {
    private static final long serialVersionUID = 7249069246763182397L;

    /*
     * The basic strategy is to subdivide the table among Segments,
     * each of which itself is a concurrently readable hash table.  To
     * reduce footprint, all but one segments are constructed only
     * when first needed (see ensureSegment). To maintain visibility
     * in the presence of lazy construction, accesses to segments as
     * well as elements of segment's table must use volatile access,
     * which is done via Unsafe within methods segmentAt etc
     * below. These provide the functionality of AtomicReferenceArrays
     * but reduce the levels of indirection. Additionally,
     * volatile-writes of table elements and entry "next" fields
     * within locked operations use the cheaper "lazySet" forms of
     * writes (via putOrderedObject) because these writes are always
     * followed by lock releases that maintain sequential consistency
     * of table updates.
     *
     The basic strategy of ConcurrentHashMap is to subdivide the table among
     Segments, each of which is itself a concurrently readable hash table.
     To reduce the footprint, all segments except one are constructed only
     when first needed (see ensureSegment). To preserve visibility in the
     presence of this lazy construction, accesses to segments, like accesses
     to the elements of a segment's table, must use volatile semantics, which
     is done via Unsafe in methods such as segmentAt below. These provide the
     functionality of AtomicReferenceArrays while reducing the levels of
     indirection. In addition, volatile writes of table elements and of entry
     "next" fields inside locked operations use the cheaper "lazySet" form of
     write (via putOrderedObject), because these writes are always followed
     by a lock release, which maintains the sequential consistency of table
     updates.

     * Historical note: The previous version of this class relied
     * heavily on "final" fields, which avoided some volatile reads at
     * the expense of a large initial footprint.  Some remnants of
     * that design (including forced construction of segment 0) exist
     * to ensure serialization compatibility.
     */
     Historical note: the previous version of this class relied heavily on
     "final" fields, which avoided some volatile reads at the expense of a
     large initial footprint. Some remnants of that design (including the
     forced construction of segment 0) remain to ensure serialization
     compatibility.
    /* ---------------- Constants -------------- */

     /**
     * The default initial capacity for this table,
     * used when not otherwise specified in a constructor.
     */
     The default initial capacity of the table.
    static final int DEFAULT_INITIAL_CAPACITY = 16;

    /**
     * The default load factor for this table, used when not
     * otherwise specified in a constructor.
     */
     The default load factor.
    static final float DEFAULT_LOAD_FACTOR = 0.75f;

    /**
     * The default concurrency level for this table, used when not
     * otherwise specified in a constructor.
     */
     The default concurrency level, used when not otherwise specified in a constructor.
    static final int DEFAULT_CONCURRENCY_LEVEL = 16;

    /**
     * The maximum capacity, used if a higher value is implicitly
     * specified by either of the constructors with arguments.  MUST
     * be a power of two <= 1<<30 to ensure that entries are indexable
     * using ints.
     */
     The maximum capacity; must be a power of two <= 1<<30.
    static final int MAXIMUM_CAPACITY = 1 << 30;

    /**
     * The minimum capacity for per-segment tables.  Must be a power
     * of two, at least two to avoid immediate resizing on next use
     * after lazy construction.
     */
     The minimum capacity of a per-segment table: a power of two, at least 2, to avoid immediate resizing on the next use after lazy construction.
    static final int MIN_SEGMENT_TABLE_CAPACITY = 2;

    /**
     * The maximum number of segments to allow; used to bound
     * constructor arguments. Must be power of two less than 1 << 24.
     */
     The maximum number of segments allowed.
    static final int MAX_SEGMENTS = 1 << 16; // slightly conservative

    /**
     * Number of unsynchronized retries in size and containsValue
     * methods before resorting to locking. This is used to avoid
     * unbounded retries if tables undergo continuous modification
     * which would make it impossible to obtain an accurate result.
     */
     The number of unsynchronized retries the size and containsValue methods
     make before resorting to locking. This avoids unbounded retries when the
     table undergoes continuous modification, which would make it impossible
     to obtain an accurate result.
    static final int RETRIES_BEFORE_LOCK = 2;

    /* ---------------- Fields -------------- */

    /**
     * holds values which can't be initialized until after VM is booted.
     */
    private static class Holder {

        /**
        * Enable alternative hashing of String keys?
        *
        * <p>Unlike the other hash map implementations we do not implement a
        * threshold for regulating whether alternative hashing is used for
        * String keys. Alternative hashing is either enabled for all instances
        * or disabled for all instances.
        */
        static final boolean ALTERNATIVE_HASHING;

        static {
            // Use the "threshold" system property even though our threshold
            // behaviour is "ON" or "OFF".
            String altThreshold = java.security.AccessController.doPrivileged(
                new sun.security.action.GetPropertyAction(
                    "jdk.map.althashing.threshold"));

            int threshold;
            try {
                threshold = (null != altThreshold)
                        ? Integer.parseInt(altThreshold)
                        : Integer.MAX_VALUE;

                // disable alternative hashing if -1
                if (threshold == -1) {
                    threshold = Integer.MAX_VALUE;
                }

                if (threshold < 0) {
                    throw new IllegalArgumentException("value must be positive integer.");
                }
            } catch(IllegalArgumentException failed) {
                throw new Error("Illegal value for 'jdk.map.althashing.threshold'", failed);
            }
            ALTERNATIVE_HASHING = threshold <= MAXIMUM_CAPACITY;
        }
    }

    /**
     * A randomizing value associated with this instance that is applied to
     * hash code of keys to make hash collisions harder to find.
     */
     The hash seed.
    private transient final int hashSeed = randomHashSeed(this);
    // obtain a random hash seed for this ConcurrentHashMap instance
    private static int randomHashSeed(ConcurrentHashMap instance) {
        if (sun.misc.VM.isBooted() && Holder.ALTERNATIVE_HASHING) {
            return sun.misc.Hashing.randomHashSeed(instance);
        }

        return 0;
    }

    /**
     * Mask value for indexing into segments. The upper bits of a
     * key's hash code are used to choose the segment.
     */
     // Mask for indexing into segments; the upper bits of a key's hash code choose the segment.
    final int segmentMask;

    /**
     * Shift value for indexing within segments.
     */
     Shift applied to a hash when computing its segment index.
    final int segmentShift;

    /**
     * The segments, each of which is a specialized hash table.
     */
     Each segment is a specialized hash table.
    final Segment<K,V>[] segments;

    transient Set<K> keySet;                 // key set view
    transient Set<Map.Entry<K,V>> entrySet;  // entry set view
    transient Collection<V> values;          // values collection view
}
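As a sketch of how segmentShift and segmentMask cooperate, the demo below reproduces the sizing loop shown later in the constructor (SegmentIndexDemo is my own illustration; the real map additionally re-spreads the key's hashCode before using it):

public class SegmentIndexDemo {
    public static void main(String[] args) {
        int concurrencyLevel = 16;
        int sshift = 0, ssize = 1;
        while (ssize < concurrencyLevel) { ++sshift; ssize <<= 1; }
        int segmentShift = 32 - sshift;  // 28 for the default level of 16
        int segmentMask = ssize - 1;     // 15 for the default level of 16

        int h = "someKey".hashCode();    // the real map re-spreads this hash
        int j = (h >>> segmentShift) & segmentMask; // top bits pick the segment
        System.out.println("segment index = " + j);
    }
}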

Let's first look at HashEntry:
/**
     * ConcurrentHashMap list entry. Note that this is never exported
     * out as a user-visible Map.Entry.
     */
     // HashEntry is used internally and is never exposed as a user-visible Map.Entry.
    static final class HashEntry<K,V> {
        final int hash;   // hash of the key; final, so immutable
        final K key;      // the key; final, so immutable
        volatile V value;              // value is volatile for memory visibility
        volatile HashEntry<K,V> next;  // so is the next pointer

        HashEntry(int hash, K key, V value, HashEntry<K,V> next) {
            this.hash = hash;
            this.key = key;
            this.value = value;
            this.next = next;
        }

        /**
         * Sets next field with volatile write semantics.  (See above
         * about use of putOrderedObject.)
         */
	 // set the next pointer via UNSAFE.putOrderedObject (an ordered, "lazy" write)
        final void setNext(HashEntry<K,V> n) {
            UNSAFE.putOrderedObject(this, nextOffset, n);
        }

        // Unsafe mechanics
        static final sun.misc.Unsafe UNSAFE;
        static final long nextOffset;
        static {
            try {
                UNSAFE = sun.misc.Unsafe.getUnsafe();
                Class k = HashEntry.class;
                nextOffset = UNSAFE.objectFieldOffset
                    (k.getDeclaredField("next"));
            } catch (Exception e) {
                throw new Error(e);
            }
        }
    }
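putOrderedObject is the Unsafe counterpart of AtomicReference.lazySet: an ordered store without the full volatile-write fence. A tiny sketch of the same idiom (LazySetDemo is mine, not from the JDK source):

import java.util.concurrent.atomic.AtomicReference;

public class LazySetDemo {
    public static void main(String[] args) {
        AtomicReference<String> next = new AtomicReference<String>(null);
        // An ordered store without the full volatile-write fence. In the
        // Segment this is safe because setNext is always followed by
        // unlock(), and the unlock publishes the write to other threads.
        next.lazySet("newNode");
        System.out.println(next.get());
    }
}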

Now let's look at the Segment definition:
 
 /**
     * Segments are specialized versions of hash tables.  This
     * subclasses from ReentrantLock opportunistically, just to
     * simplify some locking and avoid separate construction.
     */
     // A Segment is a specialized version of a hash table. It subclasses
     // ReentrantLock opportunistically, just to simplify some locking and
     // avoid separate construction.
    static final class Segment<K,V> extends ReentrantLock implements Serializable {
        /*
         * Segments maintain a table of entry lists that are always
         * kept in a consistent state, so can be read (via volatile
         * reads of segments and tables) without locking.  This
         * requires replicating nodes when necessary during table
         * resizing, so the old lists can be traversed by readers
         * still using old version of table.
         *
	Segments maintain a table of entry lists that is always kept in a
	consistent state, so it can be read without locking (via volatile reads
	of segments and tables). This requires replicating nodes when necessary
	during resizing, so readers still using the old version of the table can
	continue to traverse the old lists.
         * This class defines only mutative methods requiring locking.
         * Except as noted, the methods of this class perform the
         * per-segment versions of ConcurrentHashMap methods.  (Other
         * methods are integrated directly into ConcurrentHashMap
         * methods.) These mutative methods use a form of controlled
         * spinning on contention via methods scanAndLock and
         * scanAndLockForPut. These intersperse tryLocks with
         * traversals to locate nodes.  The main benefit is to absorb
         * cache misses (which are very common for hash tables) while
         * obtaining locks so that traversal is faster once
         * acquired. We do not actually use the found nodes since they
         * must be re-acquired under lock anyway to ensure sequential
         * consistency of updates (and in any case may be undetectably
         * stale), but they will normally be much faster to re-locate.
         * Also, scanAndLockForPut speculatively creates a fresh node
         * to use in put if no node is found.
         */
        This class defines only the mutative methods, which require locking;
	except as noted, its methods are the per-segment versions of the
	ConcurrentHashMap methods (other methods are integrated directly into
	the ConcurrentHashMap methods). The mutative methods use a form of
	controlled spinning on contention, via scanAndLock and scanAndLockForPut,
	which intersperse tryLock calls with traversals that locate nodes. The
	main benefit is to absorb cache misses (very common for hash tables)
	while obtaining the lock, so that traversal is faster once it is
	acquired. The found nodes are not actually used, since they must be
	re-acquired under the lock anyway to ensure sequential consistency of
	updates (and may in any case be undetectably stale), but they are
	normally much faster to re-locate. Also, scanAndLockForPut speculatively
	creates a fresh node to use in put if no node is found.

        private static final long serialVersionUID = 2249069246763182397L;

        /**
         * The maximum number of times to tryLock in a prescan before
         * possibly blocking on acquire in preparation for a locked
         * segment operation. On multiprocessors, using a bounded
         * number of retries maintains cache acquired while locating
         * nodes.
         */
	 The maximum number of tryLock attempts in a prescan before blocking on
	 acquire, in preparation for a locked segment operation. On
	 multiprocessors, a bounded number of retries keeps the cache warm
	 while locating nodes.
        static final int MAX_SCAN_RETRIES =
            Runtime.getRuntime().availableProcessors() > 1 ? 64 : 1;

        /**
         * The per-segment table. Elements are accessed via
         * entryAt/setEntryAt providing volatile semantics.
         */
	The per-segment hash table; its elements are accessed through entryAt/setEntryAt, which provide volatile semantics.
        transient volatile HashEntry<K,V>[] table;

        /**
         * The number of elements. Accessed only either within locks
         * or among other volatile reads that maintain visibility.
         */
	 The number of HashEntries in this segment's table; accessed only
	 within locks, or among other volatile reads that maintain visibility.
        transient int count;

        /**
         * The total number of mutative operations in this segment.
         * Even though this may overflows 32 bits, it provides
         * sufficient accuracy for stability checks in CHM isEmpty()
         * and size() methods.  Accessed only either within locks or
         * among other volatile reads that maintain visibility.
         */
	The total number of mutative operations on this segment. It may
	overflow 32 bits, but it is accurate enough for the stability checks in
	CHM's isEmpty() and size() methods. Accessed only within locks, or
	among other volatile reads that maintain visibility.
        transient int modCount;

        /**
         * The table is rehashed when its size exceeds this threshold.
         * (The value of this field is always <tt>(int)(capacity *
         * loadFactor)</tt>.)
         */
	The table is rehashed when its size exceeds this threshold, which is always (int)(capacity * loadFactor).
        transient int threshold;

        /**
         * The load factor for the hash table.  Even though this value
         * is same for all segments, it is replicated to avoid needing
         * links to outer object.
         * @serial
         */
	 The load factor of the hash table. The value is the same for all
	 segments, but it is replicated into each one to avoid needing a link
	 to the outer object.
        final float loadFactor;
        // construct a Segment from a load factor, a threshold, and a table
        Segment(float lf, int threshold, HashEntry<K,V>[] tab) {
            this.loadFactor = lf;
            this.threshold = threshold;
            this.table = tab;
        }
}

From the above: a Segment has the same load factor and threshold as its
ConcurrentHashMap, and owns a hash table of its own.

Now let's look at Segment's put operation:
//Segment
        // If the key already exists and onlyIfAbsent is false, update the old
        // value; otherwise create a new HashEntry and add it to the table.
        final V put(K key, int hash, V value, boolean onlyIfAbsent) {
            // try to acquire the lock: node is null on success, otherwise
            // scanAndLockForPut scans the chain while acquiring the lock
            HashEntry<K,V> node = tryLock() ? null :
                scanAndLockForPut(key, hash, value);
            V oldValue;
            try {
                HashEntry<K,V>[] tab = table;
		// compute the index into the table
                int index = (tab.length - 1) & hash;
		// first HashEntry of the chain at that index
                HashEntry<K,V> first = entryAt(tab, index);
		// traverse the HashEntry chain at the index
                for (HashEntry<K,V> e = first;;) {
                    if (e != null) {
		        // non-null node: check whether it holds the key
                        K k;
                        if ((k = e.key) == key ||
                            (e.hash == hash && key.equals(k))) {
                            oldValue = e.value;
                            if (!onlyIfAbsent) {
			        // key found and onlyIfAbsent is false:
			        // overwrite the old value
                                e.value = value;
				// modification count + 1
                                ++modCount;
                            }
                            break;
                        }
                        e = e.next;
                    }
                    else {
                        if (node != null)
			    // a node was pre-created by scanAndLockForPut:
			    // link it in at the head of the chain
                            node.setNext(first);
                        else
			   // otherwise create a new HashEntry at the head,
			   // with next pointing to the old head
                            node = new HashEntry<K,V>(hash, key, value, first);
                        int c = count + 1;
                        if (c > threshold && tab.length < MAXIMUM_CAPACITY)
			    // the entry count exceeds the threshold and the
			    // table is below the maximum capacity: rehash
                            rehash(node);
                        else
			    // publish the node at the head of the chain
                            setEntryAt(tab, index, node);
                        ++modCount;
                        count = c;
                        oldValue = null;
                        break;
                    }
                }
            } finally {
                unlock();
            }
            return oldValue;
        }

Segment's put has the following points worth a closer look:
final V put(K key, int hash, V value, boolean onlyIfAbsent)

1.
// try to acquire the lock: node is null on success, else scanAndLockForPut
 HashEntry<K,V> node = tryLock() ? null :
     scanAndLockForPut(key, hash, value);


2.
 HashEntry<K,V>[] tab = table;
// compute the index into the table
int index = (tab.length - 1) & hash;
// first HashEntry of the chain at that index
HashEntry<K,V> first = entryAt(tab, index);


3.
 if (c > threshold && tab.length < MAXIMUM_CAPACITY)
     // entry count exceeds the threshold, table below maximum capacity: rehash
     rehash(node);

4.
else
// publish node at the head of the chain at the index
   setEntryAt(tab, index, node);


Let's take them one at a time.
1.
// try to acquire the lock: node is null on success, else scanAndLockForPut
 HashEntry<K,V> node = tryLock() ? null :
     scanAndLockForPut(key, hash, value);


//Segment
        /**
         * Scans for a node containing given key while trying to
         * acquire lock, creating and returning one if not found. Upon
         * return, guarantees that lock is held. Unlike in most
         * methods, calls to method equals are not screened: Since
         * traversal speed doesn't matter, we might as well help warm
         * up the associated code and accesses as well.
         * 
	 While trying to acquire the lock, scan the chain to see whether the
	 key exists; if it does not, speculatively create a new HashEntry and
	 return it. On return, the lock is guaranteed to be held.
         * @return a new node if key not found, else null
         */
        private HashEntry<K,V> scanAndLockForPut(K key, int hash, V value) {
	   // get the first HashEntry of the chain for this segment and hash
            HashEntry<K,V> first = entryForHash(this, hash);
            HashEntry<K,V> e = first;
            HashEntry<K,V> node = null;
            int retries = -1; // negative while locating node
	    // spin while the lock cannot be acquired
            while (!tryLock()) {
                HashEntry<K,V> f; // to recheck first below
                if (retries < 0) {
		    // still locating the node; if the chain is exhausted, speculatively create one
                    if (e == null) {
                        if (node == null) // speculatively create node
                            node = new HashEntry<K,V>(hash, key, value, null);
                        retries = 0;
                    }
                    else if (key.equals(e.key))
                        retries = 0;
                    else
                        e = e.next;
                }
                else if (++retries > MAX_SCAN_RETRIES) {
		   // retries exceed the maximum: block on lock() and stop scanning
                    lock();
                    break;
                }
                else if ((retries & 1) == 0 &&
                         (f = entryForHash(this, hash)) != first) {
                    e = first = f; // re-traverse if entry changed
                    retries = -1;
                }
            }
            return node;
        }

    /* Gets the table entry for the given segment and hash
     */
    // Return the first HashEntry of the chain for the given segment and hash.
    @SuppressWarnings("unchecked")
    static final <K,V> HashEntry<K,V> entryForHash(Segment<K,V> seg, int h) {
        HashEntry<K,V>[] tab;
        return (seg == null || (tab = seg.table) == null) ? null :
            (HashEntry<K,V>) UNSAFE.getObjectVolatile
            (tab, ((long)(((tab.length - 1) & h)) << TSHIFT) + TBASE);
    }

scanAndLockForPut therefore does the following: while the lock cannot be
acquired, it traverses the chain to check whether the key exists; if it does
not, it speculatively creates a new HashEntry and returns it. On return, the
lock is guaranteed to be held.
2.
HashEntry<K,V>[] tab = table;
// compute the index into the table
int index = (tab.length - 1) & hash;
// first HashEntry of the chain at that index
HashEntry<K,V> first = entryAt(tab, index);



 
  /**
     * Gets the ith element of given table (if nonnull) with volatile
     * read semantics. Note: This is manually integrated into a few
     * performance-sensitive methods to reduce call overhead.
     */
    @SuppressWarnings("unchecked")
    // Return the first HashEntry of the chain at index i of the given table.
    static final <K,V> HashEntry<K,V> entryAt(HashEntry<K,V>[] tab, int i) {
        return (tab == null) ? null :
            (HashEntry<K,V>) UNSAFE.getObjectVolatile
            (tab, ((long)i << TSHIFT) + TBASE);
    }
3.
 if (c > threshold && tab.length < MAXIMUM_CAPACITY)
     // entry count exceeds the threshold, table below maximum capacity: rehash
     rehash(node);
 /**
         * Doubles size of table and repacks entries, also adding the
         * given node to new table
         */
        @SuppressWarnings("unchecked")
        private void rehash(HashEntry<K,V> node) {
            /*
             * Reclassify nodes in each list to new table.  Because we
             * are using power-of-two expansion, the elements from
             * each bin must either stay at same index, or move with a
             * power of two offset. We eliminate unnecessary node
             * creation by catching cases where old nodes can be
             * reused because their next fields won't change.
             * Statistically, at the default threshold, only about
             * one-sixth of them need cloning when a table
             * doubles. The nodes they replace will be garbage
             * collectable as soon as they are no longer referenced by
             * any reader thread that may be in the midst of
             * concurrently traversing table. Entry accesses use plain
             * array indexing because they are followed by volatile
             * table write.
             */
            HashEntry<K,V>[] oldTable = table;
            int oldCapacity = oldTable.length;
	    // double the capacity and recompute the threshold
            int newCapacity = oldCapacity << 1;
            threshold = (int)(newCapacity * loadFactor);
            HashEntry<K,V>[] newTable =
                (HashEntry<K,V>[]) new HashEntry[newCapacity];
            int sizeMask = newCapacity - 1;
	    // walk the old table, relinking each HashEntry into its chain in the new table
            for (int i = 0; i < oldCapacity ; i++) {
                HashEntry<K,V> e = oldTable[i];
                if (e != null) {
                    HashEntry<K,V> next = e.next;
		    // recompute this HashEntry's index in the new table
                    int idx = e.hash & sizeMask;
                    if (next == null)   //  Single node on list
                        newTable[idx] = e;
                    else { // Reuse consecutive sequence at same slot
                        HashEntry<K,V> lastRun = e;
                        int lastIdx = idx;
                        for (HashEntry<K,V> last = next;
                             last != null;
                             last = last.next) {
                            int k = last.hash & sizeMask;
                            if (k != lastIdx) {
                                lastIdx = k;
                                lastRun = last;
                            }
                        }
                        newTable[lastIdx] = lastRun;
                        // Clone remaining nodes
                        for (HashEntry<K,V> p = e; p != lastRun; p = p.next) {
                            V v = p.value;
                            int h = p.hash;
                            int k = h & sizeMask;
                            HashEntry<K,V> n = newTable[k];
                            newTable[k] = new HashEntry<K,V>(h, p.key, v, n);
                        }
                    }
                }
            }
            int nodeIndex = node.hash & sizeMask; // add the new node
            node.setNext(newTable[nodeIndex]);
            newTable[nodeIndex] = node;
            table = newTable;
        }
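Because the capacity is a power of two, doubling it means each node either keeps its index or moves up by exactly oldCapacity, so one bin splits into at most two. The trailing run of nodes that all land in the same new bin (lastRun) can therefore be reused without cloning. A small sketch of the index arithmetic (RehashIndexDemo and the sample hashes are my own):

public class RehashIndexDemo {
    public static void main(String[] args) {
        int oldCapacity = 4, newCapacity = 8;  // doubling
        int sizeMask = newCapacity - 1;
        // All four hashes share old bin 1 (hash & 3 == 1), but the new
        // index depends on one extra hash bit:
        for (int hash : new int[] {1, 5, 9, 13}) {
            System.out.println("hash " + hash + ": old index "
                    + (hash & (oldCapacity - 1)) + " -> new index "
                    + (hash & sizeMask));
        }
        // prints new indices 1, 5, 1, 5: the bin splits into bins 1 and 5,
        // which is why a trailing run with one common new index (lastRun)
        // can be moved as a whole.
    }
}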

4.
else
// publish node at the head of the chain at the index
   setEntryAt(tab, index, node);

 /**
     * Sets the ith element of given table, with volatile write
     * semantics. (See above about use of putOrderedObject.)
     */
     // Publish HashEntry e at index i of the given table with an ordered write.
    static final <K,V> void setEntryAt(HashEntry<K,V>[] tab, int i,
                                       HashEntry<K,V> e) {
        UNSAFE.putOrderedObject(tab, ((long)i << TSHIFT) + TBASE, e);
    }

Putting the four steps together: Segment's put first tries to acquire the
lock. If that fails, scanAndLockForPut locates the key's index in the
segment's hash table and traverses the HashEntry chain there while retrying
the lock, pre-creating a HashEntry if none holds the key; once the retry count
exceeds MAX_SCAN_RETRIES (64 on multiprocessors, 1 otherwise) it blocks on
lock(). Either way, the put itself always runs while holding the lock. It then
locates the key's index in the segment table and links the entry at the head
of the chain; if the resulting size exceeds the threshold, the table is
rehashed into a new table of twice the original capacity.
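At the map level, put and putIfAbsent both funnel into this Segment.put, differing only in the onlyIfAbsent flag; a quick usage check (PutDemo is my own illustration):

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class PutDemo {
    public static void main(String[] args) {
        ConcurrentMap<String, Integer> map = new ConcurrentHashMap<String, Integer>();
        map.put("k", 1);          // Segment.put(..., onlyIfAbsent = false)
        map.putIfAbsent("k", 2);  // Segment.put(..., onlyIfAbsent = true): no-op here
        System.out.println(map.get("k")); // 1
    }
}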

Next, Segment's remove operation:
//Segment
        /**
         * Remove; match on key only if value null, else match both.
         */
	 Remove the HashEntry whose key matches (and, if value is non-null, whose value matches too).
        final V remove(Object key, int hash, Object value) {
	     // if tryLock fails, scanAndLock traverses the key's chain while
	     // retrying; on return the lock is held
            if (!tryLock())
                scanAndLock(key, hash);
            V oldValue = null;
            try {
                HashEntry<K,V>[] tab = table;
		// locate the index of the key's HashEntry chain in the table
                int index = (tab.length - 1) & hash;
                HashEntry<K,V> e = entryAt(tab, index);
                HashEntry<K,V> pred = null;
		// if a chain exists at the index, traverse it and unlink the matching HashEntry
                while (e != null) {
                    K k;
                    HashEntry<K,V> next = e.next;
                    if ((k = e.key) == key ||
                        (e.hash == hash && key.equals(k))) {
                        V v = e.value;
                        if (value == null || value == v || value.equals(v)) {
                            if (pred == null)
                                setEntryAt(tab, index, next);
                            else
                                pred.setNext(next);
                            ++modCount;
                            --count;
                            oldValue = v;
                        }
                        break;
                    }
                    pred = e;
                    e = next;
                }
            } finally {
                unlock();
            }
            return oldValue;
        }

One point in remove deserves attention:
 // if tryLock fails, scanAndLock scans the key's chain while acquiring the lock
            if (!tryLock())
                scanAndLock(key, hash);

Here is scanAndLock:
//Segment
        /**
         * Scans for a node containing the given key while trying to
         * acquire lock for a remove or replace operation. Upon
         * return, guarantees that lock is held.  Note that we must
         * lock even if the key is not found, to ensure sequential
         * consistency of updates.
         */
	// While trying to acquire the lock for a remove or replace, scan the key's chain; nothing is removed here, but on return the lock is guaranteed to be held.
        private void scanAndLock(Object key, int hash) {
            // similar to but simpler than scanAndLockForPut
	    // get the head of the key's HashEntry chain in the segment table
            HashEntry<K,V> first = entryForHash(this, hash);
            HashEntry<K,V> e = first;
            int retries = -1;
	    // spin: traverse the chain while tryLock fails; block on lock() once retries exceed MAX_SCAN_RETRIES
            while (!tryLock()) {
                HashEntry<K,V> f;
                if (retries < 0) {
                    if (e == null || key.equals(e.key))
                        retries = 0;
                    else
                        e = e.next;
                }
                else if (++retries > MAX_SCAN_RETRIES) {
                    lock();
                    break;
                }
                else if ((retries & 1) == 0 &&
                         (f = entryForHash(this, hash)) != first) {
                    e = first = f;
                    retries = -1;
                }
            }
        }

From the above: Segment's remove first calls tryLock, and on failure spins in
scanAndLock, traversing the key's chain while retrying (which warms the cache)
and blocking on lock() once the retries exceed MAX_SCAN_RETRIES. Once the lock
is held, remove locates the key's chain in the segment table, traverses it,
and unlinks the matching HashEntry if one is found.
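At the map level this backs both remove(key) (value passed as null, so only the key must match) and the conditional remove(key, value); a quick check (RemoveDemo is my own illustration):

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class RemoveDemo {
    public static void main(String[] args) {
        ConcurrentMap<String, Integer> map = new ConcurrentHashMap<String, Integer>();
        map.put("k", 1);
        // value supplied: only removes when the current mapping matches
        System.out.println(map.remove("k", 2)); // false, entry survives
        System.out.println(map.remove("k", 1)); // true, entry removed
    }
}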

Now, Segment's replace operations:
//Segment
        final boolean replace(K key, int hash, V oldValue, V newValue) {
            if (!tryLock())
                scanAndLock(key, hash);
            boolean replaced = false;
            try {
                HashEntry<K,V> e;
                for (e = entryForHash(this, hash); e != null; e = e.next) {
                    K k;
                    if ((k = e.key) == key ||
                        (e.hash == hash && key.equals(k))) {
                        if (oldValue.equals(e.value)) {
                            e.value = newValue;
                            ++modCount;
                            replaced = true;
                        }
                        break;
                    }
                }
            } finally {
                unlock();
            }
            return replaced;
        }
//Segment
        final V replace(K key, int hash, V value) {
            if (!tryLock())
                scanAndLock(key, hash);
            V oldValue = null;
            try {
                HashEntry<K,V> e;
                for (e = entryForHash(this, hash); e != null; e = e.next) {
                    K k;
                    if ((k = e.key) == key ||
                        (e.hash == hash && key.equals(k))) {
                        oldValue = e.value;
                        e.value = value;
                        ++modCount;
                        break;
                    }
                }
            } finally {
                unlock();
            }
            return oldValue;
        }

Both replace(K key, int hash, V value) and replace(K key, int hash, V oldValue, V newValue) follow the same locking pattern as remove, so there is little new to say: acquire the lock (via scanAndLock if tryLock fails), traverse the key's chain, and, if a matching entry is found (and, for the four-argument form, the old value also matches), overwrite the value in place and increment the modification count.
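A quick check of both forms at the map level (ReplaceDemo is my own illustration): the two-argument replace is unconditional and returns the old value, while the three-argument form is a compare-and-set on the value.

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class ReplaceDemo {
    public static void main(String[] args) {
        ConcurrentMap<String, Integer> map = new ConcurrentHashMap<String, Integer>();
        map.put("k", 1);
        System.out.println(map.replace("k", 2, 9)); // false: old value mismatch
        System.out.println(map.replace("k", 1, 9)); // true: CAS-style replace
        System.out.println(map.replace("k", 7));    // 9: unconditional, returns old value
    }
}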

Next, Segment's clear:
//Segment
        final void clear() {
            lock();
            try {
                HashEntry<K,V>[] tab = table;
                for (int i = 0; i < tab.length ; i++)
		   // null out the HashEntry chain at index i of the segment table
                    setEntryAt(tab, i, null);
                ++modCount;
                count = 0;
            } finally {
                unlock();
            }
        }

As the code shows, clear holds the segment's lock while it sweeps the entire table.

Finally, the ConcurrentHashMap constructors:
/**
     * Creates a new, empty map with a default initial capacity (16),
     * load factor (0.75) and concurrencyLevel (16).
     */
    public ConcurrentHashMap() {
        this(DEFAULT_INITIAL_CAPACITY, DEFAULT_LOAD_FACTOR, DEFAULT_CONCURRENCY_LEVEL);
    }

    /**
     * Creates a new, empty map with the specified initial capacity,
     * and with default load factor (0.75) and concurrencyLevel (16).
     *
     * @param initialCapacity the initial capacity. The implementation
     * performs internal sizing to accommodate this many elements.
     * @throws IllegalArgumentException if the initial capacity of
     * elements is negative.
     */
    public ConcurrentHashMap(int initialCapacity) {
        this(initialCapacity, DEFAULT_LOAD_FACTOR, DEFAULT_CONCURRENCY_LEVEL);
    }
     /**
     * Creates a new, empty map with the specified initial capacity
     * and load factor and with the default concurrencyLevel (16).
     *
     * @param initialCapacity The implementation performs internal
     * sizing to accommodate this many elements.
     * @param loadFactor  the load factor threshold, used to control resizing.
     * Resizing may be performed when the average number of elements per
     * bin exceeds this threshold.
     * @throws IllegalArgumentException if the initial capacity of
     * elements is negative or the load factor is nonpositive
     *
     * @since 1.6
     */
    public ConcurrentHashMap(int initialCapacity, float loadFactor) {
        this(initialCapacity, loadFactor, DEFAULT_CONCURRENCY_LEVEL);
    }
     /**
     * Creates a new, empty map with the specified initial
     * capacity, load factor and concurrency level.
     *
     * @param initialCapacity the initial capacity. The implementation
     * performs internal sizing to accommodate this many elements.
     * @param loadFactor  the load factor threshold, used to control resizing.
     * Resizing may be performed when the average number of elements per
     * bin exceeds this threshold.
     * @param concurrencyLevel the estimated number of concurrently
     * updating threads. The implementation performs internal sizing
     * to try to accommodate this many threads.
     * @throws IllegalArgumentException if the initial capacity is
     * negative or the load factor or concurrencyLevel are
     * nonpositive.
     */
    @SuppressWarnings("unchecked")
    public ConcurrentHashMap(int initialCapacity,
                             float loadFactor, int concurrencyLevel) {
	// invalid arguments: throw IllegalArgumentException
        if (!(loadFactor > 0) || initialCapacity < 0 || concurrencyLevel <= 0)
            throw new IllegalArgumentException();
        if (concurrencyLevel > MAX_SEGMENTS)
            concurrencyLevel = MAX_SEGMENTS;
        // Find power-of-two sizes best matching arguments
        int sshift = 0;
        int ssize = 1;
	// ssize: the number of segments, the smallest power of two >= concurrencyLevel
        while (ssize < concurrencyLevel) {
            ++sshift;
            ssize <<= 1;
        }
        this.segmentShift = 32 - sshift; // shift selecting the high bits of the hash
        this.segmentMask = ssize - 1;    // mask for the segment index
        if (initialCapacity > MAXIMUM_CAPACITY)
            initialCapacity = MAXIMUM_CAPACITY;
	// compute the capacity of each segment's table
        int c = initialCapacity / ssize;
        if (c * ssize < initialCapacity)
            ++c;
        int cap = MIN_SEGMENT_TABLE_CAPACITY;
        while (cap < c)
            cap <<= 1;
        // create segments and segments[0]
	// create segment 0 eagerly
        Segment<K,V> s0 =
            new Segment<K,V>(loadFactor, (int)(cap * loadFactor),
                             (HashEntry<K,V>[])new HashEntry[cap]);
	// create the segment array
        Segment<K,V>[] ss = (Segment<K,V>[])new Segment[ssize];
        UNSAFE.putOrderedObject(ss, SBASE, s0); // ordered write of segments[0]
        this.segments = ss;
    }
The constructor mainly computes the segment shift, the segment mask, and each segment's capacity and threshold, then creates segment 0 and the segment array. The remaining segments are created lazily, using segment 0's capacity, load factor, and threshold as the prototype.
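A small sketch that traces this arithmetic for assumed arguments initialCapacity = 64 and concurrencyLevel = 16 (CtorSizingDemo is my own): it prints 16 segments, segmentShift 28, segmentMask 15, and a per-segment capacity of 4.

public class CtorSizingDemo {
    public static void main(String[] args) {
        int initialCapacity = 64, concurrencyLevel = 16;
        int sshift = 0, ssize = 1;
        while (ssize < concurrencyLevel) { ++sshift; ssize <<= 1; }
        System.out.println("segments = " + ssize          // 16
                + ", segmentShift = " + (32 - sshift)     // 28
                + ", segmentMask = " + (ssize - 1));      // 15
        int c = initialCapacity / ssize;                  // 4
        if (c * ssize < initialCapacity) ++c;
        int cap = 2;                                      // MIN_SEGMENT_TABLE_CAPACITY
        while (cap < c) cap <<= 1;
        System.out.println("per-segment table capacity = " + cap); // 4
    }
}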
Summary:
ConcurrentHashMap is thread-safe and supports concurrent access; neither keys nor values may be null. The default capacity is 16, the default load factor 0.75, and the default concurrency level 16. Internally it holds a Segment array (16 segments by default); each Segment holds a HashEntry array analogous to HashMap's table and extends ReentrantLock, so every operation that mutates a segment's hash table runs under that segment's lock.
Segment's put first tries to acquire the lock; on failure it spins in scanAndLockForPut, traversing the HashEntry chain at the key's index and pre-creating a HashEntry if none holds the key, then blocks on lock() once the retries exceed MAX_SCAN_RETRIES (64 on multiprocessors, 1 otherwise). The put itself always runs under the lock: it links the entry at the head of the chain for the key's index, and if the new size exceeds the threshold the table is rehashed into one of twice the capacity. Segment's remove follows the same pattern via scanAndLock: acquire the lock (spinning, then blocking), locate the key's chain, traverse it, and unlink the matching HashEntry if found. replace works like remove, except that it overwrites the value in place and increments the modification count.
put, remove and replace lock only the segment that owns the key's chain, while clear holds a segment's lock across its entire table sweep. By spreading the HashEntries over separate Segments, each with its own lock, ConcurrentHashMap achieves safe concurrent access.

We will cover the rest of ConcurrentHashMap in the next post.
ConcurrentHashMap analysis, continued: http://donald-draper.iteye.com/blog/2363201


