Thread-Safe Collections

The "java.util.concurrent" package supplies efficient implementations for maps, sorted sets, and queues: "ConcurrentHashMap", "ConcurrentSkipListMap", "ConcurrentSkipListSet", and "ConcurrentLinkedQueue".

Unlike most collections, the "size" method of these classes does not necessarily operate in constant time. Determining the current size of one of these collections usually requires traversal.

NOTE: Some applications use humongous concurrent hash maps, so large that the "size" method is insufficient because it returns an "int". What is one to do with a map that has more than a billion entries? Java SE 8 introduces a "mappingCount" method that returns the size as a "long".

The collections return weakly consistent iterators. That means the iterators may or may not reflect all modifications that are made after they were constructed, but they will not return a value twice and they will not throw a "ConcurrentModificationException".

The concurrent hash map can efficiently support a large number of readers and a fixed number of writers. By default, it is assumed that there are up to 16 simultaneous writer threads. There can be many more writer threads, but if more than 16 write at the same time, the others are temporarily blocked. You can specify a higher number in the constructor, but it is unlikely that you will need to.

Atomic Update of Map Entries

Suppose we want to count how often certain features are observed. As a simple example, suppose multiple threads encounter words, and we want to count their frequencies. Can we use a "ConcurrentHashMap<String, Long>"? Consider the code for incrementing a count. Obviously, the following is not thread-safe:

Long oldValue = map.get(word);
Long newValue = oldValue == null ? 1 : oldValue + 1;
map.put(word, newValue);  // Error -- might not replace oldValue.

Another thread may be updating the exact same count at the same time. A classic trick is to use the "replace" operation, which atomically replaces an old value with a new one, provided that no other thread has come before and replaced the old value with something else. You have to keep doing it until "replace" succeeds:

do {
    oldValue = map.get(word);
    newValue = oldValue == null ? 1 : oldValue + 1;
    // "replace" rejects null arguments, so an absent key must be inserted with "putIfAbsent".
} while (oldValue == null
    ? map.putIfAbsent(word, newValue) != null
    : !map.replace(word, oldValue, newValue));

Alternatively, you can use a "ConcurrentHashMap<String, AtomicLong>" or, with Java SE 8, a "ConcurrentHashMap<String, LongAdder>". Then the update code is:

map.putIfAbsent(word, new LongAdder());
map.get(word).increment();

NOTE: Some programmers are surprised that a supposedly thread-safe data structure permits operations that are not thread-safe. But there are two entirely different considerations. If multiple threads modify a plain "HashMap", they can destroy the internal data structure (an array of linked lists). Some of the links may go missing, or even go in circles, rendering the data structure unusable. That will never happen with a "ConcurrentHashMap". In the example above, the code of "get" and "put" will never corrupt the data structure. But, since the sequence of operations is not atomic, the result is not predictable.

Java SE 8 provides methods that make atomic updates more convenient. The "compute" method is called with a key and a function to compute the new value. For example, here is how we can update a map of integer counters:

map.compute(word, (k, v) -> v == null ? 1 : v + 1);
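As a minimal sketch of this idea, the "compute"-based counter might be driven from several threads like this (the class name, word list, and use of a parallel stream are illustrative, not from the original):

```java
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;

public class ComputeCount {
    // Count word frequencies; compute makes each read-modify-write atomic.
    static ConcurrentHashMap<String, Long> count(List<String> words) {
        ConcurrentHashMap<String, Long> map = new ConcurrentHashMap<>();
        words.parallelStream().forEach(word ->
            map.compute(word, (k, v) -> v == null ? 1 : v + 1));
        return map;
    }

    public static void main(String[] args) {
        System.out.println(count(List.of("to", "be", "or", "not", "to", "be")));
    }
}
```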

NOTE: You cannot have "null" values in a "ConcurrentHashMap". There are many methods that use a "null" value as an indication that a given key is not present in the map.

There are also variants "computeIfPresent" and "computeIfAbsent" that only compute a new value when there is already an old value, or when there isn't yet one. A map of "LongAdder" counters can be updated with:

map.computeIfAbsent(word, k -> new LongAdder()).increment();

That is almost like the call to "putIfAbsent" that you saw before, but the "LongAdder" constructor is only called when a new counter is actually needed.
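Here is a small, hedged sketch of the "computeIfAbsent" pattern under contention (the class name and thread setup are invented for illustration):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

public class AdderCount {
    static final ConcurrentHashMap<String, LongAdder> counts = new ConcurrentHashMap<>();

    static void record(String word) {
        // The lambda runs, and a LongAdder is allocated, only when word has no counter yet.
        counts.computeIfAbsent(word, k -> new LongAdder()).increment();
    }

    public static void main(String[] args) throws InterruptedException {
        Runnable task = () -> { for (int i = 0; i < 1000; i++) record("java"); };
        Thread t1 = new Thread(task), t2 = new Thread(task);
        t1.start(); t2.start();
        t1.join(); t2.join();
        System.out.println(counts.get("java").sum()); // prints 2000
    }
}
```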

You often need to do something special when a key is added for the first time. The "merge" method makes this particularly convenient. It has a parameter for the initial value that is used when the key is not yet present. Otherwise, the function that you supplied is called, combining the existing value and the initial value. (Unlike "compute", the function does not process the key.)

map.merge(word, 1L, (existingValue, initialValue) -> existingValue + initialValue);

or, more concisely:

map.merge(word, 1L, Long::sum);
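A quick illustration of the two cases of "merge" (the map contents here are made up):

```java
import java.util.concurrent.ConcurrentHashMap;

public class MergeCount {
    public static void main(String[] args) {
        ConcurrentHashMap<String, Long> map = new ConcurrentHashMap<>();
        map.merge("java", 1L, Long::sum); // key absent: the initial value 1 is stored
        map.merge("java", 1L, Long::sum); // key present: Long.sum(1, 1) is stored
        System.out.println(map.get("java")); // prints 2
    }
}
```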

NOTE: If the function that is passed to "compute" or "merge" returns "null", the existing entry is removed from the map.

CAUTION: When you use "compute" or "merge", keep in mind that the function that you supply should not do much work. While that function runs, some other updates to the map may be blocked. Of course, that function should also not update other parts of the map.

Bulk Operations on Concurrent Hash Maps

Java SE 8 provides bulk operations on concurrent hash maps that can safely execute even while other threads operate on the map. The bulk operations traverse the map and operate on the elements they find as they go along. No effort is made to freeze a snapshot of the map in time. Unless you happen to know that the map was not modified while the bulk operation ran, you should treat its result as an approximation of the map's state. There are three kinds of operations:

"search" applies a function to each key and/or value, until the function yields a non-null result. Then the search terminates and the function result is returned.

"reduce" combines all key and/or values, using a provided accumulation function.

"forEach" applies a function to all keys and/or values.

Each operation has four versions:

"operationKeys": operates on keys (such as "searchKeys", "reduceKeys", "forEachKey")

"operationValues": operates on values (such as "searchValues", "reduceValues", "forEachValue")

"operation": operates on keys and values

"operationEntries": operates on "Map.Entry" objects (such as "searchEntries", "reduceEntries", "forEachEntry")

With each of the operations, you need to specify a "parallelism threshold". If the map contains more elements than the threshold, the bulk operation is parallelized. If you want the bulk operation to run in a single thread, use a threshold of "Long.MAX_VALUE". If you want the maximum number of threads to be made available for the bulk operation, use a threshold of 1.

Let's look at the "search" methods first. Here are the versions:

U searchKeys(long threshold, Function<? super K, ? extends U> f)
U searchValues(long threshold, Function<? super V, ? extends U> f)
U search(long threshold, BiFunction<? super K, ? super V, ? extends U> f)
U searchEntries(long threshold, Function<Map.Entry<K, V>, ? extends U> f)

For example, suppose we want to find the first word that occurs more than 1000 times. We need to search keys and values:

String result = map.search(threshold, (k, v) -> v > 1000 ? k : null);

Then "result" is set to the first match, or to "null" if the search function returns "null" for all inputs.
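As a self-contained sketch of such a search (the class name, helper method, and sample counts are invented; "Long.MAX_VALUE" is used as the threshold to keep the search single-threaded and deterministic):

```java
import java.util.concurrent.ConcurrentHashMap;

public class SearchFrequent {
    // Return some word whose count exceeds limit, or null if there is none.
    static String frequent(ConcurrentHashMap<String, Long> map, long limit) {
        return map.search(Long.MAX_VALUE, (k, v) -> v > limit ? k : null);
    }

    public static void main(String[] args) {
        ConcurrentHashMap<String, Long> map = new ConcurrentHashMap<>();
        map.put("the", 5000L);
        map.put("rare", 3L);
        System.out.println(frequent(map, 1000)); // prints the
    }
}
```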

The "forEach" method has two variants. The first one simply applies a "consumer" function to each map entry, for example:

map.forEach(threshold, (k, v) -> System.out.println(k + " -> " + v));

The second variant takes an additional "transformer" function, which is applied first, and its result is passed to the consumer:

map.forEach(threshold,
    (k, v) -> k + " -> " + v,   // Transformer
    System.out::println);  // Consumer

The transformer can be used as a filter. Whenever the transformer returns "null", the value is silently skipped. For example, here we only print the entries with large values:

map.forEach(threshold, 
    (k, v) -> v > 1000 ? v : null, // Filter and Transformer
    System.out::println);  // The "null"s are not passed to the consumer

The "reduce" operations combine their inputs with an accumulation function. For example, here is how you can compute the sum of all values:

Long sum = map.reduceValues(threshold, Long::sum);

As with "forEach", you can also supply a transformer function. Here we compute the length of the longest key:

Integer maxLength = map.reduceKeys(threshold, String::length, Integer::max);

The transformer can act as a filter, by returning "null" to exclude unwanted inputs. Here, we count how many entries have a value greater than 1000:

Long count = map.reduceValues(threshold, v -> v > 1000 ? 1L : null, Long::sum);

NOTE: If the map is empty, or all entries have been filtered out, the "reduce" operation returns "null". If there is only one element, its transformation is returned, and the accumulator is not applied.
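To make the filtering reduction concrete, here is a minimal sketch (the class name, helper method, and sample data are invented; the threshold "Long.MAX_VALUE" keeps the reduction single-threaded):

```java
import java.util.concurrent.ConcurrentHashMap;

public class CountLarge {
    // Count the entries whose value exceeds 1000; null transformer results are skipped.
    static Long countLarge(ConcurrentHashMap<String, Long> map) {
        return map.reduceValues(Long.MAX_VALUE, v -> v > 1000 ? 1L : null, Long::sum);
    }

    public static void main(String[] args) {
        ConcurrentHashMap<String, Long> map = new ConcurrentHashMap<>();
        map.put("a", 2000L);
        map.put("b", 10L);
        map.put("c", 1500L);
        System.out.println(countLarge(map)); // prints 2
    }
}
```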

There are specializations for "int", "long", and "double" outputs with suffixes "ToInt", "ToLong", and "ToDouble". You need to transform the input to a primitive value and specify a default value and an accumulator function. The default value is returned when the map is empty.

long sum = map.reduceValuesToLong(threshold,
    Long::longValue,  // Transformer to primitive type
    0,  // Default value for empty map
    Long::sum);  // Primitive type accumulator

CAUTION: These specializations act differently from the object versions when there is only one element to be considered. Instead of returning the transformed element, it is accumulated with the default. Therefore, the default must be the neutral element of the accumulator.
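A runnable sketch of the primitive-valued reduction (the class name, helper method, and sample values are invented; note that the default 0 is the neutral element of "Long::sum"):

```java
import java.util.concurrent.ConcurrentHashMap;

public class SumValues {
    // Sum all values as a primitive long; 0 is the basis for an empty map.
    static long sum(ConcurrentHashMap<String, Long> map) {
        return map.reduceValuesToLong(Long.MAX_VALUE,
            Long::longValue,  // Transformer to primitive type
            0,                // Neutral element of the accumulator
            Long::sum);       // Primitive type accumulator
    }

    public static void main(String[] args) {
        ConcurrentHashMap<String, Long> map = new ConcurrentHashMap<>();
        map.put("a", 1L);
        map.put("b", 2L);
        map.put("c", 3L);
        System.out.println(sum(map)); // prints 6
    }
}
```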


Concurrent Set Views

The static "newKeySet" method yields a "Set<K>" that is actually a wrapper around a "ConcurrentHashMap<K, Boolean>". (All map values are "Boolean.TRUE", but you don't actually care since you just use it as a set.)

Set<String> words = ConcurrentHashMap.<String>newKeySet();

Of course, if you have an existing map, the "keySet" method yields the set of keys. That set is mutable. If you remove the set's elements, the keys (and the values) are removed from the map. But it doesn't make sense to add elements to the key set, because there would be no corresponding values to add. Java SE 8 adds a second "keySet" method to "ConcurrentHashMap", with a default value, to be used when adding elements to the set:

Set<String> words = map.keySet(1L);
words.add("Java");

If "Java" wasn't already present in "words", it now has a value of one (that is, "1L") in the map.
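A short sketch showing that the key set is a live, two-way view of the map (the class name and sample entries are invented):

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class KeySetView {
    public static void main(String[] args) {
        ConcurrentHashMap<String, Long> map = new ConcurrentHashMap<>();
        Set<String> words = map.keySet(1L); // additions go into the map with value 1L
        words.add("Java");
        System.out.println(map.get("Java")); // prints 1
        map.put("Scala", 5L);
        System.out.println(words.contains("Scala")); // prints true: the set is a live view
    }
}
```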

Copy on Write Arrays

The "CopyOnWriteArrayList" and "CopyOnWriteArraySet" are thread-safe collections in which all mutators make a copy of the underlying array. This arrangement is useful if the threads that iterate over the collection greatly outnumber the threads that mutate it. When you construct an iterator, it contains a reference to the current array. If the array is later mutated, the iterator still has the old array, but the collection's array is replaced. As a consequence, the older iterator has a consistent (but potentially outdated) view that it can access without any synchronization expense.
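The snapshot behavior can be demonstrated with a small sketch (the class name and helper method are invented for illustration):

```java
import java.util.Iterator;
import java.util.concurrent.CopyOnWriteArrayList;

public class SnapshotIterator {
    // Count the elements an iterator sees when the list grows after its creation.
    static int seenAfterMutation() {
        CopyOnWriteArrayList<String> list = new CopyOnWriteArrayList<>();
        list.add("a");
        list.add("b");
        Iterator<String> it = list.iterator(); // holds a reference to the current array
        list.add("c"); // the mutator copies the array; the iterator keeps the old one
        int seen = 0;
        while (it.hasNext()) { it.next(); seen++; }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(seenAfterMutation()); // prints 2: "c" is invisible to the iterator
    }
}
```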


Reposted from blog.csdn.net/liangking81/article/details/80790085