【Java】guava cache 源码

缓存按照进程可以分为本地缓存和非本地缓存。本地缓存会把数据存储在程序的进程内存中，比如这里的guava。非本地缓存通常由另一个进程维护缓存，与应用程序是分割的，会涉及网络io，比如redis。

redis这类缓存大家应该比较熟悉，这里不做探讨。

本地缓存使用起来更加轻量，且因为没有网络io，效率更快。但是缺点也很明显，缓存大小受限，否则会吃掉项目本身运行时的内存。如果我们的缓存大小不多，且没有特殊的要求，本地缓存是不错的选择。

guava是google的一个工具包，是以jar包方式开放的，我们的项目如果要使用直接继承jar包即可。guava中的一个重要功能就是localcache。使用方式暂且不说了，很多资料，这里主要看下关键源码。

guava cache的最重要的特征就是没有使用多线程，所以真的方便，真的轻量。

guava缓存过期策略是一种延迟删除策略，每次get的时候检查时间戳，来决定这次get的值。而redis这种是采用多线程方式，定期检查。

那么我们就从get方法开始看。

V get(K key, int hash, CacheLoader<? super K, V> loader) throws ExecutionException {
      checkNotNull(key);
      checkNotNull(loader);
      try {
        if (count != 0) { // read-volatile
          // don't call getLiveEntry, which would ignore loading values
          ReferenceEntry<K, V> e = getEntry(key, hash);
          if (e != null) {
            long now = map.ticker.read();
            V value = getLiveValue(e, now);
            if (value != null) {
              recordRead(e, now);
              statsCounter.recordHits(1);
              return scheduleRefresh(e, key, hash, value, now, loader);
            }
            ValueReference<K, V> valueReference = e.getValueReference();
            if (valueReference.isLoading()) {
              return waitForLoadingValue(e, key, valueReference);
            }
          }
        }

        // at this point e is either null or expired;
        return lockedGetOrLoad(key, hash, loader);
      } catch (ExecutionException ee) {
        Throwable cause = ee.getCause();
        if (cause instanceof Error) {
          throw new ExecutionError((Error) cause);
        } else if (cause instanceof RuntimeException) {
          throw new UncheckedExecutionException(cause);
        }
        throw ee;
      } finally {
        postReadCleanup();
      }
    }

guava缓存的get方法逻辑是，如果有，且不过期就返回，否则加载，存储，返回。这是一种最常见的get方式，实现方式就在上面的代码里，这个get是在segment类里，别忘了guava就是一个concurrentHashMap。

首先根据hash值取一个entry，再从entry里取值，这里最终通过getLiveValue函数实现，该函数会检测value是否过期。如果过期了或者没有值，就会走到函数的结束地方，调用lockedGetOrLoad方法load值。如果没有过期，那么接着会检测该值是否需要刷新，由函数scheduleRefresh实现。

   /**
     * Gets the value from an entry. Returns null if the entry is invalid, partially-collected,
     * loading, or expired.
     */
    V getLiveValue(ReferenceEntry<K, V> entry, long now) {
      if (entry.getKey() == null) {
        tryDrainReferenceQueues();
        return null;
      }
      V value = entry.getValueReference().get();
      if (value == null) {
        tryDrainReferenceQueues();
        return null;
      }

      if (map.isExpired(entry, now)) {
        tryExpireEntries(now);
        return null;
      }
      return value;
    }

这里通过调用isExpired函数检测是否过期：

  /** Returns true if the entry has expired. */
  boolean isExpired(ReferenceEntry<K, V> entry, long now) {
    checkNotNull(entry);
    if (expiresAfterAccess() && (now - entry.getAccessTime() >= expireAfterAccessNanos)) {
      return true;
    }
    if (expiresAfterWrite() && (now - entry.getWriteTime() >= expireAfterWriteNanos)) {
      return true;
    }
    return false;
  }

isExpired会根据创建cache时的配置来判定过期。

一旦认定过期了，就会调用下面的函数来处理过期：

    /** Cleanup expired entries when the lock is available. */
    void tryExpireEntries(long now) {
      if (tryLock()) {
        try {
          expireEntries(now);
        } finally {
          unlock();
          // don't call postWriteCleanup as we're in a read
        }
      }
    }

    @GuardedBy("this")
    void expireEntries(long now) {
      drainRecencyQueue();

      ReferenceEntry<K, V> e;
      while ((e = writeQueue.peek()) != null && map.isExpired(e, now)) {
        if (!removeEntry(e, e.getHash(), RemovalCause.EXPIRED)) {
          throw new AssertionError();
        }
      }
      while ((e = accessQueue.peek()) != null && map.isExpired(e, now)) {
        if (!removeEntry(e, e.getHash(), RemovalCause.EXPIRED)) {
          throw new AssertionError();
        }
      }
    }

cache会维护两个队列来记录操作cache的所有动作，access或者write。遍历一遍队列，淘汰掉过期的。如果过期了getLiveValue会返回null，否则返回相应的value。

所以get方法在getLiveValue调用完以后会检测是否为null。

如果不为null，说明没有过期，接下来就会判断是否需要刷新。

    V scheduleRefresh(
        ReferenceEntry<K, V> entry,
        K key,
        int hash,
        V oldValue,
        long now,
        CacheLoader<? super K, V> loader) {
      if (map.refreshes()
          && (now - entry.getWriteTime() > map.refreshNanos)
          && !entry.getValueReference().isLoading()) {
        V newValue = refresh(key, hash, loader, true);
        if (newValue != null) {
          return newValue;
        }
      }
      return oldValue;
    }

如果不在更新，且达到了刷新阈值，就会refresh。

    /**
     * Refreshes the value associated with {@code key}, unless another thread is already doing so.
     * Returns the newly refreshed value associated with {@code key} if it was refreshed inline, or
     * {@code null} if another thread is performing the refresh or if an error occurs during
     * refresh.
     */
    @NullableDecl
    V refresh(K key, int hash, CacheLoader<? super K, V> loader, boolean checkTime) {
      final LoadingValueReference<K, V> loadingValueReference =
          insertLoadingValueReference(key, hash, checkTime);
      if (loadingValueReference == null) {
        return null;
      }

      ListenableFuture<V> result = loadAsync(key, hash, loadingValueReference, loader);
      if (result.isDone()) {
        try {
          return Uninterruptibles.getUninterruptibly(result);
        } catch (Throwable t) {
          // don't let refresh exceptions propagate; error was already logged
        }
      }
      return null;
    }

    ListenableFuture<V> loadAsync(
        final K key,
        final int hash,
        final LoadingValueReference<K, V> loadingValueReference,
        CacheLoader<? super K, V> loader) {
      final ListenableFuture<V> loadingFuture = loadingValueReference.loadFuture(key, loader);
      loadingFuture.addListener(
          new Runnable() {
            @Override
            public void run() {
              try {
                getAndRecordStats(key, hash, loadingValueReference, loadingFuture);
              } catch (Throwable t) {
                logger.log(Level.WARNING, "Exception thrown during refresh", t);
                loadingValueReference.setException(t);
              }
            }
          },
          directExecutor());
      return loadingFuture;
    }

这里很关键的是loadingValueReference的创建。guava由于需要的支持引用级别的过期，所以自己封了reference。loadingRef也是其中之一。该ref的特征是isLoading方法返回true。

下面看该ref的创建：

    /**
     * Returns a newly inserted {@code LoadingValueReference}, or null if the live value reference
     * is already loading.
     */
    @NullableDecl
    LoadingValueReference<K, V> insertLoadingValueReference(
        final K key, final int hash, boolean checkTime) {
      ReferenceEntry<K, V> e = null;
      lock();
      try {
        long now = map.ticker.read();
        preWriteCleanup(now);

        AtomicReferenceArray<ReferenceEntry<K, V>> table = this.table;
        int index = hash & (table.length() - 1);
        ReferenceEntry<K, V> first = table.get(index);

        // Look for an existing entry.
        for (e = first; e != null; e = e.getNext()) {
          K entryKey = e.getKey();
          if (e.getHash() == hash
              && entryKey != null
              && map.keyEquivalence.equivalent(key, entryKey)) {
            // We found an existing entry.

            ValueReference<K, V> valueReference = e.getValueReference();
            if (valueReference.isLoading()
                || (checkTime && (now - e.getWriteTime() < map.refreshNanos))) {
              // refresh is a no-op if loading is pending
              // if checkTime, we want to check *after* acquiring the lock if refresh still needs
              // to be scheduled
              return null;
            }

            // continue returning old value while loading
            ++modCount;
            LoadingValueReference<K, V> loadingValueReference =
                new LoadingValueReference<>(valueReference);
            e.setValueReference(loadingValueReference);
            return loadingValueReference;
          }
        }

        ++modCount;
        LoadingValueReference<K, V> loadingValueReference = new LoadingValueReference<>();
        e = newEntry(key, hash, first);
        e.setValueReference(loadingValueReference);
        table.set(index, e);
        return loadingValueReference;
      } finally {
        unlock();
        postWriteCleanup();
      }
    }

先上锁，所以只有第一个refresh的线程才能进来，所做的就是查找到原来的entry，然后把它替换为这里的loadingEntry。结束以后，释放锁，第二个线程进来，检测到loading尾true，就会返回null。所以只有第一个线程有机会去refresh值。

然后会到refresh函数当中，所有loadingValueReference为null的都是没有第一个获取到锁的，所以会返回原值（脏读）。只有第一个获取到锁的会返回load的新值。

最后这个新值会被set到cache中，具体代码在loadAsync中，这里就不贴了。

所以，对于refresh，最重要的特征是：并发的许多线程检测到需要refresh，只有第一个会去load值，其余的会返回原值。

这就是entry的value没有过期的情况。之前说了如果过期了，会调用如下方法load值：

V lockedGetOrLoad(K key, int hash, CacheLoader<? super K, V> loader) throws ExecutionException {
      ReferenceEntry<K, V> e;
      ValueReference<K, V> valueReference = null;
      LoadingValueReference<K, V> loadingValueReference = null;
      boolean createNewEntry = true;

      lock();
      try {
        // re-read ticker once inside the lock
        long now = map.ticker.read();
        preWriteCleanup(now);

        int newCount = this.count - 1;
        AtomicReferenceArray<ReferenceEntry<K, V>> table = this.table;
        int index = hash & (table.length() - 1);
        ReferenceEntry<K, V> first = table.get(index);

        for (e = first; e != null; e = e.getNext()) {
          K entryKey = e.getKey();
          if (e.getHash() == hash
              && entryKey != null
              && map.keyEquivalence.equivalent(key, entryKey)) {
            valueReference = e.getValueReference();
            if (valueReference.isLoading()) {
              createNewEntry = false;
            } else {
              V value = valueReference.get();
              if (value == null) {
                enqueueNotification(
                    entryKey, hash, value, valueReference.getWeight(), RemovalCause.COLLECTED);
              } else if (map.isExpired(e, now)) {
                // This is a duplicate check, as preWriteCleanup already purged expired
                // entries, but let's accomodate an incorrect expiration queue.
                enqueueNotification(
                    entryKey, hash, value, valueReference.getWeight(), RemovalCause.EXPIRED);
              } else {
                recordLockedRead(e, now);
                statsCounter.recordHits(1);
                // we were concurrent with loading; don't consider refresh
                return value;
              }

              // immediately reuse invalid entries
              writeQueue.remove(e);
              accessQueue.remove(e);
              this.count = newCount; // write-volatile
            }
            break;
          }
        }

        if (createNewEntry) {
          loadingValueReference = new LoadingValueReference<>();

          if (e == null) {
            e = newEntry(key, hash, first);
            e.setValueReference(loadingValueReference);
            table.set(index, e);
          } else {
            e.setValueReference(loadingValueReference);
          }
        }
      } finally {
        unlock();
        postWriteCleanup();
      }

      if (createNewEntry) {
        try {
          // Synchronizes on the entry to allow failing fast when a recursive load is
          // detected. This may be circumvented when an entry is copied, but will fail fast most
          // of the time.
          synchronized (e) {
            return loadSync(key, hash, loadingValueReference, loader);
          }
        } finally {
          statsCounter.recordMisses(1);
        }
      } else {
        // The entry already exists. Wait for loading.
        return waitForLoadingValue(e, key, valueReference);
      }
    }

一进入该方法会获取锁，第一个获取锁的线程，可以成功创建loadingRef，之后调用loadSync去load值。后来的线程就会检测到loadingRef处于loading状态了，只能调用下面的方法等待：

    V waitForLoadingValue(ReferenceEntry<K, V> e, K key, ValueReference<K, V> valueReference)
        throws ExecutionException {
      if (!valueReference.isLoading()) {
        throw new AssertionError();
      }

      checkState(!Thread.holdsLock(e), "Recursive load of: %s", key);
      // don't consider expiration as we're concurrent with loading
      try {
        V value = valueReference.waitForValue();
        if (value == null) {
          throw new InvalidCacheLoadException("CacheLoader returned null for key " + key + ".");
        }
        // re-read ticker now that loading has completed
        long now = map.ticker.read();
        recordRead(e, now);
        return value;
      } finally {
        statsCounter.recordMisses(1);
      }
    }

这个等待是在waitForValue里实现的，也是loadingRef特有的方法。下面是loadingRef的实现：

static class LoadingValueReference<K, V> implements ValueReference<K, V> {
    volatile ValueReference<K, V> oldValue;

    // TODO(fry): rename get, then extend AbstractFuture instead of containing SettableFuture
    final SettableFuture<V> futureValue = SettableFuture.create();
    final Stopwatch stopwatch = Stopwatch.createUnstarted();

    public LoadingValueReference() {
      this(null);
    }

    public LoadingValueReference(ValueReference<K, V> oldValue) {
      this.oldValue = (oldValue == null) ? LocalCache.<K, V>unset() : oldValue;
    }

    @Override
    public boolean isLoading() {
      return true;
    }

    @Override
    public boolean isActive() {
      return oldValue.isActive();
    }

    @Override
    public int getWeight() {
      return oldValue.getWeight();
    }

    public boolean set(@NullableDecl V newValue) {
      return futureValue.set(newValue);
    }

    public boolean setException(Throwable t) {
      return futureValue.setException(t);
    }

    private ListenableFuture<V> fullyFailedFuture(Throwable t) {
      return Futures.immediateFailedFuture(t);
    }

    @Override
    public void notifyNewValue(@NullableDecl V newValue) {
      if (newValue != null) {
        // The pending load was clobbered by a manual write.
        // Unblock all pending gets, and have them return the new value.
        set(newValue);
      } else {
        // The pending load was removed. Delay notifications until loading completes.
        oldValue = unset();
      }

      // TODO(fry): could also cancel loading if we had a handle on its future
    }

    public ListenableFuture<V> loadFuture(K key, CacheLoader<? super K, V> loader) {
      try {
        stopwatch.start();
        V previousValue = oldValue.get();
        if (previousValue == null) {
          V newValue = loader.load(key);
          return set(newValue) ? futureValue : Futures.immediateFuture(newValue);
        }
        ListenableFuture<V> newValue = loader.reload(key, previousValue);
        if (newValue == null) {
          return Futures.immediateFuture(null);
        }
        // To avoid a race, make sure the refreshed value is set into loadingValueReference
        // *before* returning newValue from the cache query.
        return transform(
            newValue,
            new com.google.common.base.Function<V, V>() {
              @Override
              public V apply(V newValue) {
                LoadingValueReference.this.set(newValue);
                return newValue;
              }
            },
            directExecutor());
      } catch (Throwable t) {
        ListenableFuture<V> result = setException(t) ? futureValue : fullyFailedFuture(t);
        if (t instanceof InterruptedException) {
          Thread.currentThread().interrupt();
        }
        return result;
      }
    }

    public V compute(K key, BiFunction<? super K, ? super V, ? extends V> function) {
      stopwatch.start();
      V previousValue;
      try {
        previousValue = oldValue.waitForValue();
      } catch (ExecutionException e) {
        previousValue = null;
      }
      V newValue = function.apply(key, previousValue);
      this.set(newValue);
      return newValue;
    }

    public long elapsedNanos() {
      return stopwatch.elapsed(NANOSECONDS);
    }

    @Override
    public V waitForValue() throws ExecutionException {
      return getUninterruptibly(futureValue);
    }

    @Override
    public V get() {
      return oldValue.get();
    }

    public ValueReference<K, V> getOldValue() {
      return oldValue;
    }

    @Override
    public ReferenceEntry<K, V> getEntry() {
      return null;
    }

    @Override
    public ValueReference<K, V> copyFor(
        ReferenceQueue<V> queue, @NullableDecl V value, ReferenceEntry<K, V> entry) {
      return this;
    }
  }

所以过期load的逻辑是，多线程检测到过期，只有第一个线程可以load值，其余的线程只能阻塞在那里等待第一个线程load值。而refresh也是第一个线程可以load值，但是其他线程会返回原来的值。

具体使用时，我们可以当refresh周期小于expire周期，这样可以防止缓存雪崩。

【Java】guava cache 源码

猜你喜欢