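Caches can be divided by process boundary into local caches and non-local caches. A local cache stores data in the application's own process memory, as Guava does here. A non-local cache is usually maintained by a separate process, decoupled from the application, and involves network I/O; Redis is an example.
Most readers are already familiar with caches like Redis, so they are not discussed here.
A local cache is more lightweight to use and, with no network I/O, faster. The drawback is equally clear: its size must be bounded, otherwise it eats into the memory the application itself needs at runtime. If the amount of cached data is small and there are no special requirements, a local cache is a good choice.
Guava is a utility library from Google, distributed as a jar; to use it, a project simply adds the jar as a dependency. One of Guava's important features is LocalCache. Usage is well covered elsewhere, so this article focuses on the key source code.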
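The most important characteristic of Guava cache is that it starts no background threads of its own: maintenance is performed by the calling threads during reads and writes, which keeps it convenient and lightweight.
Guava's expiration is a lazy deletion strategy: each get checks timestamps to decide what that get returns. Redis, by contrast, combines lazy expiration on access with a periodic active scan for expired keys.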
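To make the lazy deletion idea concrete, here is a minimal sketch in plain Java. It is not Guava code: the class name `LazyExpiringMap`, the single write-based TTL, and the `LongSupplier` clock (loosely modeled on Guava's `Ticker`) are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;

/** Hypothetical illustration of lazy expiration; not part of Guava. */
final class LazyExpiringMap<K, V> {
    private static final class Timed<V> {
        final V value;
        final long writeNanos;

        Timed(V value, long writeNanos) {
            this.value = value;
            this.writeNanos = writeNanos;
        }
    }

    private final Map<K, Timed<V>> store = new HashMap<>();
    private final long ttlNanos;
    private final LongSupplier ticker; // injectable clock, so tests can control time

    LazyExpiringMap(long ttlNanos, LongSupplier ticker) {
        this.ttlNanos = ttlNanos;
        this.ticker = ticker;
    }

    synchronized void put(K key, V value) {
        store.put(key, new Timed<>(value, ticker.getAsLong()));
    }

    /** Returns the value, or null if absent or expired. Expired entries are removed here, on read. */
    synchronized V getIfPresent(K key) {
        Timed<V> timed = store.get(key);
        if (timed == null) {
            return null;
        }
        if (ticker.getAsLong() - timed.writeNanos >= ttlNanos) {
            store.remove(key); // no background thread: the reader does the cleanup
            return null;
        }
        return timed.value;
    }

    synchronized int size() {
        return store.size();
    }
}
```

In this sketch an expired entry stays in the map until some caller reads it, which is the trade-off of not running a cleanup thread.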
So let's start with the get method.
V get(K key, int hash, CacheLoader<? super K, V> loader) throws ExecutionException {
  checkNotNull(key);
  checkNotNull(loader);
  try {
    if (count != 0) { // read-volatile
      // don't call getLiveEntry, which would ignore loading values
      ReferenceEntry<K, V> e = getEntry(key, hash);
      if (e != null) {
        long now = map.ticker.read();
        V value = getLiveValue(e, now);
        if (value != null) {
          recordRead(e, now);
          statsCounter.recordHits(1);
          return scheduleRefresh(e, key, hash, value, now, loader);
        }
        ValueReference<K, V> valueReference = e.getValueReference();
        if (valueReference.isLoading()) {
          return waitForLoadingValue(e, key, valueReference);
        }
      }
    }
    // at this point e is either null or expired;
    return lockedGetOrLoad(key, hash, loader);
  } catch (ExecutionException ee) {
    Throwable cause = ee.getCause();
    if (cause instanceof Error) {
      throw new ExecutionError((Error) cause);
    } else if (cause instanceof RuntimeException) {
      throw new UncheckedExecutionException(cause);
    }
    throw ee;
  } finally {
    postReadCleanup();
  }
}
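The logic of Guava cache's get is: if the value is present and not expired, return it; otherwise load it, store it, and return it. This is the most common form of get, and the code above implements it. Note that this get lives in the Segment class: LocalCache is structured much like a segmented ConcurrentHashMap.
It first looks up an entry by hash and then reads the value from that entry through getLiveValue, which checks whether the value has expired. If the value has expired or is absent, and no load is in progress, control falls through to the end of the method, which calls lockedGetOrLoad to load it. If another thread is already loading the entry, get waits for that load in waitForLoadingValue instead. If the value has not expired, scheduleRefresh then checks whether it needs to be refreshed.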
/**
 * Gets the value from an entry. Returns null if the entry is invalid, partially-collected,
 * loading, or expired.
 */
V getLiveValue(ReferenceEntry<K, V> entry, long now) {
  if (entry.getKey() == null) {
    tryDrainReferenceQueues();
    return null;
  }
  V value = entry.getValueReference().get();
  if (value == null) {
    tryDrainReferenceQueues();
    return null;
  }
  if (map.isExpired(entry, now)) {
    tryExpireEntries(now);
    return null;
  }
  return value;
}
Expiration is checked by calling isExpired:
/** Returns true if the entry has expired. */
boolean isExpired(ReferenceEntry<K, V> entry, long now) {
  checkNotNull(entry);
  if (expiresAfterAccess() && (now - entry.getAccessTime() >= expireAfterAccessNanos)) {
    return true;
  }
  if (expiresAfterWrite() && (now - entry.getWriteTime() >= expireAfterWriteNanos)) {
    return true;
  }
  return false;
}
isExpired decides based on the expiration settings (expire-after-access and expire-after-write) chosen when the cache was built.
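The difference between the two settings is easier to see with numbers. The following is a simplified sketch rather than Guava's class: `ExpiryCheck` is a hypothetical name, and treating a non-positive duration as "not configured" is an assumption of the sketch.

```java
/** Hypothetical illustration of the two expiration checks; not Guava's class. */
final class ExpiryCheck {
    private final long expireAfterAccessNanos; // <= 0 means "not configured" in this sketch
    private final long expireAfterWriteNanos;  // <= 0 means "not configured" in this sketch

    ExpiryCheck(long expireAfterAccessNanos, long expireAfterWriteNanos) {
        this.expireAfterAccessNanos = expireAfterAccessNanos;
        this.expireAfterWriteNanos = expireAfterWriteNanos;
    }

    /** An entry is expired as soon as either configured limit has elapsed. */
    boolean isExpired(long accessTime, long writeTime, long now) {
        if (expireAfterAccessNanos > 0 && now - accessTime >= expireAfterAccessNanos) {
            return true;
        }
        if (expireAfterWriteNanos > 0 && now - writeTime >= expireAfterWriteNanos) {
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        ExpiryCheck check = new ExpiryCheck(50L, 100L);
        // Written at 0, last read at 90: the recent read keeps the access limit satisfied.
        System.out.println(check.isExpired(90L, 0L, 99L));  // false
        // One tick later the write limit is reached, however recent the last read was.
        System.out.println(check.isExpired(90L, 0L, 100L)); // true
        // Never read after the write: the access limit is reached first.
        System.out.println(check.isExpired(0L, 0L, 50L));   // true
    }
}
```

Reads push the access-based deadline forward but never the write-based one, so when both are configured the write limit caps the lifetime of an entry.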
Once an entry is found to be expired, the following functions handle it:
/** Cleanup expired entries when the lock is available. */
void tryExpireEntries(long now) {
  if (tryLock()) {
    try {
      expireEntries(now);
    } finally {
      unlock();
      // don't call postWriteCleanup as we're in a read
    }
  }
}

@GuardedBy("this")
void expireEntries(long now) {
  drainRecencyQueue();
  ReferenceEntry<K, V> e;
  while ((e = writeQueue.peek()) != null && map.isExpired(e, now)) {
    if (!removeEntry(e, e.getHash(), RemovalCause.EXPIRED)) {
      throw new AssertionError();
    }
  }
  while ((e = accessQueue.peek()) != null && map.isExpired(e, now)) {
    if (!removeEntry(e, e.getHash(), RemovalCause.EXPIRED)) {
      throw new AssertionError();
    }
  }
}
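Each segment maintains two queues, writeQueue and accessQueue, which keep entries ordered by write time and by access time. expireEntries removes expired entries from the head of each queue and stops at the first entry that has not expired. If the entry being read has expired, getLiveValue returns null; otherwise it returns the value.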
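The reason a time-ordered queue makes this cheap can be shown with a small sketch. This is a simplification, not Guava's implementation: `WriteOrderExpiry` is a hypothetical class, it tracks only a write queue, and it does not handle rewriting an existing key (Guava moves the entry to the tail in that case).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical illustration of expiring from the head of a write-ordered queue. */
final class WriteOrderExpiry {
    private static final class Entry {
        final String key;
        final long writeNanos;

        Entry(String key, long writeNanos) {
            this.key = key;
            this.writeNanos = writeNanos;
        }
    }

    private final Deque<Entry> writeQueue = new ArrayDeque<>(); // oldest write at the head
    private final long ttlNanos;

    WriteOrderExpiry(long ttlNanos) {
        this.ttlNanos = ttlNanos;
    }

    /** Records a write; callers must pass non-decreasing timestamps to keep the queue ordered. */
    void recordWrite(String key, long now) {
        writeQueue.addLast(new Entry(key, now));
    }

    /** Removes expired entries from the head and stops at the first live one. */
    List<String> expireEntries(long now) {
        List<String> removed = new ArrayList<>();
        Entry head;
        while ((head = writeQueue.peekFirst()) != null && now - head.writeNanos >= ttlNanos) {
            writeQueue.pollFirst();
            removed.add(head.key);
        }
        return removed;
    }

    int size() {
        return writeQueue.size();
    }
}
```

Because the head is always the oldest entry, the loop can stop at the first live entry; everything behind it was written later and cannot have expired yet.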
That is why get checks for null after getLiveValue returns.
If the value is not null, it has not expired, and the next step is to decide whether it needs a refresh.
V scheduleRefresh(
    ReferenceEntry<K, V> entry,
    K key,
    int hash,
    V oldValue,
    long now,
    CacheLoader<? super K, V> loader) {
  if (map.refreshes()
      && (now - entry.getWriteTime() > map.refreshNanos)
      && !entry.getValueReference().isLoading()) {
    V newValue = refresh(key, hash, loader, true);
    if (newValue != null) {
      return newValue;
    }
  }
  return oldValue;
}
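If refresh is configured, the refresh threshold has been reached, and the value is not already being loaded, refresh is called.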
/**
 * Refreshes the value associated with {@code key}, unless another thread is already doing so.
 * Returns the newly refreshed value associated with {@code key} if it was refreshed inline, or
 * {@code null} if another thread is performing the refresh or if an error occurs during
 * refresh.
 */
@NullableDecl
V refresh(K key, int hash, CacheLoader<? super K, V> loader, boolean checkTime) {
  final LoadingValueReference<K, V> loadingValueReference =
      insertLoadingValueReference(key, hash, checkTime);
  if (loadingValueReference == null) {
    return null;
  }
  ListenableFuture<V> result = loadAsync(key, hash, loadingValueReference, loader);
  if (result.isDone()) {
    try {
      return Uninterruptibles.getUninterruptibly(result);
    } catch (Throwable t) {
      // don't let refresh exceptions propagate; error was already logged
    }
  }
  return null;
}

ListenableFuture<V> loadAsync(
    final K key,
    final int hash,
    final LoadingValueReference<K, V> loadingValueReference,
    CacheLoader<? super K, V> loader) {
  final ListenableFuture<V> loadingFuture = loadingValueReference.loadFuture(key, loader);
  loadingFuture.addListener(
      new Runnable() {
        @Override
        public void run() {
          try {
            getAndRecordStats(key, hash, loadingValueReference, loadingFuture);
          } catch (Throwable t) {
            logger.log(Level.WARNING, "Exception thrown during refresh", t);
            loadingValueReference.setException(t);
          }
        }
      },
      directExecutor());
  return loadingFuture;
}
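The key point here is the creation of loadingValueReference. Because Guava supports reference-level reclamation (weak and soft value references), it wraps values in its own ValueReference types, and LoadingValueReference is one of them. Its distinguishing trait is that isLoading returns true.
Here is how that reference is created: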
/**
 * Returns a newly inserted {@code LoadingValueReference}, or null if the live value reference
 * is already loading.
 */
@NullableDecl
LoadingValueReference<K, V> insertLoadingValueReference(
    final K key, final int hash, boolean checkTime) {
  ReferenceEntry<K, V> e = null;
  lock();
  try {
    long now = map.ticker.read();
    preWriteCleanup(now);
    AtomicReferenceArray<ReferenceEntry<K, V>> table = this.table;
    int index = hash & (table.length() - 1);
    ReferenceEntry<K, V> first = table.get(index);
    // Look for an existing entry.
    for (e = first; e != null; e = e.getNext()) {
      K entryKey = e.getKey();
      if (e.getHash() == hash
          && entryKey != null
          && map.keyEquivalence.equivalent(key, entryKey)) {
        // We found an existing entry.
        ValueReference<K, V> valueReference = e.getValueReference();
        if (valueReference.isLoading()
            || (checkTime && (now - e.getWriteTime() < map.refreshNanos))) {
          // refresh is a no-op if loading is pending
          // if checkTime, we want to check *after* acquiring the lock if refresh still needs
          // to be scheduled
          return null;
        }
        // continue returning old value while loading
        ++modCount;
        LoadingValueReference<K, V> loadingValueReference =
            new LoadingValueReference<>(valueReference);
        e.setValueReference(loadingValueReference);
        return loadingValueReference;
      }
    }
    ++modCount;
    LoadingValueReference<K, V> loadingValueReference = new LoadingValueReference<>();
    e = newEntry(key, hash, first);
    e.setValueReference(loadingValueReference);
    table.set(index, e);
    return loadingValueReference;
  } finally {
    unlock();
    postWriteCleanup();
  }
}
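The segment lock is taken first, so refreshing threads enter one at a time. The first one finds the existing entry and replaces its value reference with a new LoadingValueReference that wraps the old one. After it releases the lock, the second thread enters, sees that isLoading is true, and returns null. So only the first thread gets the chance to refresh the value.
Back in refresh, every thread whose loadingValueReference is null lost that race, so scheduleRefresh hands it the old value (a stale read). The thread that won returns the newly loaded value, provided the reload completes synchronously (result.isDone() is true), as it does with the default CacheLoader.reload; with an asynchronous reload, it also returns the old value.
Finally the new value is stored into the cache. That happens in getAndRecordStats, which the listener registered in loadAsync calls; its code is not reproduced here.
So the defining trait of refresh is: when many concurrent threads detect that a refresh is due, only the first one loads the value and the rest return the old value.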
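The refresh behavior described above can be approximated in a few lines. This sketch swaps in a different technique: it uses an `AtomicBoolean` compare-and-set as the "already loading" marker instead of Guava's segment lock plus `LoadingValueReference`. `RefreshingValue` and its `read` method are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Supplier;

/** Hypothetical single-value cell showing "one caller refreshes, the rest read stale". */
final class RefreshingValue<V> {
    private volatile V current;
    private final AtomicBoolean loading = new AtomicBoolean(false);
    private final Supplier<V> loader;

    RefreshingValue(V initial, Supplier<V> loader) {
        this.current = initial;
        this.loader = loader;
    }

    /**
     * Returns the current value. If a refresh is due and nobody else is loading,
     * this caller reloads inline and returns the fresh value; every other caller
     * returns the old value immediately instead of waiting.
     */
    V read(boolean refreshDue) {
        if (refreshDue && loading.compareAndSet(false, true)) {
            try {
                V fresh = loader.get();
                current = fresh;
                return fresh;
            } catch (RuntimeException e) {
                // Like Guava's refresh, keep serving the old value if the reload fails.
                return current;
            } finally {
                loading.set(false);
            }
        }
        return current; // possibly stale, but never blocks
    }
}
```

A caller that arrives while the flag is set gets the old value right away, which is the stale read described above; no reader ever waits on a refresh.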
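That covers the case where the entry's value has not expired. As mentioned earlier, when it has expired, the following method is called to load the value: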
V lockedGetOrLoad(K key, int hash, CacheLoader<? super K, V> loader) throws ExecutionException {
  ReferenceEntry<K, V> e;
  ValueReference<K, V> valueReference = null;
  LoadingValueReference<K, V> loadingValueReference = null;
  boolean createNewEntry = true;
  lock();
  try {
    // re-read ticker once inside the lock
    long now = map.ticker.read();
    preWriteCleanup(now);
    int newCount = this.count - 1;
    AtomicReferenceArray<ReferenceEntry<K, V>> table = this.table;
    int index = hash & (table.length() - 1);
    ReferenceEntry<K, V> first = table.get(index);
    for (e = first; e != null; e = e.getNext()) {
      K entryKey = e.getKey();
      if (e.getHash() == hash
          && entryKey != null
          && map.keyEquivalence.equivalent(key, entryKey)) {
        valueReference = e.getValueReference();
        if (valueReference.isLoading()) {
          createNewEntry = false;
        } else {
          V value = valueReference.get();
          if (value == null) {
            enqueueNotification(
                entryKey, hash, value, valueReference.getWeight(), RemovalCause.COLLECTED);
          } else if (map.isExpired(e, now)) {
            // This is a duplicate check, as preWriteCleanup already purged expired
            // entries, but let's accomodate an incorrect expiration queue.
            enqueueNotification(
                entryKey, hash, value, valueReference.getWeight(), RemovalCause.EXPIRED);
          } else {
            recordLockedRead(e, now);
            statsCounter.recordHits(1);
            // we were concurrent with loading; don't consider refresh
            return value;
          }
          // immediately reuse invalid entries
          writeQueue.remove(e);
          accessQueue.remove(e);
          this.count = newCount; // write-volatile
        }
        break;
      }
    }
    if (createNewEntry) {
      loadingValueReference = new LoadingValueReference<>();
      if (e == null) {
        e = newEntry(key, hash, first);
        e.setValueReference(loadingValueReference);
        table.set(index, e);
      } else {
        e.setValueReference(loadingValueReference);
      }
    }
  } finally {
    unlock();
    postWriteCleanup();
  }
  if (createNewEntry) {
    try {
      // Synchronizes on the entry to allow failing fast when a recursive load is
      // detected. This may be circumvented when an entry is copied, but will fail fast most
      // of the time.
      synchronized (e) {
        return loadSync(key, hash, loadingValueReference, loader);
      }
    } finally {
      statsCounter.recordMisses(1);
    }
  } else {
    // The entry already exists. Wait for loading.
    return waitForLoadingValue(e, key, valueReference);
  }
}
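The method takes the segment lock on entry. The first thread to get it creates a LoadingValueReference, releases the lock, and then calls loadSync to load the value. Threads that arrive later find the reference already in the loading state and can only wait, by calling the method below: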
V waitForLoadingValue(ReferenceEntry<K, V> e, K key, ValueReference<K, V> valueReference)
    throws ExecutionException {
  if (!valueReference.isLoading()) {
    throw new AssertionError();
  }
  checkState(!Thread.holdsLock(e), "Recursive load of: %s", key);
  // don't consider expiration as we're concurrent with loading
  try {
    V value = valueReference.waitForValue();
    if (value == null) {
      throw new InvalidCacheLoadException("CacheLoader returned null for key " + key + ".");
    }
    // re-read ticker now that loading has completed
    long now = map.ticker.read();
    recordRead(e, now);
    return value;
  } finally {
    statsCounter.recordMisses(1);
  }
}
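The wait itself happens in waitForValue, which LoadingValueReference implements by blocking on its future. Here is the implementation of LoadingValueReference: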
static class LoadingValueReference<K, V> implements ValueReference<K, V> {
  volatile ValueReference<K, V> oldValue;
  // TODO(fry): rename get, then extend AbstractFuture instead of containing SettableFuture
  final SettableFuture<V> futureValue = SettableFuture.create();
  final Stopwatch stopwatch = Stopwatch.createUnstarted();

  public LoadingValueReference() {
    this(null);
  }

  public LoadingValueReference(ValueReference<K, V> oldValue) {
    this.oldValue = (oldValue == null) ? LocalCache.<K, V>unset() : oldValue;
  }

  @Override
  public boolean isLoading() {
    return true;
  }

  @Override
  public boolean isActive() {
    return oldValue.isActive();
  }

  @Override
  public int getWeight() {
    return oldValue.getWeight();
  }

  public boolean set(@NullableDecl V newValue) {
    return futureValue.set(newValue);
  }

  public boolean setException(Throwable t) {
    return futureValue.setException(t);
  }

  private ListenableFuture<V> fullyFailedFuture(Throwable t) {
    return Futures.immediateFailedFuture(t);
  }

  @Override
  public void notifyNewValue(@NullableDecl V newValue) {
    if (newValue != null) {
      // The pending load was clobbered by a manual write.
      // Unblock all pending gets, and have them return the new value.
      set(newValue);
    } else {
      // The pending load was removed. Delay notifications until loading completes.
      oldValue = unset();
    }
    // TODO(fry): could also cancel loading if we had a handle on its future
  }

  public ListenableFuture<V> loadFuture(K key, CacheLoader<? super K, V> loader) {
    try {
      stopwatch.start();
      V previousValue = oldValue.get();
      if (previousValue == null) {
        V newValue = loader.load(key);
        return set(newValue) ? futureValue : Futures.immediateFuture(newValue);
      }
      ListenableFuture<V> newValue = loader.reload(key, previousValue);
      if (newValue == null) {
        return Futures.immediateFuture(null);
      }
      // To avoid a race, make sure the refreshed value is set into loadingValueReference
      // *before* returning newValue from the cache query.
      return transform(
          newValue,
          new com.google.common.base.Function<V, V>() {
            @Override
            public V apply(V newValue) {
              LoadingValueReference.this.set(newValue);
              return newValue;
            }
          },
          directExecutor());
    } catch (Throwable t) {
      ListenableFuture<V> result = setException(t) ? futureValue : fullyFailedFuture(t);
      if (t instanceof InterruptedException) {
        Thread.currentThread().interrupt();
      }
      return result;
    }
  }

  public V compute(K key, BiFunction<? super K, ? super V, ? extends V> function) {
    stopwatch.start();
    V previousValue;
    try {
      previousValue = oldValue.waitForValue();
    } catch (ExecutionException e) {
      previousValue = null;
    }
    V newValue = function.apply(key, previousValue);
    this.set(newValue);
    return newValue;
  }

  public long elapsedNanos() {
    return stopwatch.elapsed(NANOSECONDS);
  }

  @Override
  public V waitForValue() throws ExecutionException {
    return getUninterruptibly(futureValue);
  }

  @Override
  public V get() {
    return oldValue.get();
  }

  public ValueReference<K, V> getOldValue() {
    return oldValue;
  }

  @Override
  public ReferenceEntry<K, V> getEntry() {
    return null;
  }

  @Override
  public ValueReference<K, V> copyFor(
      ReferenceQueue<V> queue, @NullableDecl V value, ReferenceEntry<K, V> entry) {
    return this;
  }
}
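So the logic for loading after expiration is: when multiple threads detect the expiration, only the first one loads the value, and the others block until that load finishes. With refresh, the first thread likewise loads the value, but the other threads return the old value.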
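The blocking half of that contrast can be sketched with JDK classes alone. This is not how Guava implements it: the sketch uses `ConcurrentHashMap.putIfAbsent` with a `CompletableFuture` in place of the segment lock and `LoadingValueReference`, it never expires anything, and `SingleFlightCache` and `countLoads` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical illustration: the first caller loads, concurrent callers wait for that load. */
final class SingleFlightCache<K, V> {
    private final ConcurrentHashMap<K, CompletableFuture<V>> values = new ConcurrentHashMap<>();
    private final Function<K, V> loader;

    SingleFlightCache(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        CompletableFuture<V> mine = new CompletableFuture<>();
        CompletableFuture<V> existing = values.putIfAbsent(key, mine);
        if (existing != null) {
            // Someone else is loading (or has loaded) this key: block until the value is there.
            return existing.join();
        }
        try {
            V value = loader.apply(key); // only the caller that won putIfAbsent gets here
            mine.complete(value);
            return value;
        } catch (RuntimeException e) {
            mine.completeExceptionally(e); // waiters fail too instead of hanging
            throw e;
        }
    }

    /** Demo helper: runs concurrent gets for one key and reports how many loads actually ran. */
    static int countLoads(int threads) {
        AtomicInteger loads = new AtomicInteger();
        SingleFlightCache<String, String> cache = new SingleFlightCache<>(key -> {
            loads.incrementAndGet();
            try {
                Thread.sleep(50); // simulate a slow backend so other threads pile up
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return key.toUpperCase();
        });
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < threads; i++) {
            Thread worker = new Thread(() -> cache.get("key"));
            workers.add(worker);
            worker.start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return loads.get();
    }
}
```

Calling `SingleFlightCache.countLoads(8)` runs eight concurrent reads of the same key and reports a single load; the other seven callers wait on the future rather than hitting the backend.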
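In practice, the refresh interval can be set shorter than the expiration interval, so that frequently read keys are refreshed before they expire and readers rarely block on a load; this helps guard against cache avalanche.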
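A configuration along those lines might look like the following. The durations, the maximum size, the key, and the `loadFromBackend` helper are placeholders chosen for illustration, not recommendations; the example needs Guava on the classpath.

```java
import com.google.common.cache.CacheBuilder;
import com.google.common.cache.CacheLoader;
import com.google.common.cache.LoadingCache;
import java.util.concurrent.TimeUnit;

final class RefreshBeforeExpireExample {
    public static void main(String[] args) throws Exception {
        LoadingCache<String, String> cache =
            CacheBuilder.newBuilder()
                .maximumSize(10_000)                    // bound the memory used by the cache
                .refreshAfterWrite(1, TimeUnit.MINUTES) // first reader after 1 minute reloads
                .expireAfterWrite(5, TimeUnit.MINUTES)  // hard limit; readers block on a load after this
                .build(
                    new CacheLoader<String, String>() {
                        @Override
                        public String load(String key) {
                            return loadFromBackend(key);
                        }
                    });

        System.out.println(cache.get("user:42"));
    }

    // Hypothetical stand-in for a database or RPC call.
    private static String loadFromBackend(String key) {
        return "value-for-" + key;
    }
}
```

With these settings, a key that is read at least once between minute 1 and minute 5 is reloaded by a single reader while the others keep getting the old value. Only keys that go unread for the full expiration interval fall back to the blocking load path.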