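Preface
The previous article covered the Dalvik VM startup flow, during which dvmGcStartup is called to start the heap.
This article walks through how that heap, the one our everyday Java code allocates from, is created.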
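Introduction to the Dalvik heap
In the Dalvik VM, the heap consists of two parts: heaps[0], the Active heap, and heaps[1], the Zygote heap. The Zygote heap manages the objects that the Zygote process preloads and creates during startup, while the Active heap is created just before the Zygote process forks its first child.
From then on, both the Zygote process and its children allocate and free objects on the Active heap. The goal is to let the Zygote process and its children share the memory occupied by the Zygote heap as much as possible.
The key structures in Dalvik's heap management are one Card Table, two Heap Bitmaps, and one GcMarkStack.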
HeapBitmap
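A HeapBitmap is a map of the heap's allocation state: each bit records the state of one 8-byte unit of the heap.
The heap has two HeapBitmaps. The LiveHeapBitmap records the objects that survived the previous GC; the MarkHeapBitmap records the objects found alive in the current GC. Objects that survived the previous GC but are not alive in the current one are the objects to free.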
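The relationship between the two bitmaps can be illustrated with a minimal sketch. This is not Dalvik's sweep code; the function name findGarbageBits and the use of std::vector are assumptions made for illustration. It computes, word by word, the bits that are set in the live bitmap but clear in the mark bitmap.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: `live` holds the objects that survived the previous
// GC, `mark` holds the objects reached in the current GC. The result has a
// bit set for every object that should be freed.
std::vector<unsigned long> findGarbageBits(const std::vector<unsigned long>& live,
                                           const std::vector<unsigned long>& mark) {
    assert(live.size() == mark.size());
    std::vector<unsigned long> garbage(live.size());
    for (std::size_t i = 0; i < live.size(); ++i) {
        // Alive before, not reached now: garbage.
        garbage[i] = live[i] & ~mark[i];
    }
    return garbage;
}
```

For example, a live word of 0xB combined with a mark word of 0x2 yields 0x9: the two objects that were alive before but were not reached this time.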
GcMarkStack
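Dalvik uses the Mark-Sweep algorithm for GC. In the mark phase, a Mark Stack is used to traverse referenced objects, that is, the objects alive in the current GC. With the Mark Stack, a loop can stand in for recursive function calls.
During collection, every reachable object has to be visited, which is naturally recursive. Because deep recursion can overflow the call stack, the collector keeps the intermediate state in the GcMarkStack instead.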
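The idea of replacing recursion with an explicit stack can be sketched as follows. This is a toy model, not Dalvik's marking code: ToyObject, markReachable, and the index-based references are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy object: `refs` holds the indices of the objects it references.
struct ToyObject {
    std::vector<std::size_t> refs;
};

// Marks everything reachable from `roots`, using an explicit mark stack
// in a loop instead of recursive calls.
std::vector<bool> markReachable(const std::vector<ToyObject>& heap,
                                const std::vector<std::size_t>& roots) {
    std::vector<bool> marked(heap.size(), false);
    std::vector<std::size_t> markStack;
    for (std::size_t r : roots) {
        assert(r < heap.size());
        if (!marked[r]) {
            marked[r] = true;
            markStack.push_back(r);
        }
    }
    while (!markStack.empty()) {
        std::size_t cur = markStack.back();
        markStack.pop_back();
        for (std::size_t child : heap[cur].refs) {
            assert(child < heap.size());
            if (!marked[child]) {
                // Mark before pushing so each object is pushed at most once.
                marked[child] = true;
                markStack.push_back(child);
            }
        }
    }
    return marked;
}
```

Because an object is marked before it is pushed, each object enters the stack at most once, which is why a stack sized for the worst-case object count is always large enough.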
CardTable
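The Card Table records object references during garbage collection. It is used in the second phase of a Concurrent GC to record references from objects in heaps that are not being collected to objects in the heap that is being collected. The memory reclamation (GC) flow is analyzed later.
The Card Table plays a role similar to the Heap Bitmap. The differences are:
- The Card Table does not use one bit per object; it uses one byte per card, where a card covers GC_CARD_SIZE bytes of the heap;
- The Card Table does not describe whether objects are alive; it describes which objects were modified during a Concurrent GC and therefore need special handling.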
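The mapping from a heap address to its card can be sketched as below. This is a simplified model without the biased base that Dalvik uses; ToyCardTable and the constant values (a shift of 7 for 128-byte cards, 0 for clean, 0x70 for dirty) are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

const unsigned kCardShift = 7;                        // assumed: 128-byte cards
const std::size_t kCardSize = std::size_t(1) << kCardShift;
const std::uint8_t kCardClean = 0;                    // assumed value
const std::uint8_t kCardDirty = 0x70;                 // assumed value

// Hypothetical card table: one byte per kCardSize bytes of heap.
struct ToyCardTable {
    std::uintptr_t heapBase;
    std::vector<std::uint8_t> cards;

    ToyCardTable(std::uintptr_t base, std::size_t heapSize)
        : heapBase(base),
          cards((heapSize + kCardSize - 1) / kCardSize, kCardClean) {}

    // Index of the card that covers the given heap address.
    std::size_t cardIndex(std::uintptr_t addr) const {
        assert(addr >= heapBase);
        std::size_t idx = (addr - heapBase) >> kCardShift;
        assert(idx < cards.size());
        return idx;
    }

    // Called after a reference field of the object at `addr` is written.
    void markDirty(std::uintptr_t addr) { cards[cardIndex(addr)] = kCardDirty; }

    bool isDirty(std::uintptr_t addr) const {
        return cards[cardIndex(addr)] == kCardDirty;
    }
};
```

Marking one address dirty marks the whole card, so every object that shares those 128 bytes is treated as possibly modified.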
Initializing the zygote heap
- dalvik/vm/alloc/Alloc.cpp
// Initialize the GC universe.
bool dvmGcStartup()
{
    dvmInitMutex(&gDvm.gcHeapLock);
    pthread_cond_init(&gDvm.gcHeapCond, NULL);
    return dvmHeapStartup();
}
- dalvik/vm/alloc/Heap.cpp
Initialize the heap; when heapGrowthLimit is 0, heapMaximumSize is used instead.
// Initialize the GC heap.
bool dvmHeapStartup()
{
    GcHeap *gcHeap;
    if (gDvm.heapGrowthLimit == 0) {
        gDvm.heapGrowthLimit = gDvm.heapMaximumSize;
    }
    gcHeap = dvmHeapSourceStartup(gDvm.heapStartingSize, gDvm.heapMaximumSize, gDvm.heapGrowthLimit);
    gDvm.gcHeap = gcHeap;
    // Set up the lists we'll use for cleared reference objects.
    gcHeap->clearedReferences = NULL;
    // Initialize the card table.
    dvmCardTableStartup(gDvm.heapMaximumSize, gDvm.heapGrowthLimit);
    return true;
}
dvmHeapSourceStartup
- dalvik/vm/alloc/HeapSource.cpp
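dvmHeapSourceStartup() calls dvmAllocRegion() to allocate a block of memory, which is then handed to dlmalloc to manage. dvmAllocRegion() uses ashmem_create_region() and mmap() to obtain the memory, which also means that it always allocates large blocks. The allocations in the functions below all go through dvmAllocRegion() rather than the Dalvik heap: these structures live for the lifetime of the system and must never be collected, so they are taken directly from system memory and are not managed by Dalvik.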
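As a rough illustration of this style of allocation, the sketch below maps a page-aligned region with mmap(). It is not the body of dvmAllocRegion(): the name allocRegionSketch is hypothetical, and an anonymous mapping stands in for ashmem_create_region(), which is Android-specific.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical stand-in for a region allocator: rounds the request up to
// whole pages and maps that many bytes with the requested protection.
// Returns nullptr on failure; on success *mappedLength is the mapped size.
void* allocRegionSketch(std::size_t byteCount, int prot, std::size_t* mappedLength) {
    assert(byteCount > 0);
    assert(mappedLength != nullptr);
    const std::size_t page = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
    const std::size_t length = (byteCount + page - 1) / page * page;
    // Assumption: MAP_ANONYMOUS replaces the ashmem file descriptor here.
    void* base = mmap(nullptr, length, prot, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED) {
        return nullptr;
    }
    *mappedLength = length;
    return base;
}
```

Passing PROT_NONE, as dvmHeapSourceStartup() does for the heap itself, reserves the address range without making it accessible; the bitmaps and mark stack are mapped with PROT_READ | PROT_WRITE instead.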
// Initializes the heap source;
GcHeap* dvmHeapSourceStartup(size_t startSize, size_t maximumSize, size_t growthLimit) {
    GcHeap *gcHeap;
    HeapSource *hs;
    mspace msp;
    size_t length;
    void *base;
    // Allocate a contiguous region of virtual memory to subdivided among the heaps managed by the garbage collector.
    length = ALIGN_UP_TO_PAGE_SIZE(maximumSize);
    base = dvmAllocRegion(length, PROT_NONE, gDvm.zygote ? "dalvik-zygote" : "dalvik-heap");
    // Create an unlocked dlmalloc mspace to use as a heap source.
    msp = createMspace(base, kInitialMorecoreStart, startSize);
    gcHeap = (GcHeap *)calloc(1, sizeof(*gcHeap));
    hs = (HeapSource *)calloc(1, sizeof(*hs));
    hs->targetUtilization = gDvm.heapTargetUtilization * HEAP_UTILIZATION_MAX;
    hs->minFree = gDvm.heapMinFree;
    hs->maxFree = gDvm.heapMaxFree;
    hs->startSize = startSize;
    hs->maximumSize = maximumSize;
    hs->growthLimit = growthLimit;
    hs->idealSize = startSize;
    hs->softLimit = SIZE_MAX;    // no soft limit at first
    hs->numHeaps = 0;
    hs->sawZygote = gDvm.zygote;
    hs->nativeBytesAllocated = 0;
    hs->nativeFootprintGCWatermark = startSize;
    hs->nativeFootprintLimit = startSize * 2;
    hs->nativeNeedToRunFinalization = false;
    hs->hasGcThread = false;
    hs->heapBase = (char *)base;
    hs->heapLength = length;
    // Add the initial heap, i.e. initialize the first heap in the heapSource.
    addInitialHeap(hs, msp, growthLimit);
    // Initialize a HeapBitmap so that it points to a bitmap large enough to cover a heap at <base> of <maxSize> bytes
    dvmHeapBitmapInit(&hs->liveBits, base, length, "dalvik-bitmap-1");
    dvmHeapBitmapInit(&hs->markBits, base, length, "dalvik-bitmap-2");
    allocMarkStack(&gcHeap->markContext.stack, hs->maximumSize);
    gcHeap->markContext.bitmap = &hs->markBits;
    gcHeap->heapSource = hs;
    gHs = hs;
    return gcHeap;
}
addInitialHeap
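addInitialHeap() stores the memory that was just created in heaps[0] of the heapSource.
Dalvik does not manage dynamic memory itself through direct system calls; it hands the memory to dlmalloc to manage as a "private heap".
The Heap struct describes one heap. Its members have the following meanings:
msp: the memory block (mspace) used by the heap.
maximumSize: the maximum amount of memory the heap may use.
bytesAllocated: the number of bytes already allocated from the heap.
concurrentStartBytes: the allocation level at which a concurrent GC is triggered.
objectsAllocated: the number of objects already allocated.
base: the start address of the memory block used by the heap.
limit: the end address of the memory block used by the heap.
brk: the current high-water mark of the memory the heap has requested.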
//Add the initial heap.
static bool addInitialHeap(HeapSource *hs, mspace msp, size_t maximumSize)
{
    if (hs->numHeaps != 0) {
        return false;
    }
    hs->heaps[0].msp = msp;
    hs->heaps[0].maximumSize = maximumSize;
    hs->heaps[0].concurrentStartBytes = SIZE_MAX;
    hs->heaps[0].base = hs->heapBase;
    hs->heaps[0].limit = hs->heapBase + maximumSize;
    hs->heaps[0].brk = hs->heapBase + kInitialMorecoreStart;
    hs->numHeaps = 1;
    return true;
}
struct Heap {
    /* The mspace to allocate from.
     */
    mspace msp;
    /* The largest size that this heap is allowed to grow to.
     */
    size_t maximumSize;
    /* Number of bytes allocated from this mspace for objects,
     * including any overhead. This value is NOT exact, and
     * should only be used as an input for certain heuristics.
     */
    size_t bytesAllocated;
    /* Number of bytes allocated from this mspace at which a
     * concurrent garbage collection will be started.
     */
    size_t concurrentStartBytes;
    /* Number of objects currently allocated from this mspace.
     */
    size_t objectsAllocated;
    /*
     * The lowest address of this heap, inclusive.
     */
    char *base;
    /*
     * The highest address of this heap, exclusive.
     */
    char *limit;
    /*
     * If the heap has an mspace, the current high water mark in
     * allocations requested via dvmHeapSourceMorecore.
     */
    char *brk;
};
dvmHeapBitmapInit
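- dvmHeapBitmapInit() is called twice to create the two HeapBitmap objects. A HeapBitmap is a map of the heap's allocation state: each bit records one 8-byte unit of the heap.
- The most important member of the HeapBitmap struct is bits, which points to an array of unsigned long. Each bit of this array marks whether one object is alive.
- The Java heap starts at base and is maxSize bytes long, so objects created on it have addresses in [base, base + maxSize). Memory obtained from the Java heap through the C library's mspace_malloc is 8-byte aligned, so (maxSize / 8) bits are enough to describe the objects on the heap. The bits member is an array of unsigned long, so each element describes sizeof(unsigned long) * 8 objects. On a 32-bit device an unsigned long has 32 bits, so an array of (maxSize / 8 / 32) unsigned longs is needed. In bytes, that is a block of (maxSize / 8 / 32) × 4 bytes to describe a Java heap of size maxSize.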
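The size arithmetic above can be checked with a small sketch. The names bitmapBytesFor and wordIndexFor are hypothetical, and the 8-byte alignment constant simply mirrors the value described in the text; this is not the HB_OFFSET_TO_INDEX macro itself.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>

const std::size_t kObjectAlignment = 8;  // assumed: 8-byte object alignment
const std::size_t kBitsPerWord = sizeof(unsigned long) * CHAR_BIT;

// Bytes of bitmap storage needed to cover `heapSize` bytes of heap,
// with one bit per kObjectAlignment bytes.
std::size_t bitmapBytesFor(std::size_t heapSize) {
    return heapSize / kObjectAlignment / kBitsPerWord * sizeof(unsigned long);
}

// Index of the bitmap word that holds the bit for the object at `offset`
// bytes from the heap base.
std::size_t wordIndexFor(std::size_t offset) {
    assert(offset % kObjectAlignment == 0);
    return offset / kObjectAlignment / kBitsPerWord;
}
```

For a 64 MiB heap this gives 1 MiB per bitmap, on 32-bit and 64-bit targets alike, because the word size cancels out of the formula.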
// Initialize a HeapBitmap so that it points to a bitmap large enough to cover a heap at <base> of <maxSize> bytes, where objects are guaranteed to be HB_OBJECT_ALIGNMENT-aligned.
bool dvmHeapBitmapInit(HeapBitmap *hb, const void *base, size_t maxSize, const char *name) {
    void *bits;
    size_t bitsLen;
    bitsLen = HB_OFFSET_TO_INDEX(maxSize) * sizeof(*hb->bits);
    bits = dvmAllocRegion(bitsLen, PROT_READ | PROT_WRITE, name);
    if (bits == NULL) {
        ALOGE("Could not mmap %zd-byte ashmem region '%s'", bitsLen, name);
        return false;
    }
    hb->bits = (unsigned long *)bits;
    hb->bitsLen = hb->allocLen = bitsLen;
    hb->base = (uintptr_t)base;
    hb->max = hb->base - 1;
    return true;
}
struct HeapBitmap {
    /* The bitmap data, which points to an mmap()ed area of zeroed
     * anonymous memory.
     */
    unsigned long *bits;
    /* The size of the used memory pointed to by bits, in bytes. This
     * value changes when the bitmap is shrunk.
     */
    size_t bitsLen;
    /* The real size of the memory pointed to by bits. This is the
     * number of bytes we requested from the allocator and does not
     * change.
     */
    size_t allocLen;
    /* The base address, which corresponds to the first bit in
     * the bitmap.
     */
    uintptr_t base;
    /* The highest pointer value ever returned by an allocation
     * from this heap. I.e., the highest address that may correspond
     * to a set bit. If there are no bits set, (max < base).
     */
    uintptr_t max;
};
allocMarkStack
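- allocMarkStack() allocates a block of memory and uses it to initialize the GcMarkStack structure. During collection every reachable object has to be visited, which is naturally recursive; because deep recursion can overflow the call stack, the collector keeps the intermediate state in the GcMarkStack instead.
- GcMarkStack describes a stack as an array of Object pointers. The member length gives the size of this array in bytes. The members base and limit are the lowest and highest addresses of the stack, and top points to the top of the stack.
- How big a stack is needed to cover every object on the Java heap? First, every Java object derives from the Object struct, so each object occupies at least sizeof(Object) bytes. Second, when memory is allocated on the Java heap through the C library's mspace_malloc, the C library needs some extra memory of its own to manage the block, for example 4 extra bytes to record the size of the allocated chunk. This extra size is given by the macro HEAP_SOURCE_CHUNK_OVERHEAD. So a Java heap of size maximumSize holds, in the worst case, (maximumSize / (sizeof(Object) + HEAP_SOURCE_CHUNK_OVERHEAD)) objects, and the GcMarkStack needs that many Object* slots. In bytes, that is a block of (maximumSize * sizeof(Object*) / (sizeof(Object) + HEAP_SOURCE_CHUNK_OVERHEAD)) bytes.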
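The worst-case sizing can be written out as a small sketch. The function names are hypothetical, and the sizes are passed in as parameters rather than taken from Dalvik's real Object struct or HEAP_SOURCE_CHUNK_OVERHEAD, so the numbers in the usage note are illustrative assumptions only.

```cpp
#include <cassert>
#include <cstddef>

// Worst case: the heap is completely filled with the smallest possible
// object, each carrying the allocator's per-chunk overhead.
std::size_t worstCaseObjectCount(std::size_t maximumSize,
                                 std::size_t objectSize,
                                 std::size_t chunkOverhead) {
    assert(objectSize + chunkOverhead > 0);
    return maximumSize / (objectSize + chunkOverhead);
}

// Bytes needed for a mark stack with one pointer-sized slot per object.
// Multiplying before dividing mirrors the order used in allocMarkStack().
std::size_t markStackBytes(std::size_t maximumSize,
                           std::size_t objectSize,
                           std::size_t chunkOverhead,
                           std::size_t pointerSize) {
    assert(objectSize + chunkOverhead > 0);
    return maximumSize * pointerSize / (objectSize + chunkOverhead);
}
```

With an assumed 8-byte object, 4-byte chunk overhead, and 4-byte pointers, a 1200-byte heap holds at most 100 objects and needs a 400-byte mark stack.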
// Create a stack big enough for the worst possible case, where the heap is perfectly full of the smallest object.
static bool allocMarkStack(GcMarkStack *stack, size_t maximumSize)
{
    const char *name = "dalvik-mark-stack";
    void *addr;
    stack->length = maximumSize * sizeof(Object*) /
        (sizeof(Object) + HEAP_SOURCE_CHUNK_OVERHEAD);
    addr = dvmAllocRegion(stack->length, PROT_READ | PROT_WRITE, name);
    if (addr == NULL) {
        return false;
    }
    stack->base = (const Object **)addr;
    stack->limit = (const Object **)((char *)addr + stack->length);
    stack->top = NULL;
    madvise(stack->base, stack->length, MADV_DONTNEED);
    return true;
}
struct GcMarkStack {
    /* Highest address (exclusive)
     */
    const Object **limit;
    /* Current top of the stack (exclusive)
     */
    const Object **top;
    /* Lowest address (inclusive)
     */
    const Object **base;
    /* Maximum stack size, in bytes.
     */
    size_t length;
};
dvmCardTableStartup
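- In the Card Table, one byte describes one card, which covers GC_CARD_SIZE (128) bytes of the heap. Whenever an object is modified during a Concurrent GC, typically when dvmSetFieldObject changes one of its reference-typed fields, the byte for that object in the Card Table is set to GC_CARD_DIRTY. Conversely, if an object is not modified during the Concurrent GC, its byte stays GC_CARD_CLEAN.
- The members cardTableBase and cardTableMaxLength give the start address and size of the Card Table that was created. The member cardTableLength gives the size currently in use. The relationship between cardTableMaxLength and cardTableLength mirrors the relationship between the Java heap's Maximum Size and Growth Limit.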
/*
* Maintain a card table from the the write barrier. All writes of
* non-NULL values to heap addresses should go through an entry in
* WriteBarrier, and from there to here.
*
* The heap is divided into "cards" of GC_CARD_SIZE bytes, as
* determined by GC_CARD_SHIFT. The card table contains one byte of
* data per card, to be used by the GC. The value of the byte will be
* one of GC_CARD_CLEAN or GC_CARD_DIRTY.
*
* After any store of a non-NULL object pointer into a heap object,
* code is obliged to mark the card dirty. The setters in
* ObjectInlines.h [such as dvmSetFieldObject] do this for you. The
* JIT and fast interpreters also contain code to mark cards as dirty.
*
* The card table's base [the "biased card table"] gets set to a
* rather strange value. In order to keep the JIT from having to
* fabricate or load GC_DIRTY_CARD to store into the card table,
* biased base is within the mmap allocation at a point where it's low
* byte is equal to GC_DIRTY_CARD. See dvmCardTableStartup for details.
*/
/*
* Initializes the card table; must be called before any other
* dvmCardTable*() functions.
*/
bool dvmCardTableStartup(size_t heapMaximumSize, size_t growthLimit)
{
    size_t length;
    void *allocBase;
    u1 *biasedBase;
    GcHeap *gcHeap = gDvm.gcHeap;
    int offset;
    void *heapBase = dvmHeapSourceGetBase();
    assert(gcHeap != NULL);
    assert(heapBase != NULL);
    /* All zeros is the correct initial value; all clean. */
    assert(GC_CARD_CLEAN == 0);
    /* Set up the card table */
    length = heapMaximumSize / GC_CARD_SIZE;
    /* Allocate an extra 256 bytes to allow fixed low-byte of base */
    allocBase = dvmAllocRegion(length + 0x100, PROT_READ | PROT_WRITE,
                               "dalvik-card-table");
    if (allocBase == NULL) {
        return false;
    }
    gcHeap->cardTableBase = (u1*)allocBase;
    gcHeap->cardTableLength = growthLimit / GC_CARD_SIZE;
    gcHeap->cardTableMaxLength = length;
    biasedBase = (u1 *)((uintptr_t)allocBase -
                        ((uintptr_t)heapBase >> GC_CARD_SHIFT));
    offset = GC_CARD_DIRTY - ((uintptr_t)biasedBase & 0xff);
    gcHeap->cardTableOffset = offset + (offset < 0 ? 0x100 : 0);
    biasedBase += gcHeap->cardTableOffset;
    assert(((uintptr_t)biasedBase & 0xff) == GC_CARD_DIRTY);
    gDvm.biasedCardTableBase = biasedBase;
    return true;
}
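The biased-base arithmetic in dvmCardTableStartup() is easier to follow on plain integers. The sketch below repeats the same steps using addresses as numbers; the struct and function names are hypothetical, and the constants (a shift of 7 and a dirty value of 0x70) are assumed values, not taken from the excerpt above.

```cpp
#include <cassert>
#include <cstdint>

const unsigned kCardShift = 7;            // assumed: 128-byte cards
const std::uintptr_t kCardDirty = 0x70;   // assumed dirty value

struct BiasedCardTable {
    std::uintptr_t biasedBase;  // address whose low byte equals kCardDirty
    int offset;                 // shift applied inside the 256 spare bytes
};

// Pick a biased base so that biasedBase + (addr >> kCardShift) lands inside
// the allocation for every heap address, and the low byte of the base
// equals the dirty value.
BiasedCardTable computeBiasedBase(std::uintptr_t allocBase, std::uintptr_t heapBase) {
    std::uintptr_t biased = allocBase - (heapBase >> kCardShift);
    int offset = static_cast<int>(kCardDirty) - static_cast<int>(biased & 0xff);
    if (offset < 0) {
        offset += 0x100;  // stay within the extra 256 bytes
    }
    biased += static_cast<std::uintptr_t>(offset);
    assert((biased & 0xff) == kCardDirty);
    BiasedCardTable result = {biased, offset};
    return result;
}

// Address of the card byte that covers `heapAddr`.
std::uintptr_t cardAddressFor(const BiasedCardTable& t, std::uintptr_t heapAddr) {
    return t.biasedBase + (heapAddr >> kCardShift);
}
```

With this layout the card for the first heap address sits at allocBase + offset, which is why the table is allocated 0x100 bytes larger than strictly needed. As the source comment explains, the point of fixing the low byte is that the JIT can store the low byte of the base itself as the dirty marker instead of loading a separate constant.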
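Initializing the active heap
- When dvmHeapStartup() returns, only heaps[0] of the two "heaps" in the heapSource has been initialized; heaps[1] is still unused. This is because dvmHeapStartup() is called in the Zygote process.
- Before the first application starts, initialization of Dalvik's memory module continues, but the active heap is initialized only once, guarded by the boolean gDvm.newZygoteHeapAllocated. In other words, the Zygote process splits the Java heap in two only when it forks its first child. The source comment gives the reason: "We create a heap for all future zygote process allocations, in an attempt to avoid touching pages in the zygote heap."
- Zygote's nativeFork() function also calls dvmGcPreZygoteFork(), which calls dvmHeapSourceStartupBeforeFork() to initialize the active heap and place it at the front of the heaps array. From then on, both the Zygote process and its children allocate objects on the Active heap, so the Zygote heap can be shared between the Zygote process and its children as much as possible.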
- dalvik/vm/native/dalvik_system_Zygote.cpp
static void Dalvik_dalvik_system_Zygote_fork(const u4* args, JValue* pResult)
{
    pid_t pid;
    dvmGcPreZygoteFork();    // allocate the active heap before fork
    setSignalHandler();
    dvmDumpLoaderStats("zygote");
    pid = fork();
    RETURN_INT(pid);
}
- dalvik/vm/alloc/Alloc.cpp
// Do any last-minute preparation before we call fork() for the first time.
bool dvmGcPreZygoteFork() {
    return dvmHeapSourceStartupBeforeFork();
}
/*
* This is called while in zygote mode, right before we fork() for the
* first time. We create a heap for all future zygote process allocations,
* in an attempt to avoid touching pages in the zygote heap. (This would
* probably be unnecessary if we had a compacting GC -- the source of our
* troubles is small allocations filling in the gaps from larger ones.)
*/
bool dvmHeapSourceStartupBeforeFork()
{
    HeapSource *hs = gHs;    // use a local to avoid the implicit "volatile"
    if (!gDvm.newZygoteHeapAllocated) {
        // Ensure heaps are trimmed to minimize footprint pre-fork.
        trimHeaps();
        // Create a new heap for post-fork zygote allocations. We only try once, even if it fails.
        gDvm.newZygoteHeapAllocated = true;
        return addNewHeap(hs);
    }
    return true;
}
addNewHeap
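addNewHeap() creates a new heap.
It hands the address space beyond the old heaps[0]'s current high-water mark (brk) to the new heap and remaps that range with a fresh mmap at the same addresses. The old heap is then capped at its current size so that it cannot grow any further. Finally, the old entry is shifted to heaps[1] and the new heap is stored in heaps[0].
After both heaps exist, the total size is the same as before, but heaps[0] now refers to a new heap with nothing allocated in it, while heaps[1] holds the objects created during initialization. All later allocations happen in heaps[0].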
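The bookkeeping of the split can be modeled without any real memory. In the sketch below, ToyHeap, ToyHeapSource, and addNewHeapSketch are hypothetical names, addresses are plain offsets into the reserved region, and the remapping and mspace creation steps are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical heap descriptor; all fields are offsets into the reserved region.
struct ToyHeap {
    std::size_t base;   // lowest address, inclusive
    std::size_t brk;    // current high-water mark
    std::size_t limit;  // highest address, exclusive
};

struct ToyHeapSource {
    std::size_t growthLimit;
    std::vector<ToyHeap> heaps;  // heaps[0] is always the active heap
};

// Starts a new heap at the old heap's brk, caps the old heap there, and
// puts the new heap at heaps[0], shifting the old one to heaps[1].
bool addNewHeapSketch(ToyHeapSource& hs, std::size_t pageSize) {
    assert(!hs.heaps.empty());
    ToyHeap& old = hs.heaps[0];
    const std::size_t overhead = old.brk - old.base;
    if (overhead >= hs.growthLimit) {
        return false;  // no room left for another heap
    }
    ToyHeap fresh;
    fresh.base = old.brk;
    fresh.limit = fresh.base + (hs.growthLimit - overhead);
    fresh.brk = fresh.base + pageSize;
    old.limit = old.brk;  // the old heap may not grow any further
    // `old` must not be used after this insert: it may reallocate the vector.
    hs.heaps.insert(hs.heaps.begin(), fresh);
    return true;
}
```

Starting from one heap that has used 8192 bytes of a 262144-byte region, the call leaves heaps[1] covering [0, 8192) and heaps[0] covering [8192, 262144), so the two heaps together span exactly the original range.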
// Adds an additional heap to the heap source. Returns false if there are too many heaps or insufficient free space to add another heap.
static bool addNewHeap(HeapSource *hs)
{
    Heap heap;
    memset(&heap, 0, sizeof(heap));
    // Heap storage comes from a common virtual memory reservation. The new heap will start on the page after the old heap.
    char *base = hs->heaps[0].brk;
    size_t overhead = base - hs->heaps[0].base;
    size_t morecoreStart = SYSTEM_PAGE_SIZE;
    heap.maximumSize = hs->growthLimit - overhead;
    heap.concurrentStartBytes = hs->minFree - CONCURRENT_START;
    heap.base = base;
    heap.limit = heap.base + heap.maximumSize;
    heap.brk = heap.base + morecoreStart;
    remapNewHeap(hs, &heap);
    heap.msp = createMspace(base, morecoreStart, hs->minFree);
    // Don't let the soon-to-be-old heap grow any further
    hs->heaps[0].maximumSize = overhead;
    hs->heaps[0].limit = base;
    mspace_set_footprint_limit(hs->heaps[0].msp, overhead);
    // Put the new heap in the list, at heaps[0]
    memmove(&hs->heaps[1], &hs->heaps[0], hs->numHeaps * sizeof(hs->heaps[0]));
    hs->heaps[0] = heap;
    hs->numHeaps++;
    return true;
}
/*
* A helper for addNewHeap(). Remap the new heap so that it will have
* a separate ashmem region with possibly a different name, etc. In
* practice, this is used to give the app heap a separate ashmem
* region from the zygote heap's.
*/
static bool remapNewHeap(HeapSource* hs, Heap* newHeap)
{
    char* newHeapBase = newHeap->base;
    size_t rem_size = hs->heapBase + hs->heapLength - newHeapBase;
    munmap(newHeapBase, rem_size);
    int fd = ashmem_create_region("dalvik-heap", rem_size);
    void* addr = mmap(newHeapBase, rem_size, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    int ret = close(fd);
    return true;
}