Native memory leaks in one step

Virtual memory can run out too

As Android developers, most of us have been through migrating an app from the 32-bit to the 64-bit architecture. The domestic market still has 32-bit requirements, and 32-bit has not been banned outright. One drawback of the 32-bit architecture is that too little virtual memory is available to user space (typically half of the address space is reserved for the kernel, though this is configurable), so running out of virtual memory is a common cause of OOM. After switching to 64-bit, on ARM64 with a 4 KB page size the default virtual address space available to a process is 2^39 bytes. Not all 64 bits can actually be used for addressing, but 2^39 is still far more than a 32-bit process can get, so the shortage of virtual memory is usually alleviated once we move to 64-bit. The interesting part is that it is only alleviated: if your app runs for a long time and keeps leaking virtual memory, it can still hit an OOM from insufficient virtual memory, no matter how large the address space is.
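
To put numbers on the comparison above, here is a small arithmetic sketch. It assumes the 32-bit case gives half of the 4 GiB address space to user space and the ARM64 case gives 2^39 bytes, as described in the text; the names are made up for illustration.

```cpp
#include <cstdint>

// Size in bytes of an address space with the given number of address bits.
constexpr uint64_t addressSpaceBytes(unsigned bits) {
    return uint64_t{1} << bits;
}

// 32-bit: 4 GiB in total, half of it assumed to be reserved for the kernel.
constexpr uint64_t kUser32 = addressSpaceBytes(32) / 2;  // 2 GiB
// ARM64 with 4 KB pages: 39 bits of user address space.
constexpr uint64_t kUser39 = addressSpaceBytes(39);      // 512 GiB
// How many times larger the 64-bit user address space is.
constexpr uint64_t kRatio = kUser39 / kUser32;           // 256
```

A factor of 256 is a lot of headroom, but a leak that never stops will still use it up eventually.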

A typical example is an mmap allocation failure: creating a native thread needs mmap to set up its stack space, and any other mmap call that allocates memory can fail in the same way.

java.lang.OutOfMemoryError: Could not allocate JNI Env: Failed anonymous mmap(0x0, 8192, 0x34, 0x220, -1, 0): Out of memory. See process maps in the log.
        at java.lang.Thread.nativeCreate(Thread.java)
        at java.lang.Thread.start(Thread.java:733)
        at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:975)
        at java.util.concurrent.ThreadPoolExecutor.processWorkerExit(ThreadPoolExecutor.java:1043)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1185)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:641)
        at java.lang.Thread.run(Thread.java:764)
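
One way to see this kind of failure coming is to track the process's virtual size over time. On Linux it is reported as the VmSize field of /proc/<pid>/status. The sketch below only parses that field out of the file's text; parseVmSizeKb is a hypothetical helper name, and reading the file and reporting the value are left out.

```cpp
#include <cstdlib>
#include <sstream>
#include <string>

// Returns the VmSize value in kB from the text of /proc/<pid>/status,
// or -1 if the field is missing or malformed.
long parseVmSizeKb(const std::string& status_text) {
    std::istringstream in(status_text);
    std::string line;
    const std::string key = "VmSize:";
    while (std::getline(in, line)) {
        if (line.compare(0, key.size(), key) != 0) {
            continue;
        }
        const char* start = line.c_str() + key.size();
        char* end = nullptr;
        long value = std::strtol(start, &end, 10);  // skips leading whitespace
        return end != start ? value : -1;
    }
    return -1;
}
```

A value that keeps growing while the app is idle is a hint that something is leaking address space.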

So it is important to find the culprit. In fact, because the ART virtual machine itself was a source of many native memory leaks, the Android team added the libmemunreachable module starting with Android N to detect native memory leaks.

libmemunreachable

MemUnreachable.cpp provides a function for obtaining unreachable memory addresses, GetUnreachableMemory:

bool GetUnreachableMemory(UnreachableMemoryInfo& info, size_t limit)

After calling GetUnreachableMemory, the information about the unreachable addresses is available in info. Printed out, it looks like this:

384 bytes in 9 allocations unreachable out of 20003960 bytes in 40784 allocations
  384 bytes in 9 unreachable allocations
  ABI: 'arm64'
  320 bytes unreachable at 7e879d09c0
  8 bytes unreachable at 7e8d891160
  8 bytes unreachable at 7e8d9fec78
  ....

As you can see, it reports quite a lot, such as the size and the address of each leak. This .so is mostly used by ART for its own self-checks. What if we want to use it ourselves, for example to monitor native memory leaks in our own app? Don't worry, there is a way. Before using it, let's go through how it works. If you are not interested, you can skip straight to the usage section.

Think about it: if we wanted to detect memory leaks, how would we generally do it?

The first step is to check whether memory is reachable, and this step is essential. Take the Java heap as an example: reachability is checked starting from a set of GC roots. If a reference chain can be found from a GC root to an object, the memory is in use; otherwise it is unreachable and will be reclaimed by the virtual machine's GC.

The same applies to the native layer. Reachability is also determined from root memory (roots are memory that is currently in use, such as memory associated with the virtual machine heap, or memory within a thread's stack range), and that decides whether there is a leak. The native decision is actually simpler than the Java one: native memory with no reference chain from a root must be leaked, because the native layer has no GC like Java's. If it is not freed, it stays around forever. So finding leaked memory comes down to finding unreachable memory.
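
The idea can be sketched on a toy heap. This is not libmemunreachable's code: ToyHeap, Block, findBlock and findLeaks are made-up names, and addresses are plain integers. Every word stored in a root or in a reachable block is treated as a possible pointer, and whatever is never reached is reported as leaked.

```cpp
#include <cstdint>
#include <map>
#include <vector>

// A toy allocation: its contents as machine words, plus a mark bit.
struct Block {
    std::vector<uintptr_t> words;
    bool reachable = false;
};

// Toy heap keyed by (fake) start address; a block spans words.size() words.
using ToyHeap = std::map<uintptr_t, Block>;

// Returns the block whose address range contains addr, or heap.end().
ToyHeap::iterator findBlock(ToyHeap& heap, uintptr_t addr) {
    auto it = heap.upper_bound(addr);
    if (it == heap.begin()) return heap.end();
    --it;
    uintptr_t end = it->first + it->second.words.size() * sizeof(uintptr_t);
    return addr < end ? it : heap.end();
}

// Marks everything reachable from the root words and returns the start
// addresses of the blocks that were never reached.
std::vector<uintptr_t> findLeaks(ToyHeap& heap, const std::vector<uintptr_t>& root_words) {
    std::vector<uintptr_t> todo(root_words);
    while (!todo.empty()) {
        uintptr_t value = todo.back();
        todo.pop_back();
        auto it = findBlock(heap, value);
        if (it == heap.end() || it->second.reachable) continue;
        it->second.reachable = true;
        // The block's own contents may point at further blocks.
        for (uintptr_t word : it->second.words) todo.push_back(word);
    }
    std::vector<uintptr_t> leaks;
    for (const auto& entry : heap) {
        if (!entry.second.reachable) leaks.push_back(entry.first);
    }
    return leaks;
}
```

Note that a block that only points at itself is still reported as a leak, because no root leads to it.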


Now let's walk through the source code.

GetUnreachableMemory

bool GetUnreachableMemory(UnreachableMemoryInfo& info, size_t limit) {
    if (info.version > 0) {
        MEM_ALOGE("unsupported UnreachableMemoryInfo.version %zu in GetUnreachableMemory",
                  info.version);
        return false;
    }

    int parent_pid = getpid();
    int parent_tid = gettid();

    Heap heap;

    AtomicState<State> state(STARTING);
    LeakPipe pipe;

    PtracerThread thread{[&]() -> int {
        /////////////////////////////////////////////
        // Collection thread
        /////////////////////////////////////////////
        MEM_ALOGI("collecting thread info for process %d...", parent_pid);

        if (!state.transition_or(STARTING, PAUSING, [&] {
            MEM_ALOGI("collecting thread expected state STARTING, aborting");
            return ABORT;
        })) {
            return 1;
        }

        ThreadCapture thread_capture(parent_pid, heap);
        allocator::vector<ThreadInfo> thread_info(heap);
        allocator::vector<Mapping> mappings(heap);
        allocator::vector<uintptr_t> refs(heap);
        // The steps below are mostly self-checks; any failure aborts the collection
        // ptrace all the threads
        if (!thread_capture.CaptureThreads()) {
            state.set(ABORT);
            return 1;
        }

        // collect register contents and stacks
        if (!thread_capture.CapturedThreadInfo(thread_info)) {
            state.set(ABORT);
            return 1;
        }

        // snapshot /proc/pid/maps
        if (!ProcessMappings(parent_pid, mappings)) {
            state.set(ABORT);
            return 1;
        }

        if (!BinderReferences(refs)) {
            state.set(ABORT);
            return 1;
        }

        // Atomically update the state from PAUSING to COLLECTING.
        // The main thread may have given up waiting for this thread to finish
        // pausing, in which case it will have changed the state to ABORT.
        if (!state.transition_or(PAUSING, COLLECTING, [&] {
            MEM_ALOGI("collecting thread aborting");
            return ABORT;
        })) {
            return 1;
        }

        // malloc must be enabled to call fork, at_fork handlers take the same
        // locks as ScopedDisableMalloc.  All threads are paused in ptrace, so
        // memory state is still consistent.  Unfreeze the original thread so it
        // can drop the malloc locks, it will block until the collection thread
        // exits.
        thread_capture.ReleaseThread(parent_tid);
       
        // The heap walk takes time, so a child process is forked to do the detection
        // fork a process to do the heap walking
        int ret = fork();
        if (ret < 0) {
            return 1;
        } else if (ret == 0) {
            /////////////////////////////////////////////
            // Heap walker process
            /////////////////////////////////////////////
            // Examine memory state in the child using the data collected above and
            // the CoW snapshot of the process memory contents.

            if (!pipe.OpenSender()) {
                _exit(1);
            }

            MemUnreachable unreachable{parent_pid, heap};
            // This is the key step where the analysis starts; note the arguments,
            // they are where the roots come from
            if (!unreachable.CollectAllocations(thread_info, mappings, refs)) {
                _exit(2);
            }
            size_t num_allocations = unreachable.Allocations();
            size_t allocation_bytes = unreachable.AllocationBytes();

            allocator::vector<Leak> leaks{heap};

            size_t num_leaks = 0;
            size_t leak_bytes = 0;
            // With the roots configured above, start the search with GetUnreachableMemory
            bool ok = unreachable.GetUnreachableMemory(leaks, limit, &num_leaks, &leak_bytes);
            // When detection is done, send the results to the parent process through the pipe
            ok = ok && pipe.Sender().Send(num_allocations);
            ok = ok && pipe.Sender().Send(allocation_bytes);
            ok = ok && pipe.Sender().Send(num_leaks);
            ok = ok && pipe.Sender().Send(leak_bytes);
            ok = ok && pipe.Sender().SendVector(leaks);

            if (!ok) {
                _exit(3);
            }

            _exit(0);
        } else {
            // Nothing left to do in the collection thread, return immediately,
            // releasing all the captured threads.
            MEM_ALOGI("collection thread done");
            return 0;
        }
    }};

    /////////////////////////////////////////////
    // Original thread
    /////////////////////////////////////////////

    {
        // Disable malloc to get a consistent view of memory
        ScopedDisableMalloc disable_malloc;

        // Start the collection thread
        thread.Start();
        // If the wait below times out, the state is switched to ABORT
        // Wait for the collection thread to signal that it is ready to fork the
        // heap walker process.
        if (!state.wait_for_either_of(COLLECTING, ABORT, 30s)) {
            // The pausing didn't finish within 30 seconds, attempt to atomically
            // update the state from PAUSING to ABORT.  The collecting thread
            // may have raced with the timeout and already updated the state to
            // COLLECTING, in which case aborting is not necessary.
            if (state.transition(PAUSING, ABORT)) {
                MEM_ALOGI("main thread timed out waiting for collecting thread");
            }
        }

        // Re-enable malloc so the collection thread can fork.
    }

    // Wait for the collection thread to exit
    int ret = thread.Join();
    if (ret != 0) {
        return false;
    }

    // Get a pipe from the heap walker process.  Transferring a new pipe fd
    // ensures no other forked processes can have it open, so when the heap
    // walker process dies the remote side of the pipe will close.
    if (!pipe.OpenReceiver()) {
        return false;
    }
    // Receive the data prepared by the child process through the pipe, then return
    bool ok = true;
    ok = ok && pipe.Receiver().Receive(&info.num_allocations);
    ok = ok && pipe.Receiver().Receive(&info.allocation_bytes);
    ok = ok && pipe.Receiver().Receive(&info.num_leaks);
    ok = ok && pipe.Receiver().Receive(&info.leak_bytes);
    ok = ok && pipe.Receiver().ReceiveVector(info.leaks);
    if (!ok) {
        return false;
    }

    MEM_ALOGI("unreachable memory detection done");
    MEM_ALOGE("%zu bytes in %zu allocation%s unreachable out of %zu bytes in %zu allocation%s",
              info.leak_bytes, info.num_leaks, plural(info.num_leaks), info.allocation_bytes,
              info.num_allocations, plural(info.num_allocations));
    return true;
}

GetUnreachableMemory is the entry point. It detects leaks in a forked child process. The reason is that the current process keeps allocating memory, and analyzing it in place would block it, since the analysis involves suspending threads. So the heap walk runs in the child, and once the child has finished, it writes the results back through a pipe. The method to focus on here is CollectAllocations.
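
The fork-and-pipe pattern itself can be shown in isolation with plain POSIX calls. The sketch below is a simplified stand-in, not the LeakPipe class: sumInChild is a made-up function whose child process does some work on its copy-on-write view of the parent's memory and sends one value back.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <vector>

// Sums `data` in a forked child and stores the result read from a pipe in
// *out. Returns false if a system call fails or the child reports an error.
bool sumInChild(const std::vector<long>& data, long* out) {
    int fds[2];
    if (pipe(fds) != 0) return false;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return false;
    }
    if (pid == 0) {
        // Child: works on a snapshot of the parent's memory, so the parent
        // can keep running (and allocating) in the meantime.
        close(fds[0]);
        long sum = 0;
        for (long v : data) sum += v;
        ssize_t written = write(fds[1], &sum, sizeof(sum));
        _exit(written == static_cast<ssize_t>(sizeof(sum)) ? 0 : 1);
    }

    // Parent: read the result, then reap the child.
    close(fds[1]);
    long sum = 0;
    ssize_t got = read(fds[0], &sum, sizeof(sum));
    close(fds[0]);
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) return false;
    if (got != static_cast<ssize_t>(sizeof(sum))) return false;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0) return false;
    *out = sum;
    return true;
}
```

The child leaves with _exit rather than exit, as the source above does, so it does not run the parent's atexit handlers.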

Root object

Before looking at CollectAllocations, we need to know how roots are added. As said above, memory that cannot be reached through a reference chain from a root is leaked memory, so the choice of roots is critical. Roots are added with the following methods:

void HeapWalker::Root(uintptr_t begin, uintptr_t end) {
    roots_.push_back(Range{begin, end});
}

void HeapWalker::Root(const allocator::vector<uintptr_t>& vals) {
    root_vals_.insert(root_vals_.end(), vals.begin(), vals.end());
}

From this we can guess what CollectAllocations does next: it adds the roots, after which detection can be triggered.

CollectAllocations

bool MemUnreachable::CollectAllocations(const allocator::vector<ThreadInfo>& threads,
                                        const allocator::vector<Mapping>& mappings,
                                        const allocator::vector<uintptr_t>& refs) {
                                       
    MEM_ALOGI("searching process %d for allocations", pid_);

    for (auto it = mappings.begin(); it != mappings.end(); it++) {
        heap_walker_.Mapping(it->begin, it->end);
    }
    // Another self-check: give up if the mappings cannot be classified
    allocator::vector<Mapping> heap_mappings{mappings};
    allocator::vector<Mapping> anon_mappings{mappings};
    allocator::vector<Mapping> globals_mappings{mappings};
    allocator::vector<Mapping> stack_mappings{mappings};
    if (!ClassifyMappings(mappings, heap_mappings, anon_mappings, globals_mappings, stack_mappings)) {
        return false;
    }
 
    for (auto it = heap_mappings.begin(); it != heap_mappings.end(); it++) {
        MEM_ALOGV("Heap mapping %" PRIxPTR "-%" PRIxPTR " %s", it->begin, it->end, it->name);
       
        HeapIterate(*it,
                    [&](uintptr_t base, size_t size) { heap_walker_.Allocation(base, base + size); });
    }

    for (auto it = anon_mappings.begin(); it != anon_mappings.end(); it++) {
        MEM_ALOGV("Anon mapping %" PRIxPTR "-%" PRIxPTR " %s", it->begin, it->end, it->name);
        // Record the address range as an allocation
        heap_walker_.Allocation(it->begin, it->end);
    }

    for (auto it = globals_mappings.begin(); it != globals_mappings.end(); it++) {
        MEM_ALOGV("Globals mapping %" PRIxPTR "-%" PRIxPTR " %s", it->begin, it->end, it->name);
        // Set the mapping's address range as a root
        heap_walker_.Root(it->begin, it->end);
    }

    for (auto thread_it = threads.begin(); thread_it != threads.end(); thread_it++) {
        for (auto it = stack_mappings.begin(); it != stack_mappings.end(); it++) {
            if (thread_it->stack.first >= it->begin && thread_it->stack.first <= it->end) {
                MEM_ALOGV("Stack %" PRIxPTR "-%" PRIxPTR " %s", thread_it->stack.first, it->end, it->name);
                // The stack range of each live thread is a root
                heap_walker_.Root(thread_it->stack.first, it->end);
            }
        }
        
        heap_walker_.Root(thread_it->regs);
    }
    // The addresses collected in refs (filled by BinderReferences earlier) are roots too
    heap_walker_.Root(refs);

    MEM_ALOGI("searching done");

    return true;
}
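
The mappings classified above come from a snapshot of /proc/<pid>/maps, where each line has the form `start-end perms offset dev inode [name]`. Below is a minimal sketch of parsing one such line. MapEntry and parseMapsLine are illustrative names and not the library's ProcessMappings implementation.

```cpp
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <string>

// One parsed line of /proc/<pid>/maps (illustrative type).
struct MapEntry {
    uint64_t begin = 0;
    uint64_t end = 0;
    bool read = false;
    bool write = false;
    bool execute = false;
    std::string name;  // empty for unnamed mappings
};

// Parses "start-end perms offset dev inode [name]". Returns false if the
// line does not start with an address range and a permission field.
bool parseMapsLine(const char* line, MapEntry* out) {
    uint64_t begin = 0;
    uint64_t end = 0;
    char perms[5] = {0};
    int name_pos = 0;  // set by %n to the offset where the name starts
    int matched = std::sscanf(line, "%" SCNx64 "-%" SCNx64 " %4s %*x %*x:%*x %*u %n",
                              &begin, &end, perms, &name_pos);
    if (matched != 3 || std::strlen(perms) != 4) return false;

    out->begin = begin;
    out->end = end;
    out->read = perms[0] == 'r';
    out->write = perms[1] == 'w';
    out->execute = perms[2] == 'x';
    out->name.clear();
    if (name_pos > 0) {
        out->name = line + name_pos;
        // Drop the trailing newline and any padding.
        while (!out->name.empty() &&
               (out->name.back() == '\n' || out->name.back() == ' ')) {
            out->name.pop_back();
        }
    }
    return true;
}
```

The name field is what lets a classifier tell heap, stack and globals mappings apart.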



DetectLeaks

Once the roots are configured, control returns to the child-process logic in GetUnreachableMemory, which contains this line: bool ok = unreachable.GetUnreachableMemory(leaks, limit, &num_leaks, &leak_bytes);

With the roots in place, this call triggers leak detection, which is done by DetectLeaks:

bool HeapWalker::DetectLeaks() {
    // Recursively walk pointers from roots to mark referenced allocations
    for (auto it = roots_.begin(); it != roots_.end(); it++) {
        // Look for reference chains that start at this root
        RecurseRoot(*it);
    }

    Range vals;
    vals.begin = reinterpret_cast<uintptr_t>(root_vals_.data());
    vals.end = vals.begin + root_vals_.size() * sizeof(uintptr_t);

    RecurseRoot(vals);

    if (segv_page_count_ > 0) {
        MEM_ALOGE("%zu pages skipped due to segfaults", segv_page_count_);
    }

    return true;
}

RecurseRoot follows the reference chain from a root and marks every address it reaches:

void HeapWalker::RecurseRoot(const Range& root) {
  allocator::vector<Range> to_do(1, root, allocator_);
  while (!to_do.empty()) {
    Range range = to_do.back();
    to_do.pop_back();

    walking_range_ = range;
    ForEachPtrInRange(range, [&](Range& ref_range, AllocationInfo* ref_info) {
      if (!ref_info->referenced_from_root) {
        // A pointer found in a reachable range proves that the target is still referenced, so mark it true
        ref_info->referenced_from_root = true;
        to_do.push_back(ref_range);
      }
    });
    walking_range_ = Range{0, 0};
  }
}

What follows is writing out the leaked addresses, which was already covered in the source above, so we won't go into it again.

Using libmemunreachable

Although it is a system .so, nothing stops us from using it to obtain the leaked memory. We only need to look up the right symbol with dlsym and call it. The function we call is GetUnreachableMemoryString, which returns the report as a string.

Its symbol differs between Android versions.

Above API 26 the symbol is

_ZN7android26GetUnreachableMemoryStringEbm

From API 24 up to and including API 26 (following the version check in the code below), the symbol is

_Z26GetUnreachableMemoryStringbm
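
To see what these two names stand for, they can be demangled with abi::__cxa_demangle from <cxxabi.h>, which is available with GCC and Clang. The demangle wrapper below is just a convenience helper written for this example.

```cpp
#include <cxxabi.h>
#include <cstdlib>
#include <string>

// Returns the human-readable form of an Itanium-ABI mangled name, or the
// input unchanged if it cannot be demangled.
std::string demangle(const char* mangled) {
    int status = 0;
    char* readable = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || readable == nullptr) {
        return mangled;
    }
    std::string result(readable);
    std::free(readable);  // the buffer is allocated with malloc
    return result;
}
```

The first symbol demangles to android::GetUnreachableMemoryString(bool, unsigned long) and the second to GetUnreachableMemoryString(bool, unsigned long), so the only difference is the android namespace. The trailing `m` encodes unsigned long, which is what size_t is on 64-bit builds.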

So we can call it directly through its symbol. Because dlopen has been restricted since Android 7, here we simply use shadowhook_dlopen to open the library (there are other ways too, such as making the call look as if it came from a system library; an earlier article covered that, so we won't repeat it here).

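// Open the system library; a plain dlopen is restricted for this path
void *handle = shadowhook_dlopen("libmemunreachable.so");
if (handle == nullptr) {
    return std::string();
}
void *func;
if (android_get_device_api_level() > __ANDROID_API_O__) {
    func = shadowhook_dlsym(handle,
                            "_ZN7android26GetUnreachableMemoryStringEbm");
} else {
    func = shadowhook_dlsym(handle,
                            "_Z26GetUnreachableMemoryStringbm");
}
if (func == nullptr) {
    // Symbol not found on this device; nothing to report
    return std::string();
}

// Arguments: do not log the leaked contents, report at most 1024 leaks
std::string result = ((std::string (*)(bool , size_t )) func)(false, 1024);
__android_log_print(ANDROID_LOG_ERROR, "hello", "%s", result.c_str());
return result;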

Before calling this function we also need to set the dumpable flag to 1 with prctl. The analysis relies on ptrace, so this flag is required:

if (prctl(PR_SET_DUMPABLE, 1, 0, 0, 0) == -1) {
    return unreachable_mem;
}
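
If you would rather not leave the flag changed after the check, one option is a small scope guard. This is only a sketch of one possible design: ScopedDumpable is a made-up helper that sets the flag to 1 and puts it back to 0 on destruction if it was 0 before.

```cpp
#include <sys/prctl.h>

// Sets the dumpable flag to 1 for the lifetime of the object and restores
// it to 0 afterwards if that was the previous value.
class ScopedDumpable {
public:
    ScopedDumpable() : previous_(prctl(PR_GET_DUMPABLE, 0, 0, 0, 0)) {
        ok_ = prctl(PR_SET_DUMPABLE, 1, 0, 0, 0) == 0;
    }
    ~ScopedDumpable() {
        if (ok_ && previous_ == 0) {
            prctl(PR_SET_DUMPABLE, 0, 0, 0, 0);
        }
    }
    ScopedDumpable(const ScopedDumpable&) = delete;
    ScopedDumpable& operator=(const ScopedDumpable&) = delete;

    // True if the flag could be set to 1.
    bool ok() const { return ok_; }

private:
    int previous_;
    bool ok_ = false;
};
```

The guard would be declared right before the call into libmemunreachable, so that the flag is 1 for the whole time ptrace is in use.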

Since what we get back is a single string, if we only want the size and address of each leak, we need to extract them with a regular expression. The content looks like this:

384 bytes in 9 allocations unreachable out of 20003960 bytes in 40784 allocations
  384 bytes in 9 unreachable allocations
  ABI: 'arm64'
  320 bytes unreachable at 7e879d09c0
  8 bytes unreachable at 7e8d891160
  8 bytes unreachable at 7e8d9fec78
  ....

For example, if the line we care about is 320 bytes unreachable at 7e879d09c0, the values 320 and 7e879d09c0 can be extracted with the following code:

regex_t reg;
regmatch_t match[1];
// Matches the lines we care about, e.g. "320 bytes unreachable at 7e879d09c0"
const char *pattern = "[0-9]+ bytes unreachable at [A-Za-z0-9]+";

if (regcomp(&reg, pattern, REG_EXTENDED) != 0) {
    printf("regcomp error\n");
    return 1;
}


while (regexec(&reg, unreachable_memory, 1, match, 0) == 0) {
    // rm_so and rm_eo are the start and end offsets of the match
    int len = (int) (match[0].rm_eo - match[0].rm_so);
    __android_log_print(ANDROID_LOG_ERROR, "hello",
                        "Match found at position %d, length %d: %.*s\n", (int) match[0].rm_so,
                        len, len, unreachable_memory + match[0].rm_so);
    char result[100] = {""};
    // Copy the matched line, leaving room for the terminating NUL
    if (len >= (int) sizeof(result)) len = (int) sizeof(result) - 1;
    strncpy(result, unreachable_memory + match[0].rm_so, len);
    __android_log_print(ANDROID_LOG_ERROR, "hello", "trimmed string is %s", result);
    // Only the numbers matter: the size at the start, the address after the last space
    unsigned long addr = strtoul(strrchr(result, ' ') + 1, NULL, 16);
    unsigned long size = strtoul(result, NULL, 10);
    __android_log_print(ANDROID_LOG_ERROR, "hello", "parsed size %lu addr %lu", size, addr);
    unreachable_memory += match[0].rm_eo;

    uint64_t leak = addr + size;
    __android_log_print(ANDROID_LOG_ERROR, "hello", "leak is %lu", leak);

}
regfree(&reg);
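
For comparison, the same extraction can be written with std::regex and capture groups, which avoids the manual buffer handling. This is an alternative sketch and not the code used above; LeakRecord and parseLeaks are names invented for the example.

```cpp
#include <cstdint>
#include <regex>
#include <string>
#include <vector>

// One "N bytes unreachable at ADDR" line of the report.
struct LeakRecord {
    uint64_t size;
    uint64_t address;
};

// Extracts every leak line from the report text.
std::vector<LeakRecord> parseLeaks(const std::string& report) {
    static const std::regex line_re("([0-9]+) bytes unreachable at ([0-9A-Fa-f]+)");
    std::vector<LeakRecord> leaks;
    std::sregex_iterator it(report.begin(), report.end(), line_re);
    std::sregex_iterator end;
    for (; it != end; ++it) {
        LeakRecord record;
        record.size = std::stoull((*it)[1].str(), nullptr, 10);     // decimal size
        record.address = std::stoull((*it)[2].str(), nullptr, 16);  // hex address
        leaks.push_back(record);
    }
    return leaks;
}
```

The summary lines at the top of the report do not contain "bytes unreachable at", so they are skipped without any extra filtering.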

Summary

At this point we can find the address and size of leaked memory through libmemunreachable. That may not be enough, for example if we also want the stack of the allocation that leaked. For that we need to hook allocation functions such as malloc and mmap. I won't cover it here; I'll fill in this gap when I get the chance!


Origin blog.csdn.net/Eqiqi/article/details/131898094