Android ANR -- anr实例1

1. ANR patthern

ANR时候,从trace看主线程是在restoreDefaultApn,具体的操作是在delete,通过binder来实现,查看binderinfo发现是在phone进程里面执行,不过搜索对应线程trace发现,binder线程是空闲的,如Binder_2(trace是网上拷贝不是当时的),不清楚这是问啥,正常情况应该对应线程的trace也在delete()才是。

---pid 17348 at 2018-01-17 16:06:13
Cmd line: con.Android.settings

"main" tid=1 Native
at android.os.BinderProxy.transactNative(Native Method
at android.os.BinderProxy.transact(Binder.java)
at Android.content.ContentProviderProxy.delete ()ContentProviderNative.java
at com.android.settings.ResetNetworkConfirm.restoreDefaultApn(ResetNetworkConfirm.java:158)

---pid 17348 at 2018-01-17 16:06:15
Cmd line: con.Android.phone

"Binder_2" prio=5 tid=8 Native
  | group="main" sCount=1 dsCount=0 obj=0x12f9e0a0 self=0x7f76957400
  | sysTid=22049 nice=0 cgrp=default sched=0/0 handle=0x7f7b92f440
  | state=S schedstat=( 3091875 6286929 20 ) utm=0 stm=0 core=5 HZ=100
  | stack=0x7f7b833000-0x7f7b835000 stackSize=1013KB
  | held mutexes=
  native: pc 0000000000068c88  /system/lib64/libc.so (__ioctl+4)
  native: pc 0000000000072670  /system/lib64/libc.so (ioctl+100)
  native: pc 000000000002d3b4  /system/lib64/libbinder.so (android::IPCThreadState::talkWithDriver(bool)+164)
  native: pc 000000000002dc88  /system/lib64/libbinder.so (android::IPCThreadState::getAndExecuteCommand()+24)
  native: pc 000000000002dda4  /system/lib64/libbinder.so (android::IPCThreadState::joinThreadPool(bool)+76)
  native: pc 00000000000367e8  /system/lib64/libbinder.so (???)
  native: pc 0000000000015954  /system/lib64/libutils.so (android::Thread::_threadLoop(void*)+208)
  native: pc 000000000009392c  /system/lib64/libandroid_runtime.so (android::AndroidRuntime::javaThreadShell(void*)+96)
  native: pc 00000000000151a4  /system/lib64/libutils.so (???)
  native: pc 00000000000660c4  /system/lib64/libc.so (__pthread_start(void*)+52)
  native: pc 000000000001ee84  /system/lib64/libc.so (__start_thread+16)

2.Code Flow

看代码,当时是在恢复设置,包含wifi  bt ims apn等等,很多东西,打印Log会发现这个button执行完的时间大概是2s,TelephonyProvider.delete操作会去持有锁,当快速点击多次后面的就会等锁。这就很容易导致anr

@/packages/apps/Settings/src/com/android/settings/ResetNetworkConfirm.java
private Button.OnClickListener mFinalClickListener = new Button.OnClickListener() {
    @Override
    public void onClick(View v) {
        if (Utils.isMonkeyRunning()) {  //monkey return
            return;
        }

        connectivityManager.factoryReset();
        wifiManager.factoryReset();
        telephonyManager.factoryReset(mSubId);
        policyManager.factoryReset(subscriberId);
        btAdapter.factoryReset();
        ImsManager.factoryReset(context);
        restoreDefaultApn(context);
    }
};
  
184    private void restoreDefaultApn(Context context) {
185        Uri uri = Uri.parse(ApnSettings.RESTORE_CARRIERS_URI);
186
187        if (SubscriptionManager.isUsableSubIdValue(mSubId)) {
188            uri = Uri.withAppendedPath(uri, "subId/" + String.valueOf(mSubId));
189        }
190
191        ContentResolver resolver = context.getContentResolver();
192        resolver.delete(uri, null, null);
193    }


@/packages/providers/TelephonyProvider/src/com/android/providers/telephony/TelephonyProvider.java
@Override
public synchronized int delete(Uri url, String where, String[] whereArgs) {  //synchronized 同步锁
    ...
    switch (match)
    {
    ...
        case URL_RESTOREAPN: {
            count = 1;
            restoreDefaultAPN(subId);
            getContext().getContentResolver().notifyChange(
                    Uri.withAppendedPath(CONTENT_URI, "restore/subId/" + subId), null,
                    true, UserHandle.USER_ALL);
            break;
        }

    return count;
}

3.疑问解答

1.Q:为啥phone进程的binder显示是IDLE,而不是在delete?

A:dump anr trace时,phone进程dump晚了2s钟,此时delete已经完成,因此是空闲的。

4.解决方案

一般anr解决都是降低耗时操作,比如这里phone.delete耗时1s左右,是否可以优化---对比正常机器这个时间不算异常

耗时操作不放在主线程--把onclick里面的操作都放到其他线程---不过这里delete会有一个同步锁,就算开线程也不是很好

最终的解决是降低button的点击频率,比如必须执行完onclick内容才能再次点击

5.本博客的意思

不要犯一看trace就说这里有毛病的结论,比如这里不应该说settings的主线程blocked在ResetNetworkConfirm.restoreDefaultApn,就说这里有问题,然后看到binder对端是phone,就直接把此bug归为phone问题,然后丢过去,anr里面就有这种多次情况还有循环耗时的情况就,但是每次循环大概在0.5s,这种导致,还有就是前面的方法耗时4.8s,最后trace打印的LOG耗时才0.2s这种。

猜你喜欢

转载自blog.csdn.net/wd229047557/article/details/86563329
ANR