First time analyzing a system problem (reboot): summary notes

For this problem, the kernel log contained no useful clues.

1. In the ZZ_INTERNAL file I found:

Kernel (KE),0,0,99,/data/core/,0,,KE at __schedule_bug+0x98/0xd8,Wed Dec  6 17:55:16 CST 2017,1

The record is a kernel exception (KE), so the problem most likely lies in kernel code, probably in a driver.
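The relevant parts of that record can be pulled out mechanically. The sketch below is only an illustration: the function name `parse_zz_internal` is hypothetical, the meaning of the other comma-separated fields is not documented here and is not interpreted, and the only assumptions are that the first field is the exception class and that the `KE at symbol+offset/size` text appears somewhere in the line.

```python
import re

# The ZZ_INTERNAL record quoted above.
ZZ_INTERNAL = ("Kernel (KE),0,0,99,/data/core/,0,,"
               "KE at __schedule_bug+0x98/0xd8,Wed Dec  6 17:55:16 CST 2017,1")

# "KE at <symbol>+0x<offset>/0x<size>"
KE_AT_RE = re.compile(r"KE at ([\w.]+)\+0x([0-9a-f]+)/0x([0-9a-f]+)")


def parse_zz_internal(line):
    """Return the exception class and faulting symbol of a ZZ_INTERNAL line.

    Assumption: the first comma-separated field is the exception class.
    """
    result = {"class": line.split(",")[0], "symbol": None,
              "offset": None, "size": None}
    match = KE_AT_RE.search(line)
    if match:
        result["symbol"] = match.group(1)
        result["offset"] = int(match.group(2), 16)
        result["size"] = int(match.group(3), 16)
    return result


if __name__ == "__main__":
    print(parse_zz_internal(ZZ_INTERNAL))
```

The symbol `__schedule_bug` is a scheduler function, which already hints that the fault is in kernel space rather than in an application.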

2. Use GAT to open mtklog\aee_exp_backup\db.fatal.00.KE\db.fatal.00.KE.dbg

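LogViewer is required to open this file. There are two ways to start LogViewer:

*. Open the GAT tool -> Window -> Open LogViewer

*. Run the script: <GAT tool directory>\tools\MediatekLogView.bat

If GAT fails to start because jvm.dll cannot be found, the cause may be a version mismatch, for example a 32-bit GAT with a 64-bit JDK. Both must be 32-bit, or both 64-bit.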

Opening db.fatal.00.KE.dbg shows only a few files:

ZZ_INTERNAL

SYS_KERNEL_LOG

SYS_VERSION_INFO

__exp_main.txt

__exp_main.txt gives the following information. It takes some experience to pinpoint the problem from it.

Exception Class: Kernel (KE)
PC is at [<ffffffc0000baefc>] __schedule_bug+0x98/0xd8
LR is at [<ffffffc0000baef4>] __schedule_bug+0x90/0xd8

Current Executing Process:
[d.process.media, 1068][main, 214]

Backtrace:
[<ffffffc00093a728>] __do_kernel_fault.part.5+0x70/0x84
[<ffffffc000091fd4>] do_page_fault+0x250/0x254
[<ffffffc000092094>] do_translation_fault+0xbc/0xf0    
[<ffffffc000081290>] do_mem_abort+0x38/0x9c    
[<ffffffc000083c50>] el1_da+0x14/0x80  
[<ffffffc000941054>] __schedule+0x63c/0x70c    
[<ffffffc000941148>] schedule+0x24/0x74
[<ffffffc0009411a8>] schedule_preempt_disabled+0x10/0x24       
[<ffffffc0000d9474>] mutex_optimistic_spin+0x184/0x1c8
[<ffffffc000942910>] __mutex_lock_slowpath+0x38/0x164  
[<ffffffc000942a80>] mutex_lock+0x44/0x64      
[<ffffffc000321bd4>] pinctrl_get_device_gpio_range+0x4c/0xf8   
[<ffffffc000321c9c>] pinctrl_gpio_direction+0x1c/0x8c  
[<ffffffc000323120>] pinctrl_gpio_direction_input+0xc/0x18     
[<ffffffc000327720>] mtk_gpio_direction_input+0x10/0x1c
[<ffffffc00032b780>] gpiod_direction_input+0x44/0x18c  
[<ffffffc0005c279c>] i2c_transfer_byte.isra.2+0xa4/0x118       
[<ffffffc0005c2c28>] i2c_write_buf.constprop.9+0x34/0x78       
[<ffffffc0005c2ddc>] led_time_hrtimer_func+0x60/0x108  
[<ffffffc0000f35a4>] __run_hrtimer+0x78/0x260  
[<ffffffc0000f4318>] hrtimer_interrupt+0x108/0x2a0     
[<ffffffc0006e1488>] arch_timer_handler_phys+0x28/0x38
[<ffffffc0000e72b8>] handle_percpu_devid_irq+0xa4/0x1a8
[<ffffffc0000e354c>] generic_handle_irq+0x30/0x4c      
[<ffffffc0000e35ec>] __handle_domain_irq+0x84/0xf0     
[<ffffffc000081418>] gic_handle_irq+0x34/0x80
 

SYS_KERNEL_LOG contained no useful information.

SYS_VERSION_INFO contained no useful information.

A more direct hint was found in mtklog\mobilelog\APLog_2017_1206_175544\last_kmsg:

[ 2133.862246] -(2)[1068:d.process.media]BUG: scheduling while atomic: d.process.media/1068/0x00010001================================================================================================
[ 2133.863368] Preemption disabled at:[<ffffffc0000e35c4>] __handle_domain_irq+0x5c/0xf0
[ 2133.864344] -(2)[1068:d.process.media]
[ 2133.864815] -(2)[1068:d.process.media]CPU: 2 PID: 1068 Comm: d.process.media Tainted: G        W      3.18.22+ #1
[ 2133.866091] -(2)[1068:d.process.media]Hardware name: MT8163 (DT)
[ 2133.866839] -(2)[1068:d.process.media]Call trace:
[ 2133.867431] -(2)[1068:d.process.media][<ffffffc000087c28>] dump_backtrace+0x0/0x124
[ 2133.868382] -(2)[1068:d.process.media][<ffffffc000087d5c>] show_stack+0x10/0x1c
[ 2133.869294] -(2)[1068:d.process.media][<ffffffc00093ba40>] dump_stack+0x74/0xb8
[ 2133.870205] -(2)[1068:d.process.media][<ffffffc0000baec0>] __schedule_bug+0x5c/0xd8
[ 2133.871159] -(2)[1068:d.process.media][<ffffffc000941054>] __schedule+0x63c/0x70c
[ 2133.872092] -(2)[1068:d.process.media][<ffffffc000941148>] schedule+0x24/0x74
[ 2133.872981] -(2)[1068:d.process.media][<ffffffc0009411a8>] schedule_preempt_disabled+0x10/0x24
[ 2133.874055] -(2)[1068:d.process.media][<ffffffc0000d9474>] mutex_optimistic_spin+0x184/0x1c8
[ 2133.875106] -(2)[1068:d.process.media][<ffffffc000942910>] __mutex_lock_slowpath+0x38/0x164
[ 2133.876146] -(2)[1068:d.process.media][<ffffffc000942a80>] mutex_lock+0x44/0x64
[ 2133.877059] -(2)[1068:d.process.media][<ffffffc000321bd4>] pinctrl_get_device_gpio_range+0x4c/0xf8
[ 2133.878176] -(2)[1068:d.process.media][<ffffffc000321c9c>] pinctrl_gpio_direction+0x1c/0x8c
[ 2133.879217] -(2)[1068:d.process.media][<ffffffc000323120>] pinctrl_gpio_direction_input+0xc/0x18
[ 2133.880314] -(2)[1068:d.process.media][<ffffffc000327720>] mtk_gpio_direction_input+0x10/0x1c
[ 2133.881375] -(2)[1068:d.process.media][<ffffffc00032b780>] gpiod_direction_input+0x44/0x18c
[ 2133.882418] -(2)[1068:d.process.media][<ffffffc0005c279c>] i2c_transfer_byte.isra.2+0xa4/0x118
[ 2133.883491] -(2)[1068:d.process.media][<ffffffc0005c2c28>] i2c_write_buf.constprop.9+0x34/0x78
[ 2133.884565] -(2)[1068:d.process.media][<ffffffc0005c2ddc>] led_time_hrtimer_func+0x60/0x108
[ 2133.885608] -(2)[1068:d.process.media][<ffffffc0000f35a4>] __run_hrtimer+0x78/0x260
[ 2133.886561] -(2)[1068:d.process.media][<ffffffc0000f4318>] hrtimer_interrupt+0x108/0x2a0
[ 2133.887572] -(2)[1068:d.process.media][<ffffffc0006e1488>] arch_timer_handler_phys+0x28/0x38
[ 2133.888624] -(2)[1068:d.process.media][<ffffffc0000e72b8>] handle_percpu_devid_irq+0xa4/0x1a8
[ 2133.889685] -(2)[1068:d.process.media][<ffffffc0000e354c>] generic_handle_irq+0x30/0x4c
[ 2133.890682] -(2)[1068:d.process.media][<ffffffc0000e35ec>] __handle_domain_irq+0x84/0xf0
[ 2133.891690] -(2)[1068:d.process.media][<ffffffc000081418>] gic_handle_irq+0x34/0x80
[ 2133.892643] -(2)[1068:d.process.media]Exception stack(0xffffffc04c5ebeb0 to 0xffffffc04c5ebfd0)




[ 2133.910735] -(2)[1068:d.process.media]Internal error: Oops: 96000046 [#1] PREEMPT SMP=====================================================================================================
[ 2133.911711] disable aee kernel api
[ 2133.912116] -(2)[1068:d.process.media]CPU: 2 PID: 1068 Comm: d.process.media Tainted: G        W      3.18.22+ #1
[ 2133.913413] -(2)[1068:d.process.media]Hardware name: MT8163 (DT)
[ 2133.914164] -(2)[1068:d.process.media]task: ffffffc04c6aae00 ti: ffffffc04c5e8000 task.ti: ffffffc04c5e8000
[ 2133.915383] -(2)[1068:d.process.media]PC is at __schedule_bug+0x98/0xd8======================================================================
[ 2133.916204] -(2)[1068:d.process.media]LR is at __schedule_bug+0x90/0xd8======================================================================

[ 2133.917028] -(2)[1068:d.process.media]pc : [<ffffffc0000baefc>] lr : [<ffffffc0000baef4>] pstate: 800001c5
[ 2133.918229] -(2)[1068:d.process.media]sp : ffffffc04c5eb9d0




[ 2134.074609] -(2)[1068:d.process.media]Exception stack(0xffffffc04c5eb810 to 0xffffffc04c5eb930)
[ 2134.075696] -(2)[1068:d.process.media]b800:                                     7e791800 ffffffc0 4c5e8000 ffffffc0
[ 2134.076998] -(2)[1068:d.process.media]b820: 4c5eb9d0 ffffffc0 000baefc ffffffc0 000161f1 00000000 000e1370 ffffffc0
[ 2134.078299] -(2)[1068:d.process.media]b840: 4c5eb8a0 ffffffc0 000e1940 ffffffc0 00dc5000 ffffffc0 00000001 00000000
[ 2134.079601] -(2)[1068:d.process.media]b860: 00cef000 ffffffc0 00000000 00000000 000001c0 00000000 0000005c 00000000
[ 2134.080902] -(2)[1068:d.process.media]b880: 00000000 00000000 00000076 00000000 4c5eb8a0 ffffffc0 000e1950 ffffffc0
[ 2134.082204] -(2)[1068:d.process.media]b8a0: 4c5eb940 ffffffc0 0093aa04 ffffffc0 0000dead 00000000 00000aee 00000000
[ 2134.083505] -(2)[1068:d.process.media]b8c0: 00010001 00000000 00000002 00000000 00000001 00000000 00dc2be0 ffffffc0
[ 2134.084807] -(2)[1068:d.process.media]b8e0: 0000d124 00000000 00000086 00000000 000161f1 00000000 00002aa2 00000000
[ 2134.086109] -(2)[1068:d.process.media]b900: 756c6961 61206572 682f2074 2f656d6f 65646f63 38746d2f 5f333631 2f302e33
[ 2134.087409] -(2)[1068:d.process.media]b920: 6e72656b 332d6c65 00000020 00000000
[ 2134.088321] -(2)[1068:d.process.media][<ffffffc000083c50>] el1_da+0x14/0x80
[ 2134.089190] -(2)[1068:d.process.media][<ffffffc000941054>] __schedule+0x63c/0x70c
[ 2134.090122] -(2)[1068:d.process.media][<ffffffc000941148>] schedule+0x24/0x74
[ 2134.091011] -(2)[1068:d.process.media][<ffffffc0009411a8>] schedule_preempt_disabled+0x10/0x24
[ 2134.092084] -(2)[1068:d.process.media][<ffffffc0000d9474>] mutex_optimistic_spin+0x184/0x1c8
[ 2134.093136] -(2)[1068:d.process.media][<ffffffc000942910>] __mutex_lock_slowpath+0x38/0x164
[ 2134.094177] -(2)[1068:d.process.media][<ffffffc000942a80>] mutex_lock+0x44/0x64
[ 2134.095090] -(2)[1068:d.process.media][<ffffffc000321bd4>] pinctrl_get_device_gpio_range+0x4c/0xf8
[ 2134.096207] -(2)[1068:d.process.media][<ffffffc000321c9c>] pinctrl_gpio_direction+0x1c/0x8c
[ 2134.097248] -(2)[1068:d.process.media][<ffffffc000323120>] pinctrl_gpio_direction_input+0xc/0x18
[ 2134.098345] -(2)[1068:d.process.media][<ffffffc000327720>] mtk_gpio_direction_input+0x10/0x1c
[ 2134.099406] -(2)[1068:d.process.media][<ffffffc00032b780>] gpiod_direction_input+0x44/0x18c
[ 2134.100448] -(2)[1068:d.process.media][<ffffffc0005c279c>] i2c_transfer_byte.isra.2+0xa4/0x118
[ 2134.101521] -(2)[1068:d.process.media][<ffffffc0005c2c28>] i2c_write_buf.constprop.9+0x34/0x78
[ 2134.102595] -(2)[1068:d.process.media][<ffffffc0005c2ddc>] led_time_hrtimer_func+0x60/0x108
[ 2134.103637] -(2)[1068:d.process.media][<ffffffc0000f35a4>] __run_hrtimer+0x78/0x260
[ 2134.104591] -(2)[1068:d.process.media][<ffffffc0000f4318>] hrtimer_interrupt+0x108/0x2a0
[ 2134.105602] -(2)[1068:d.process.media][<ffffffc0006e1488>] arch_timer_handler_phys+0x28/0x38
[ 2134.106653] -(2)[1068:d.process.media][<ffffffc0000e72b8>] handle_percpu_devid_irq+0xa4/0x1a8
[ 2134.107715] -(2)[1068:d.process.media][<ffffffc0000e354c>] generic_handle_irq+0x30/0x4c
[ 2134.108712] -(2)[1068:d.process.media][<ffffffc0000e35ec>] __handle_domain_irq+0x84/0xf0
[ 2134.109721] -(2)[1068:d.process.media][<ffffffc000081418>] gic_handle_irq+0x34/0x80





[ 2134.126112] -(2)[1068:d.process.media]has_mt_dump_support: no mt_dump support!======================================================================================================
[ 2134.127011] -(1)[1709:Thread-1365]CPU1: stopping
[ 2134.127585] -(1)[1709:Thread-1365]CPU: 1 PID: 1709 Comm: Thread-1365 Tainted: G        W      3.18.22+ #1
[ 2134.128774] -(1)[1709:Thread-1365]Hardware name: MT8163 (DT)
[ 2134.129479] -(1)[1709:Thread-1365]Call trace:
[ 2134.130029] -(1)[1709:Thread-1365][<ffffffc000087c28>] dump_backtrace+0x0/0x124
[ 2134.130935] -(1)[1709:Thread-1365][<ffffffc000087d5c>] show_stack+0x10/0x1c
[ 2134.131804] -(1)[1709:Thread-1365][<ffffffc00093ba40>] dump_stack+0x74/0xb8
[ 2134.132670] -(1)[1709:Thread-1365][<ffffffc00008fb90>] handle_IPI+0x278/0x288
[ 2134.133559] -(1)[1709:Thread-1365][<ffffffc00008145c>] gic_handle_irq+0x78/0x80
[ 2134.134469] -(1)[1709:Thread-1365]Exception stack(0xffffffc03766feb0 to 0xffffffc03766ffd0)





[ 2134.176268] -(0)[0:swapper/0]Call trace:
[ 2134.176760] -(0)[0:swapper/0][<ffffffc000087c28>] dump_backtrace+0x0/0x124
[ 2134.177616] -(0)[0:swapper/0][<ffffffc000087d5c>] show_stack+0x10/0x1c
[ 2134.178430] -(0)[0:swapper/0][<ffffffc00093ba40>] dump_stack+0x74/0xb8
[ 2134.179243] -(0)[0:swapper/0][<ffffffc00008fb90>] handle_IPI+0x278/0x288
[ 2134.180077] -(0)[0:swapper/0][<ffffffc00008145c>] gic_handle_irq+0x78/0x80
[ 2134.180934] -(0)[0:swapper/0]Exception stack(0xffffffc000cd7d60 to 0xffffffc000cd7e80)
[ 2134.192757] -(0)[0:swapper/0][<ffffffc000083de8>] el1_irq+0x68/0xdc
[ 2134.193539] -(0)[0:swapper/0][<ffffffc0006c6cd8>] cpuidle_enter+0x14/0x20
[ 2134.194385] -(0)[0:swapper/0][<ffffffc0000d21a0>] cpu_startup_entry+0x258/0x300
[ 2134.195295] -(0)[0:swapper/0][<ffffffc00093783c>] rest_init+0x80/0x8c
[ 2134.196100] -(0)[0:swapper/0][<ffffffc000c5f96c>] start_kernel+0x380/0x398

[ 2134.196954] -(2)[1068:d.process.media]machine_restart, arm_pm_restart(          (null))
[ 2134.197951] -(2)[1068:d.process.media]ARCH_RESET happen!!!========================================================================================================================================
[ 2134.198634] -(2)[1068:d.process.media]arch_reset: cmd = NULL
[ 2134.199341] -(2)[1068:d.process.media]CPU: 2 PID: 1068 Comm: d.process.media Tainted: G        W      3.18.22+ #1
[ 2134.200618] -(2)[1068:d.process.media]Hardware name: MT8163 (DT)
[ 2134.201367] -(2)[1068:d.process.media]Call trace:
[ 2134.201956] -(2)[1068:d.process.media][<ffffffc000087c28>] dump_backtrace+0x0/0x124
[ 2134.202909] -(2)[1068:d.process.media][<ffffffc000087d5c>] show_stack+0x10/0x1c
[ 2134.203820] -(2)[1068:d.process.media][<ffffffc00093ba40>] dump_stack+0x74/0xb8
[ 2134.204732] -(2)[1068:d.process.media][<ffffffc0006a83c4>] arch_reset+0xec/0x118
[ 2134.205653] -(2)[1068:d.process.media][<ffffffc0006a841c>] mtk_arch_reset_handle+0x2c/0x48
[ 2134.206684] -(2)[1068:d.process.media][<ffffffc0000b5874>] notifier_call_chain+0x68/0x11c
[ 2134.207703] -(2)[1068:d.process.media][<ffffffc0000b5ab4>] atomic_notifier_call_chain+0x30/0x48
[ 2134.208788] -(2)[1068:d.process.media][<ffffffc0000b6d5c>] do_kernel_restart+0x1c/0x28
[ 2134.209775] -(2)[1068:d.process.media][<ffffffc0000853fc>] machine_restart+0x5c/0x64
[ 2134.210740] -(2)[1068:d.process.media][<ffffffc0000b6c70>] emergency_restart+0x14/0x20
[ 2134.211728] -(2)[1068:d.process.media][<ffffffc00059a728>] ipanic_die+0x54/0xb8
[ 2134.212638] -(2)[1068:d.process.media][<ffffffc0000b5874>] notifier_call_chain+0x68/0x11c
[ 2134.213657] -(2)[1068:d.process.media][<ffffffc0000b5ab4>] atomic_notifier_call_chain+0x30/0x48
[ 2134.214742] -(2)[1068:d.process.media][<ffffffc0000b6100>] notify_die+0x30/0x3c
[ 2134.215653] -(2)[1068:d.process.media][<ffffffc000087dfc>] die+0x94/0x1a4
[ 2134.216500] -(2)[1068:d.process.media][<ffffffc00093a728>] __do_kernel_fault.part.5+0x70/0x84
[ 2134.217563] -(2)[1068:d.process.media][<ffffffc000091fd4>] do_page_fault+0x250/0x254
[ 2134.218528] -(2)[1068:d.process.media][<ffffffc000092094>] do_translation_fault+0xbc/0xf0
[ 2134.219547] -(2)[1068:d.process.media][<ffffffc000081290>] do_mem_abort+0x38/0x9c
[ 2134.220479] -(2)[1068:d.process.media]Exception stack(0xffffffc04c5eb810 to 0xffffffc04c5eb930)
[ 2134.234189] -(2)[1068:d.process.media][<ffffffc000083c50>] el1_da+0x14/0x80
[ 2134.235058] -(2)[1068:d.process.media][<ffffffc000941054>] __schedule+0x63c/0x70c
[ 2134.235990] -(2)[1068:d.process.media][<ffffffc000941148>] schedule+0x24/0x74
[ 2134.236880] -(2)[1068:d.process.media][<ffffffc0009411a8>] schedule_preempt_disabled+0x10/0x24
[ 2134.237954] -(2)[1068:d.process.media][<ffffffc0000d9474>] mutex_optimistic_spin+0x184/0x1c8
[ 2134.239005] -(2)[1068:d.process.media][<ffffffc000942910>] __mutex_lock_slowpath+0x38/0x164
[ 2134.240046] -(2)[1068:d.process.media][<ffffffc000942a80>] mutex_lock+0x44/0x64
[ 2134.240959] -(2)[1068:d.process.media][<ffffffc000321bd4>] pinctrl_get_device_gpio_range+0x4c/0xf8
[ 2134.242075] -(2)[1068:d.process.media][<ffffffc000321c9c>] pinctrl_gpio_direction+0x1c/0x8c
[ 2134.243117] -(2)[1068:d.process.media][<ffffffc000323120>] pinctrl_gpio_direction_input+0xc/0x18
[ 2134.244213] -(2)[1068:d.process.media][<ffffffc000327720>] mtk_gpio_direction_input+0x10/0x1c
[ 2134.245274] -(2)[1068:d.process.media][<ffffffc00032b780>] gpiod_direction_input+0x44/0x18c
[ 2134.246316] -(2)[1068:d.process.media][<ffffffc0005c279c>] i2c_transfer_byte.isra.2+0xa4/0x118
[ 2134.247389] -(2)[1068:d.process.media][<ffffffc0005c2c28>] i2c_write_buf.constprop.9+0x34/0x78
[ 2134.248463] -(2)[1068:d.process.media][<ffffffc0005c2ddc>] led_time_hrtimer_func+0x60/0x108
[ 2134.249505] -(2)[1068:d.process.media][<ffffffc0000f35a4>] __run_hrtimer+0x78/0x260
[ 2134.250459] -(2)[1068:d.process.media][<ffffffc0000f4318>] hrtimer_interrupt+0x108/0x2a0
[ 2134.251469] -(2)[1068:d.process.media][<ffffffc0006e1488>] arch_timer_handler_phys+0x28/0x38
[ 2134.252521] -(2)[1068:d.process.media][<ffffffc0000e72b8>] handle_percpu_devid_irq+0xa4/0x1a8
[ 2134.253583] -(2)[1068:d.process.media][<ffffffc0000e354c>] generic_handle_irq+0x30/0x4c
[ 2134.254580] -(2)[1068:d.process.media][<ffffffc0000e35ec>] __handle_domain_irq+0x84/0xf0
[ 2134.255589] -(2)[1068:d.process.media][<ffffffc000081418>] gic_handle_irq+0x34/0x80
[ 2134.256542] -(2)[1068:d.process.media]Exception stack(0xffffffc04c5ebeb0 to 0xffffffc04c5ebfd0)

[ 2133.910735] -(2)[1068:d.process.media]Internal error: Oops: 96000046 [#1] PREEMPT SMP

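PREEMPT SMP in this line only records that the kernel was built with preemption and SMP support; it is not itself a preemption error. The Oops is a data abort raised inside __schedule_bug (PC is at __schedule_bug+0x98, reached through el1_da -> do_mem_abort -> do_translation_fault), so it is a consequence of the scheduling problem rather than its cause.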
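The number after "Oops:" can be decoded to confirm this. The sketch below assumes that the value is the raw ARMv8 ESR_EL1 syndrome register (exception class in bits 31:26, IL in bit 25, ISS in bits 24:0), and it only names the translation-fault status codes; the function name `decode_esr` is hypothetical.

```python
# Exception classes relevant here (ARMv8 ESR_ELx.EC).
EC_NAMES = {
    0x24: "data abort from a lower exception level",
    0x25: "data abort from the current exception level",
}

# Data fault status codes for translation faults (ISS bits 5:0).
DFSC_NAMES = {
    0x04: "translation fault, level 0",
    0x05: "translation fault, level 1",
    0x06: "translation fault, level 2",
    0x07: "translation fault, level 3",
}


def decode_esr(esr):
    """Split an ESR value into the fields needed to read a data abort."""
    ec = (esr >> 26) & 0x3F          # exception class
    iss = esr & 0x1FFFFFF            # instruction specific syndrome
    info = {
        "ec": ec,
        "ec_name": EC_NAMES.get(ec, "other"),
        "il": (esr >> 25) & 1,
    }
    if ec in EC_NAMES:               # the fields below only apply to data aborts
        dfsc = iss & 0x3F
        info["write"] = bool(iss & (1 << 6))   # WnR: 1 means a write access
        info["dfsc"] = dfsc
        info["dfsc_name"] = DFSC_NAMES.get(dfsc, "other")
    return info


if __name__ == "__main__":
    print(decode_esr(0x96000046))
```

Under these assumptions, 0x96000046 decodes to a data abort taken from kernel mode, caused by a write, with a level 2 translation fault, which matches the do_translation_fault frame in the backtrace.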

BUG: scheduling while atomic: d.process.media/1068

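This is an explicit hint: the scheduler was invoked while the CPU was in atomic context, and Linux does not allow scheduling (sleeping) in atomic context.

Atomic context here means code that must keep the CPU until it finishes and therefore must not sleep, such as an interrupt handler or a section that runs with preemption disabled.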
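The trailing 0x00010001 in the BUG line is the preempt count at the time of the check, and it shows why the context was atomic. The sketch below decodes it under the assumption that the kernel uses the 3.18-era field layout (8 bits of preemption-disable depth, 8 bits of softirq count, 4 bits of hardirq count, 1 NMI bit); the function name `decode_preempt_count` is hypothetical.

```python
# Assumed preempt_count layout for a 3.18 kernel.
PREEMPT_BITS, SOFTIRQ_BITS, HARDIRQ_BITS, NMI_BITS = 8, 8, 4, 1

PREEMPT_SHIFT = 0
SOFTIRQ_SHIFT = PREEMPT_SHIFT + PREEMPT_BITS   # 8
HARDIRQ_SHIFT = SOFTIRQ_SHIFT + SOFTIRQ_BITS   # 16
NMI_SHIFT = HARDIRQ_SHIFT + HARDIRQ_BITS       # 20


def _field(value, shift, bits):
    """Extract a bit field of the given width starting at `shift`."""
    return (value >> shift) & ((1 << bits) - 1)


def decode_preempt_count(count):
    """Split a preempt_count value into its nesting counters."""
    fields = {
        "preempt": _field(count, PREEMPT_SHIFT, PREEMPT_BITS),
        "softirq": _field(count, SOFTIRQ_SHIFT, SOFTIRQ_BITS),
        "hardirq": _field(count, HARDIRQ_SHIFT, HARDIRQ_BITS),
        "nmi": _field(count, NMI_SHIFT, NMI_BITS),
    }
    # Any hardirq, softirq or NMI nesting means interrupt context.
    fields["in_interrupt"] = bool(
        fields["hardirq"] or fields["softirq"] or fields["nmi"])
    return fields


if __name__ == "__main__":
    print(decode_preempt_count(0x00010001))
```

With this layout the value means a hardirq nesting level of 1 and a preemption-disable depth of 1, i.e. the code was running inside a hardware interrupt handler when it tried to schedule.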

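Looking at the backtrace, the bottom of the stack is an IRQ handler (gic_handle_irq), yet further up the stack mutex_lock is called. The call comes from led_time_hrtimer_func, an hrtimer callback running in interrupt context, through i2c_write_buf, i2c_transfer_byte and gpiod_direction_input.

A mutex is a sleeping lock (mutex_optimistic_spin is only the mutex's short optimistic-spin phase, not a spinlock). When the lock is already held, the caller joins the lock's wait queue and goes to sleep: it changes from the running state to the sleeping state, gives up the CPU, and the scheduler picks another process to run.

The trace shows that mutex_lock ended up calling schedule, which is clearly a scheduling function, and that call finally reached __schedule_bug, i.e. the scheduler detected that it was being invoked from atomic context.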

[ 2133.870205] -(2)[1068:d.process.media][<ffffffc0000baec0>] __schedule_bug+0x5c/0xd8
[ 2133.871159] -(2)[1068:d.process.media][<ffffffc000941054>] __schedule+0x63c/0x70c
[ 2133.872092] -(2)[1068:d.process.media][<ffffffc000941148>] schedule+0x24/0x74
[ 2133.872981] -(2)[1068:d.process.media][<ffffffc0009411a8>] schedule_preempt_disabled+0x10/0x24
[ 2133.874055] -(2)[1068:d.process.media][<ffffffc0000d9474>] mutex_optimistic_spin+0x184/0x1c8
[ 2133.875106] -(2)[1068:d.process.media][<ffffffc000942910>] __mutex_lock_slowpath+0x38/0x164
[ 2133.876146] -(2)[1068:d.process.media][<ffffffc000942a80>] mutex_lock+0x44/0x64
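The same check can be automated for similar traces. The sketch below is a rough heuristic, not a general tool: it assumes frames are listed innermost first, as in the log above, and the two name sets (`SLEEPING` and `IRQ_ENTRY`) are small illustrative lists chosen for this trace, not a complete catalogue.

```python
import re

# Matches "[<address>] symbol+0xoff/0xsize" and captures the symbol.
FRAME_RE = re.compile(r"\[<[0-9a-f]+>\]\s+([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+")

# Illustrative assumptions: functions that may sleep, and IRQ entry points.
SLEEPING = {"mutex_lock", "schedule", "msleep"}
IRQ_ENTRY = {"gic_handle_irq", "hrtimer_interrupt"}

# A shortened copy of the call trace from last_kmsg.
TRACE = """
[ 2133.870205] -(2)[1068:d.process.media][<ffffffc0000baec0>] __schedule_bug+0x5c/0xd8
[ 2133.871159] -(2)[1068:d.process.media][<ffffffc000941054>] __schedule+0x63c/0x70c
[ 2133.872092] -(2)[1068:d.process.media][<ffffffc000941148>] schedule+0x24/0x74
[ 2133.876146] -(2)[1068:d.process.media][<ffffffc000942a80>] mutex_lock+0x44/0x64
[ 2133.881375] -(2)[1068:d.process.media][<ffffffc00032b780>] gpiod_direction_input+0x44/0x18c
[ 2133.882418] -(2)[1068:d.process.media][<ffffffc0005c279c>] i2c_transfer_byte.isra.2+0xa4/0x118
[ 2133.884565] -(2)[1068:d.process.media][<ffffffc0005c2ddc>] led_time_hrtimer_func+0x60/0x108
[ 2133.886561] -(2)[1068:d.process.media][<ffffffc0000f4318>] hrtimer_interrupt+0x108/0x2a0
[ 2133.891690] -(2)[1068:d.process.media][<ffffffc000081418>] gic_handle_irq+0x34/0x80
"""


def frames(trace_text):
    """Return the symbol names of all frames, innermost first."""
    return FRAME_RE.findall(trace_text)


def sleeps_in_irq(trace_text):
    """List possibly sleeping calls made inside an IRQ entry frame."""
    names = frames(trace_text)
    irq_positions = [i for i, n in enumerate(names) if n in IRQ_ENTRY]
    if not irq_positions:
        return []
    # Every frame listed before the outermost IRQ entry runs inside it.
    outermost_irq = max(irq_positions)
    return [n for n in names[:outermost_irq] if n in SLEEPING]


if __name__ == "__main__":
    print(sleeps_in_irq(TRACE))
```

For this trace the function reports `schedule` and `mutex_lock`, both called below `hrtimer_interrupt`, which is the same conclusion reached by reading the stack by hand.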

Reposted from blog.csdn.net/b1480521874/article/details/78835250