Locating the Faulty Source Line from a Linux Oops

2012-05-24 zhuqing_739
Category: LINUX

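For the past few days I have been debugging the serial port on an Atmel AT91SAM9X25. After running for a while it always ends in an Oops, with the following output: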

[ 1023.510000] Unable to handle kernel NULL pointer dereference at virtual address 00000000
[ 1023.520000] pgd = c0004000
[ 1023.520000] [00000000] *pgd=00000000
[ 1023.520000] Internal error: Oops: 17 [#1]
[ 1023.520000] last sysfs file: /sys/devices/virtual/misc/at91flash/dev
[ 1023.520000] Modules linked in: at91flash at91gpio at91mc323 ds18b20 at91adc
[ 1023.520000] CPU: 0    Tainted: G        W    (2.6.39 #35)
[ 1023.520000] PC is at atmel_tasklet_func+0x104/0x690
[ 1023.520000] LR is at atmel_tasklet_func+0x10/0x690

[ 1023.520000] pc : [<c01a33ac>]    lr : [<c01a32b8>]    psr: 20000013
[ 1023.520000] sp : c7825f58  ip : 60000093  fp : 00000000
[ 1023.520000] r10: 00000006  r9 : 00000000  r8 : 0000000a
[ 1023.520000] r7 : 00000000  r6 : c7824000  r5 : c78a2484  r4 : c03c0cb8
[ 1023.520000] r3 : 0000004c  r2 : 0000004c  r1 : 60000013  r0 : 00000001
[ 1023.520000] Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
[ 1023.520000] Control: 0005317f  Table: 27b40000  DAC: 00000017
[ 1023.520000] Process ksoftirqd/0 (pid: 3, stack limit = 0xc7824270)
[ 1023.520000] Stack: (0xc7825f58 to 0xc7826000)
[ 1023.520000] 5f40:                                                       00000001 c7824000
[ 1023.520000] 5f60: 00000100 0000000a 00000000 00000006 c7825f8c 00000000 00000001 c7824000
[ 1023.520000] 5f80: 00000100 0000000a 00000006 c0045cf8 c03b995c c00461d8 c7aa6ae0 00000000
[ 1023.520000] 5fa0: 60000093 00000000 c7824000 c0046274 00000013 00000000 00000000 c00462e0
[ 1023.520000] 5fc0: 00000000 c7819f70 00000000 c00570e0 00000000 00000000 00000000 00000000
[ 1023.520000] 5fe0: c7825fe0 c7825fe0 c7819f70 c0057060 c0030b14 c0030b14 ffffffff ffffffff
[ 1023.520000] [] (atmel_tasklet_func+0x104/0x690) from [] (tasklet_action+0x84/0xe8)
[ 1023.520000] [] (tasklet_action+0x84/0xe8) from [] (__do_softirq+0x88/0x124)
[ 1023.520000] [] (__do_softirq+0x88/0x124) from [] (run_ksoftirqd+0x6c/0x128)
[ 1023.520000] [] (run_ksoftirqd+0x6c/0x128) from [] (kthread+0x80/0x88)
[ 1023.520000] [] (kthread+0x80/0x88) from [] (kernel_thread_exit+0x0/0x8)
[ 1023.520000] Code: 1a000002 e59f057c e59f157c ebfa3d49 (e5973000) 
[ 1023.710000] ---[ end trace 786b41cd25d3b661 ]---
[ 1023.710000] Kernel panic - not syncing: Fatal exception in interrupt
[ 1023.720000] [] (unwind_backtrace+0x0/0xe0) from [] (panic+0x50/0x170)
[ 1023.720000] [] (panic+0x50/0x170) from [] (die+0x184/0x1c4)
[ 1023.730000] [] (die+0x184/0x1c4) from [] (__do_kernel_fault+0x64/0x84)
[ 1023.740000] [] (__do_kernel_fault+0x64/0x84) from [] (do_page_fault+0x1b4/0x1c8)
[ 1023.750000] [] (do_page_fault+0x1b4/0x1c8) from [] (do_DataAbort+0x30/0x98)
[ 1023.760000] [] (do_DataAbort+0x30/0x98) from [] (__dabt_svc+0x4c/0x60)
[ 1023.770000] Exception stack(0xc7825f10 to 0xc7825f58)
[ 1023.770000] 5f00:                                     00000001 60000013 0000004c 0000004c
[ 1023.780000] 5f20: c03c0cb8 c78a2484 c7824000 00000000 0000000a 00000000 00000006 00000000
[ 1023.790000] 5f40: 60000093 c7825f58 c01a32b8 c01a33ac 20000013 ffffffff
[ 1023.790000] [] (__dabt_svc+0x4c/0x60) from [] (atmel_tasklet_func+0x104/0x690)
[ 1023.800000] [] (atmel_tasklet_func+0x104/0x690) from [] (tasklet_action+0x84/0xe8)
[ 1023.810000] [] (tasklet_action+0x84/0xe8) from [] (__do_softirq+0x88/0x124)
[ 1023.820000] [] (__do_softirq+0x88/0x124) from [] (run_ksoftirqd+0x6c/0x128)
[ 1023.830000] [] (run_ksoftirqd+0x6c/0x128) from [] (kthread+0x80/0x88)
[ 1023.840000] [] (kthread+0x80/0x88) from [] (kernel_thread_exit+0x0/0x8)

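Note the parts that were highlighted in red in the original post, in particular the "PC is at" and "LR is at" lines near the top of the dump.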
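Both of those lines have the form symbol+offset/size: the function name, the byte offset into it, and the total size of the function. As a minimal sketch, assuming the Oops text has been saved into a string, these fields can be pulled out with a regular expression. The helper name parse_location is made up for this illustration.

```python
import re

# Matches lines such as "PC is at atmel_tasklet_func+0x104/0x690".
LOCATION_RE = re.compile(
    r"\b(PC|LR) is at (?P<symbol>[\w.]+)\+(?P<offset>0x[0-9a-f]+)/(?P<size>0x[0-9a-f]+)"
)


def parse_location(oops_text):
    """Return {"PC": (symbol, offset, size), "LR": (...)} for the lines found."""
    found = {}
    for match in LOCATION_RE.finditer(oops_text):
        reg = match.group(1)
        found[reg] = (
            match.group("symbol"),
            int(match.group("offset"), 16),
            int(match.group("size"), 16),
        )
    return found


sample = """\
[ 1023.520000] PC is at atmel_tasklet_func+0x104/0x690
[ 1023.520000] LR is at atmel_tasklet_func+0x10/0x690
"""

locations = parse_location(sample)
for reg, (symbol, offset, size) in locations.items():
    print(f"{reg}: {symbol}+{offset:#x} (function is {size:#x} bytes long)")
```

The symbol and offset recovered here are exactly what gdb needs later to map the fault back to a source line.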

The following shows how to track down the line of code that caused the fault:

1. First, enable the "Compile the kernel with debug info" option when building the kernel. The steps are as follows:

make ARCH=arm CROSS_COMPILE=arm-none-linux-gnueabi- menuconfig

Go to Kernel hacking

Select Compile the kernel with debug info

Then save and exit.

Then run make ARCH=arm CROSS_COMPILE=arm-none-linux-gnueabi-

to build the kernel, and wait for the build to finish.
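Before moving on, it can be worth confirming that the option really ended up in .config; in kernels of this era the menu entry corresponds to CONFIG_DEBUG_INFO. A small sketch, assuming the contents of .config have been read into a string (the helper name has_debug_info is hypothetical):

```python
def has_debug_info(config_text):
    """Return True if the kernel config enables CONFIG_DEBUG_INFO."""
    for line in config_text.splitlines():
        # Disabled options appear as "# CONFIG_DEBUG_INFO is not set".
        if line.strip() == "CONFIG_DEBUG_INFO=y":
            return True
    return False


enabled = "CONFIG_ARM=y\nCONFIG_DEBUG_INFO=y\n"
disabled = "CONFIG_ARM=y\n# CONFIG_DEBUG_INFO is not set\n"
print(has_debug_info(enabled), has_debug_info(disabled))
```

In a real kernel tree this would be called as has_debug_info(open(".config").read()).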

2. Load the kernel image into arm-none-linux-gnueabi-gdb, as follows:

arm-none-linux-gnueabi-gdb vmlinux


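This corresponds to the following line in the Oops message:

[ 1023.520000] LR is at atmel_tasklet_func+0x10/0x690

At the gdb prompt, enter: l *atmel_tasklet_func+0x10 (the "l" here is the lowercase letter "L", short for list, not the digit "1"; the two look alike in some fonts). Note that LR normally holds only the return address of the most recent call. The instruction that actually faulted is the one on the "PC is at" line, so l *atmel_tasklet_func+0x104 is the more direct lookup.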
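The same lookup can also be run without an interactive session. The sketch below only builds the command lines; it assumes the toolchain prefix used in this article and a vmlinux built with debug info in the current directory. The -batch and -ex options of gdb and the -e and -f options of addr2line are standard, while the helper names are hypothetical. The absolute address 0xc01a33ac is the saved PC that appears in the "Exception stack" dump of the Oops.

```python
import shlex

CROSS = "arm-none-linux-gnueabi-"  # toolchain prefix used in this article


def gdb_list_command(symbol, offset, vmlinux="vmlinux"):
    """Build a non-interactive gdb invocation that lists the source at symbol+offset."""
    return [CROSS + "gdb", "-batch", "-ex", f"list *({symbol}+{offset:#x})", vmlinux]


def addr2line_command(address, vmlinux="vmlinux"):
    """Build an addr2line invocation for an absolute kernel address."""
    return [CROSS + "addr2line", "-e", vmlinux, "-f", f"{address:#x}"]


pc_cmd = gdb_list_command("atmel_tasklet_func", 0x104)  # faulting instruction
lr_cmd = gdb_list_command("atmel_tasklet_func", 0x10)   # return address in LR
a2l_cmd = addr2line_command(0xC01A33AC)

for cmd in (pc_cmd, lr_cmd, a2l_cmd):
    # shlex.join quotes the "list *(...)" argument so it can be pasted into a shell.
    print(shlex.join(cmd))
```

Each list can be passed to subprocess.run as-is, or the printed line can be pasted into a shell in the kernel build directory.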


This pinpoints the faulty line of code. I have to say I am unimpressed that the kernel Atmel provides still has a bug like this!
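A quick cross-check on the reported line is the "Code:" line of the Oops, where the word in parentheses is the instruction that faulted. The sketch below decodes only the ARM load/store form with an immediate offset, pre-indexed and without writeback, which is enough for e5973000. It is not a general disassembler, and decode_ldr_str_imm is a hypothetical helper.

```python
def decode_ldr_str_imm(word):
    """Decode an unconditional ARM LDR/STR with immediate offset (P=1, W=0)."""
    if word >> 28 != 0xE:
        return None  # only unconditional (AL) instructions are handled
    if (word >> 26) & 0b11 != 0b01 or (word >> 25) & 1:
        return None  # not a single data transfer with an immediate offset
    pre_indexed = (word >> 24) & 1
    add_offset = (word >> 23) & 1
    byte_access = (word >> 22) & 1
    writeback = (word >> 21) & 1
    is_load = (word >> 20) & 1
    if not pre_indexed or writeback:
        return None  # other addressing modes are out of scope for this sketch
    rn = (word >> 16) & 0xF  # base register
    rd = (word >> 12) & 0xF  # destination (or source) register
    imm = word & 0xFFF       # 12-bit immediate offset
    mnemonic = ("ldr" if is_load else "str") + ("b" if byte_access else "")
    sign = "" if add_offset else "-"
    return f"{mnemonic} r{rd}, [r{rn}, #{sign}{imm}]"


faulting = decode_ldr_str_imm(0xE5973000)
print(faulting)
```

The result is a load through r7. The register dump shows r7 : 00000000, so the load reads address 0, which matches the "NULL pointer dereference at virtual address 00000000" message.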

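The source line that gdb reports shows that the Oops is caused by the serial port's DMA handling, so I disabled DMA transfers for the serial port, as follows: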


So far, the Oops above has not reappeared since DMA was disabled.

