How to debug u-boot data abort

Heinrich Schuchardt xypron.glpk at gmx.de
Wed Mar 23 09:27:08 CET 2022


On 3/23/22 08:45, qianfan wrote:
>
> On 2022/3/23 10:28, qianfan wrote:
>>
>> Hi:
>>
>> I have a custom AM335X board connected to my computer via usbnet. It
>> always reports a data abort when running 'dhcp'.
>>
>> Here is the log:
>>
>> U-Boot 2022.01-rc1-00183-gfa5b4e2d19-dirty (Feb 25 2022 - 15:45:02 +0800)
>>
>> CPU  : AM335X-GP rev 2.1
>> Model: WISDOM AM335X CCT
>> DRAM:  512 MiB
>> NAND:  256 MiB
>> MMC:   OMAP SD/MMC: 0
>> Loading Environment from NAND... *** Warning - bad CRC, using default environment
>>
>> Net:   Could not get PHY for ethernet at 4a100000: addr 0
>> eth2: ethernet at 4a100000, eth3: usb_ether
>> Hit any key to stop autoboot:  0
>> => setenv autoload no
>> => dhcp
>> using musb-hdrc, OUT ep1out IN ep1in STATUS ep2in
>> MAC de:ad:be:ef:00:01
>> HOST MAC de:ad:be:ef:00:00
>> RNDIS ready
>> musb-hdrc: peripheral reset irq lost!
>> high speed config #2: 2 mA, Ethernet Gadget, using RNDIS
>> USB RNDIS network up!
>> BOOTP broadcast 1
>> BOOTP broadcast 2
>> BOOTP broadcast 3
>> DHCP client bound to address 192.168.200.4 (757 ms)
>> data abort

This could be an alignment error.

>> pc : [<9fe9b0a2>]          lr : [<9febbc3f>]
>> reloc pc : [<808130a2>]    lr : [<80833c3f>]

You can use these addresses together with the u-boot.map file to figure
out in which function the abort occurs and from where it was called.

Use 'arm-linux-gnueabihf-objdump -S -D' to find the exact code positions.
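
As a rough illustration of the lookup, here is a minimal Python sketch.
It assumes the build also produced a System.map with 'address type name'
lines, which is easier to parse than u-boot.map. The three symbols in
SAMPLE_MAP are hypothetical placeholders, not the real functions involved
in this crash. The 'reloc pc' and 'reloc lr' values are the addresses
before relocation, so those are the ones to look up.

```python
import bisect

def reloc_offset(runtime_addr, link_addr):
    """Relocation offset: runtime address minus link-time address."""
    return runtime_addr - link_addr

def parse_system_map(lines):
    """Parse 'address type name' lines into a sorted list of (address, name)."""
    symbols = []
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip anything that is not a plain symbol line
        addr, _sym_type, name = parts
        symbols.append((int(addr, 16), name))
    symbols.sort()
    return symbols

def lookup(symbols, addr):
    """Return (name, offset) for the last symbol at or below addr, else None."""
    addr &= ~1  # lr has bit 0 set when the caller runs in Thumb state
    addrs = [a for a, _ in symbols]
    i = bisect.bisect_right(addrs, addr) - 1
    if i < 0:
        return None
    base, name = symbols[i]
    return name, addr - base

# Hypothetical symbol table, for illustration only.
SAMPLE_MAP = """\
80813000 T hypothetical_callee
80813100 T hypothetical_other
80833b00 T hypothetical_caller
""".splitlines()

if __name__ == "__main__":
    # Both register pairs from the dump give the same offset.
    print(hex(reloc_offset(0x9FE9B0A2, 0x808130A2)))
    print(hex(reloc_offset(0x9FEBBC3F, 0x80833C3F)))
    syms = parse_system_map(SAMPLE_MAP)
    print(lookup(syms, 0x808130A2))  # reloc pc
    print(lookup(syms, 0x80833C3F))  # reloc lr
```

With the real System.map in place of SAMPLE_MAP, the first lookup names
the function containing the faulting instruction and the second names
its caller.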

>> sp : 9de53410  ip : 9de53578     fp : 00000001
>> r10: 9de5345c  r9 : 9de67e80     r8 : 9febbae5
>> r7 : 9de72c30  r6 : 9feec710     r5 : 0000000d  r4 : 00000018
>> r3 : 3fdd8e04  r2 : 00000002     r1 : 9feec728  r0 : 9feec700
>> Flags: Nzcv  IRQs off  FIQs on  Mode SVC_32 (T)
>> Code: f023 0303 60ca 4403 (6091) 685a

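This is how to find the exact instruction causing the problem. You should
also set CROSS_COMPILE to your ARM toolchain prefix; the listing below was
produced without it, so the host objdump decoded the bytes as x86: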

$ echo 'Code: f023 0303 60ca 4403 (6091) 685a' | \
 >       ARCH=arm scripts/decodecode
Code: f023 0303 60ca 4403 (6091) 685a
All code
========
    0:   23 f0                   and    %eax,%esi
    2:   03 03                   add    (%rbx),%eax
    4:   ca 60 03                lret   $0x360
    7:*  44 91                   rex.R xchg %eax,%ecx            <-- trapping instruction
    9:   60                      (bad)
    a:   5a                      pop    %rdx
    b:   68                      .byte 0x68

Code starting with the faulting instruction
===========================================
    0:   91                      xchg   %eax,%ecx
    1:   60                      (bad)
    2:   5a                      pop    %rdx
    3:   68                      .byte 0x68
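
The mnemonics in the listing are x86, so as a cross-check the trapping
halfword can be decoded by hand. The following Python sketch assumes the
values in the 'Code:' line are 16-bit Thumb instructions, as the '(T)' in
the flags line suggests, and it only handles the LDR/STR (immediate)
encoding T1. The function name is my own and not part of any tool.

```python
def decode_thumb_ldr_str_imm(halfword):
    """Decode a 16-bit Thumb LDR/STR (immediate, encoding T1) halfword.

    Layout: 0 1 1 0 L imm5 Rn Rt, with a byte offset of imm5 * 4.
    Returns (mnemonic, rt, rn, offset), or None for any other encoding.
    """
    if (halfword >> 12) != 0b0110:
        return None
    mnemonic = "ldr" if halfword & (1 << 11) else "str"
    imm5 = (halfword >> 6) & 0x1F
    rn = (halfword >> 3) & 0x7
    rt = halfword & 0x7
    return mnemonic, rt, rn, imm5 * 4

# Low registers copied from the register dump in the report.
REGS = {0: 0x9FEEC700, 1: 0x9FEEC728, 2: 0x00000002, 3: 0x3FDD8E04}

if __name__ == "__main__":
    mnemonic, rt, rn, offset = decode_thumb_ldr_str_imm(0x6091)
    target = REGS[rn] + offset
    print(f"{mnemonic} r{rt}, [r{rn}, #{offset}]")
    print(f"target address {target:#x}, word aligned: {target % 4 == 0}")
```

Under these assumptions, (6091) is 'str r1, [r2, #8]'. With r2 = 00000002
the store goes to address 0xa, which is not word aligned and nowhere near
the DRAM addresses held in the other registers. That fits the alignment
guess, and it suggests that r2 held a corrupted pointer.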

I hope this helps to figure out where exactly the problem occurs.

Best regards

Heinrich

>> Resetting CPU ...
>>
>> resetting ...
>>
>>
>> Is there any documentation on how to debug a data abort? Or has this
>> bug already been fixed?
>>
>> Thanks
>>
> This bug is not fixed on master. I found that v2021.01 is good and
> v2021.04-rc2 is bad.
>
> I also tested this on a BeagleBone Black with am335x_evm_defconfig; it
> has a similar problem.
>
> I searched for the first bad commit with 'git bisect'. It reported that
> commit e97eb638de0dc8f6e989e20eaeb0342f103cb917 broke it, which is very
> strange because this commit doesn't touch any DHCP or network code.
>
> ➜  u-boot-main git:(e97eb638de) ✗ git bisect bug
> e97eb638de0dc8f6e989e20eaeb0342f103cb917 is the first bug commit
> commit e97eb638de0dc8f6e989e20eaeb0342f103cb917
> Author: Heinrich Schuchardt <xypron.glpk at gmx.de>
> Date:   Wed Jan 20 22:21:53 2021 +0100
>
>      fs: fat: consistent error handling for flush_dir()
>
>      Provide function description for flush_dir().
>      Move all error messages for flush_dir() from the callers to the
> function.
>      Move mapping of errors to -EIO to the function.
>      Always check return value of flush_dir() (Coverity CID 316362).
>
>      In fat_unlink() return -EIO if flush_dirty_fat_buffer() fails.
>
>      Signed-off-by: Heinrich Schuchardt <xypron.glpk at gmx.de>
>
> :040000 040000 2281a449f2d134078d7faa1ee735a367b55aad7e
> 77d188b1c99181fd71f2167fdeee3434a09db209 M      fs
>
>
> 184aa6504143b452132e28cd3ebecc7b941cdfa1 is the commit immediately before
> e97eb638de0dc8f6e989e20eaeb0342f103cb917:
>
> * e97eb638de0dc8f6e989e20eaeb0342f103cb917 fs: fat: consistent error
> handling for flush_dir()
> *   184aa6504143b452132e28cd3ebecc7b941cdfa1 Merge tag
> 'u-boot-rockchip-20210121' of
> https://gitlab.denx.de/u-boot/custodians/u-boot-rockchip
> |\
> | * 9ddc0787bd660214366e386ce689dd78299ac9d0 pci: Add Rockchip dwc based
> PCIe controller driver
>
> I verified that 184aa6504143b452132e28cd3ebecc7b941cdfa1 works fine:
>
> U-Boot 2021.01-00688-g184aa65041-dirty (Mar 23 2022 - 15:07:56 +0800)
>
> CPU  : AM335X-GP rev 2.1
> Model: TI AM335x BeagleBone Black
> DRAM:  512 MiB
> WDT:   Started with servicing (60s timeout)
> NAND:  0 MiB
> MMC:   OMAP SD/MMC: 0, OMAP SD/MMC: 1
> Loading Environment from FAT... <ethaddr> not set. Validating first
> E-fuse MAC
> Net:   eth2: ethernet at 4a100000, eth3: usb_ether
> Hit any key to stop autoboot:  0
> => dhcp
> ethernet at 4a100000 Waiting for PHY auto negotiation to complete.........
> TIMEOUT !
> using musb-hdrc, OUT ep1out IN ep1in STATUS ep2in
> MAC de:ad:be:ef:00:01
> HOST MAC de:ad:be:ef:00:00
> RNDIS ready
> musb-hdrc: peripheral reset irq lost!
> high speed config #2: 2 mA, Ethernet Gadget, using RNDIS
> USB RNDIS network up!
> BOOTP broadcast 1
> BOOTP broadcast 2
> BOOTP broadcast 3
> DHCP client bound to address 192.168.200.157 (757 ms)
> Using usb_ether device
> TFTP from server 192.168.200.1; our IP address is 192.168.200.157
> Filename 'u-boot.img'.
> Load address: 0x82000000
> Loading: #################################################################
> #################################################################
> #################################################################
>           #########################
>           2.5 MiB/s
> done
> Bytes transferred = 1123888 (112630 hex)
> =>
>


