data abort when run 'dhcp'

qianfan qianfanguijin at 163.com
Wed Mar 23 11:07:24 CET 2022


On 2022/3/23 17:51, Heinrich Schuchardt wrote:
> On 3/23/22 10:13, qianfan wrote:
>>
>> On 2022/3/23 16:02, qianfan wrote:
>>>
>>>
>>> On 2022/3/23 15:45, qianfan wrote:
>>>>
>>>>
>>>> On 2022/3/23 10:28, qianfan wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> I have a custom AM335X board connected to my computer via usbnet. It
>>>>> always reports a data abort when I run 'dhcp'.
>>>>>
>>>>> Here is the log:
>>>>>
>>>>> U-Boot 2022.01-rc1-00183-gfa5b4e2d19-dirty (Feb 25 2022 - 15:45:02
>>>>> +0800)
>>>>>
>>>>> CPU  : AM335X-GP rev 2.1
>>>>> Model: WISDOM AM335X CCT
>>>>> DRAM:  512 MiB
>>>>> NAND:  256 MiB
>>>>> MMC:   OMAP SD/MMC: 0
>>>>> Loading Environment from NAND... *** Warning - bad CRC, using
>>>>> default environment
>>>>>
>>>>> Net:   Could not get PHY for ethernet at 4a100000: addr 0
>>>>> eth2: ethernet at 4a100000, eth3: usb_ether
>>>>> Hit any key to stop autoboot:  0
>>>>> => setenv autoload no
>>>>> => dhcp
>>>>> using musb-hdrc, OUT ep1out IN ep1in STATUS ep2in
>>>>> MAC de:ad:be:ef:00:01
>>>>> HOST MAC de:ad:be:ef:00:00
>>>>> RNDIS ready
>>>>> musb-hdrc: peripheral reset irq lost!
>>>>> high speed config #2: 2 mA, Ethernet Gadget, using RNDIS
>>>>> USB RNDIS network up!
>>>>> BOOTP broadcast 1
>>>>> BOOTP broadcast 2
>>>>> BOOTP broadcast 3
>>>>> DHCP client bound to address 192.168.200.4 (757 ms)
>>>>> data abort
>>>>> pc : [<9fe9b0a2>]          lr : [<9febbc3f>]
>>>>> reloc pc : [<808130a2>]    lr : [<80833c3f>]
>>>>> sp : 9de53410  ip : 9de53578     fp : 00000001
>>>>> r10: 9de5345c  r9 : 9de67e80     r8 : 9febbae5
>>>>> r7 : 9de72c30  r6 : 9feec710     r5 : 0000000d  r4 : 00000018
>>>>> r3 : 3fdd8e04  r2 : 00000002     r1 : 9feec728  r0 : 9feec700
>>>>> Flags: Nzcv  IRQs off  FIQs on  Mode SVC_32 (T)
>>>>> Code: f023 0303 60ca 4403 (6091) 685a
>>>>> Resetting CPU ...
>>>>>
>>>>> resetting ...
>>>>>
>>>>>
>>>>> Is there any documentation on how to debug a data abort? Or has
>>>>> this bug already been fixed?
>>>>>
>>>>> Thanks
>>>>>
>>>> This bug is not fixed on master. I found that v2021.01 is good and
>>>> v2021.04-rc2 is bad.
>>>>
>>>> I also tested this on a BeagleBone Black with am335x_evm_defconfig, and
>>>> it has a similar problem.
>>>>
>>>> I looked for the first bad commit with 'git bisect', which reported that
>>>> commit e97eb638de0dc8f6e989e20eaeb0342f103cb917 broke it. This is very
>>>> strange, because that commit doesn't touch any dhcp or network code.
>>>>
>>>> ➜  u-boot-main git:(e97eb638de) ✗ git bisect bug
>>>> e97eb638de0dc8f6e989e20eaeb0342f103cb917 is the first bug commit
>>>> commit e97eb638de0dc8f6e989e20eaeb0342f103cb917
>>>> Author: Heinrich Schuchardt <xypron.glpk at gmx.de>
>>>> Date:   Wed Jan 20 22:21:53 2021 +0100
>>>>
>>>>     fs: fat: consistent error handling for flush_dir()
>>>>
>>>>     Provide function description for flush_dir().
>>>>     Move all error messages for flush_dir() from the callers to the
>>>> function.
>>>>     Move mapping of errors to -EIO to the function.
>>>>     Always check return value of flush_dir() (Coverity CID 316362).
>>>>
>>>>     In fat_unlink() return -EIO if flush_dirty_fat_buffer() fails.
>>>>
>>>>     Signed-off-by: Heinrich Schuchardt <xypron.glpk at gmx.de>
>>>>
>>>> :040000 040000 2281a449f2d134078d7faa1ee735a367b55aad7e
>>>> 77d188b1c99181fd71f2167fdeee3434a09db209 M      fs
>>>>
>>>>
>>>> 184aa6504143b452132e28cd3ebecc7b941cdfa1 is the commit immediately
>>>> before e97eb638de0dc8f6e989e20eaeb0342f103cb917:
>>>>
>>>> * e97eb638de0dc8f6e989e20eaeb0342f103cb917 fs: fat: consistent error
>>>> handling for flush_dir()
>>>> *   184aa6504143b452132e28cd3ebecc7b941cdfa1 Merge tag
>>>> 'u-boot-rockchip-20210121' of
>>>> https://gitlab.denx.de/u-boot/custodians/u-boot-rockchip
>>>> |\
>>>> | * 9ddc0787bd660214366e386ce689dd78299ac9d0 pci: Add Rockchip dwc
>>>> based PCIe controller driver
>>>>
>>>> I checked that 184aa6504143b452132e28cd3ebecc7b941cdfa1 works fine.
>>>>
>>>> U-Boot 2021.01-00688-g184aa65041-dirty (Mar 23 2022 - 15:07:56 +0800)
>>>>
>>>> CPU  : AM335X-GP rev 2.1
>>>> Model: TI AM335x BeagleBone Black
>>>> DRAM:  512 MiB
>>>> WDT:   Started with servicing (60s timeout)
>>>> NAND:  0 MiB
>>>> MMC:   OMAP SD/MMC: 0, OMAP SD/MMC: 1
>>>> Loading Environment from FAT... <ethaddr> not set. Validating first
>>>> E-fuse MAC
>>>> Net:   eth2: ethernet at 4a100000, eth3: usb_ether
>>>> Hit any key to stop autoboot:  0
>>>> => dhcp
>>>> ethernet at 4a100000 Waiting for PHY auto negotiation to
>>>> complete......... TIMEOUT !
>>>> using musb-hdrc, OUT ep1out IN ep1in STATUS ep2in
>>>> MAC de:ad:be:ef:00:01
>>>> HOST MAC de:ad:be:ef:00:00
>>>> RNDIS ready
>>>> musb-hdrc: peripheral reset irq lost!
>>>> high speed config #2: 2 mA, Ethernet Gadget, using RNDIS
>>>> USB RNDIS network up!
>>>> BOOTP broadcast 1
>>>> BOOTP broadcast 2
>>>> BOOTP broadcast 3
>>>> DHCP client bound to address 192.168.200.157 (757 ms)
>>>> Using usb_ether device
>>>> TFTP from server 192.168.200.1; our IP address is 192.168.200.157
>>>> Filename 'u-boot.img'.
>>>> Load address: 0x82000000
>>>> Loading:
>>>> #################################################################
>>>> #################################################################
>>>> #################################################################
>>>>          #########################
>>>>          2.5 MiB/s
>>>> done
>>>> Bytes transferred = 1123888 (112630 hex)
>>>> =>
>>>>
>> "data abort" messages:
>>
>> data abort
>> pc : [<9ff8196c>]          lr : [<9ffa1cd7>]
>> reloc pc : [<8081496c>]    lr : [<80834cd7>]
>> sp : 9df38e60  ip : 9df38fc8     fp : 00000001
>> r10: 9df38eac  r9 : 9df4ceb0     r8 : 9ffa1b7d
>> r7 : 9df52fd0  r6 : 9ffdbba8     r5 : 0000000d  r4 : 00000018
>> r3 : 3ff589e0  r2 : 9ffafa11     r1 : 9ffdbbc0  r0 : 9ffdbb00
>> Flags: Nzcv  IRQs off  FIQs on  Mode SVC_32 (T)
>> Code: 0303 60ca 4403 6091 (685a) f042
>> Resetting CPU ...
>>
>> objdump of u-boot: pc is in malloc and lr is in env_attr_walk
>>
>>        unlink(victim, bck, fwd);
>> 80814966:    60ca          str    r2, [r1, #12]
>>        set_inuse_bit_at_offset(victim, victim_size);
>> 80814968:    4403          add    r3, r0
>>        unlink(victim, bck, fwd);
>> 8081496a:    6091          str    r1, [r2, #8]
>>      set_inuse_bit_at_offset(victim, victim_size);
>> 8081496c:    685a          ldr    r2, [r3, #4]
>> 8081496e:    f042 0201     orr.w    r2, r2, #1
>> 80814972:    605a          str    r2, [r3, #4]
>>
>> r3 is 3ff589e0, which is not a valid RAM address on am335x.
>>
>>
>
> I have seen crashes in common/dlmalloc.c before after double free() or
> free() with an incorrect pointer.
>
> The assert() statements in do_check_inuse_chunk() are meant to catch
> this, but assert() as defined in include/log.h does not stop execution
> and does not even print without _DEBUG=1.
>
> You should be able to get the assert output with
>
> #include <common.h>
> #define _DEBUG 1
> #include <log.h>
>
> at the top of common/dlmalloc.c.
>
> You should get full malloc debug output with

Hi, I tried adding the DEBUG macro before <log.h>, but no additional malloc messages were printed.

>
> #define DEBUG 1
> #include <common.h>
> #include <log.h>
>
> Best regards
>
> Heinrich


