data abort when run 'dhcp'
Heinrich Schuchardt
xypron.glpk at gmx.de
Wed Mar 23 11:12:16 CET 2022
On 3/23/22 11:07, qianfan wrote:
>
> On 2022/3/23 17:51, Heinrich Schuchardt wrote:
>> On 3/23/22 10:13, qianfan wrote:
>>>
>>> On 2022/3/23 16:02, qianfan wrote:
>>>>
>>>>
>>>> On 2022/3/23 15:45, qianfan wrote:
>>>>>
>>>>>
>>>>> On 2022/3/23 10:28, qianfan wrote:
>>>>>>
>>>>>> Hi:
>>>>>>
>>>>>> I have a custom AM335X board connected to my computer via usbnet. It
>>>>>> always reports a data abort when I run 'dhcp'.
>>>>>>
>>>>>> Here is the log:
>>>>>>
>>>>>> U-Boot 2022.01-rc1-00183-gfa5b4e2d19-dirty (Feb 25 2022 - 15:45:02
>>>>>> +0800)
>>>>>>
>>>>>> CPU : AM335X-GP rev 2.1
>>>>>> Model: WISDOM AM335X CCT
>>>>>> DRAM: 512 MiB
>>>>>> NAND: 256 MiB
>>>>>> MMC: OMAP SD/MMC: 0
>>>>>> Loading Environment from NAND... *** Warning - bad CRC, using
>>>>>> default environment
>>>>>>
>>>>>> Net: Could not get PHY for ethernet at 4a100000: addr 0
>>>>>> eth2: ethernet at 4a100000, eth3: usb_ether
>>>>>> Hit any key to stop autoboot: 0
>>>>>> => setenv autoload no
>>>>>> => dhcp
>>>>>> using musb-hdrc, OUT ep1out IN ep1in STATUS ep2in
>>>>>> MAC de:ad:be:ef:00:01
>>>>>> HOST MAC de:ad:be:ef:00:00
>>>>>> RNDIS ready
>>>>>> musb-hdrc: peripheral reset irq lost!
>>>>>> high speed config #2: 2 mA, Ethernet Gadget, using RNDIS
>>>>>> USB RNDIS network up!
>>>>>> BOOTP broadcast 1
>>>>>> BOOTP broadcast 2
>>>>>> BOOTP broadcast 3
>>>>>> DHCP client bound to address 192.168.200.4 (757 ms)
>>>>>> data abort
>>>>>> pc : [<9fe9b0a2>] lr : [<9febbc3f>]
>>>>>> reloc pc : [<808130a2>] lr : [<80833c3f>]
>>>>>> sp : 9de53410 ip : 9de53578 fp : 00000001
>>>>>> r10: 9de5345c r9 : 9de67e80 r8 : 9febbae5
>>>>>> r7 : 9de72c30 r6 : 9feec710 r5 : 0000000d r4 : 00000018
>>>>>> r3 : 3fdd8e04 r2 : 00000002 r1 : 9feec728 r0 : 9feec700
>>>>>> Flags: Nzcv IRQs off FIQs on Mode SVC_32 (T)
>>>>>> Code: f023 0303 60ca 4403 (6091) 685a
>>>>>> Resetting CPU ...
>>>>>>
>>>>>> resetting ...
>>>>>>
>>>>>>
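The dump above prints each code address twice: the runtime value (pc, lr) and the link-time value (reloc pc, lr). The following is a minimal sketch of that arithmetic, assuming a single constant relocation offset, which both pairs in the log are consistent with. The function names are hypothetical and not part of U-Boot.

```c
#include <stdint.h>

/* Offset between the address U-Boot runs at after relocation and the
 * address it was linked at, derived from one "pc" / "reloc pc" pair. */
uint32_t reloc_offset(uint32_t runtime_pc, uint32_t reloc_pc)
{
	return runtime_pc - reloc_pc;
}

/* Map another runtime code address (lr, or a register that holds a code
 * pointer) back to the link-time address that objdump works with. */
uint32_t to_link_addr(uint32_t runtime_addr, uint32_t offset)
{
	return runtime_addr - offset;
}
```

For this dump, reloc_offset(0x9fe9b0a2, 0x808130a2) gives 0x1f688000, and to_link_addr(0x9febbc3f, 0x1f688000) reproduces the 0x80833c3f printed as the relocated lr. The same subtraction can be applied to other registers that appear to hold code pointers, such as r8.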
>>>>>> Is there any documentation on how to debug a data abort? Or has this
>>>>>> bug already been fixed?
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>> This bug is not fixed in the master branch. I found that v2021.01 is
>>>>> good and v2021.04-rc2 is bad.
>>>>>
>>>>> I also tested this on a BeagleBone Black with am335x_evm_defconfig,
>>>>> and it has a similar problem.
>>>>>
>>>>> I looked for the first bad commit with 'git bisect', which reported that
>>>>> commit e97eb638de0dc8f6e989e20eaeb0342f103cb917 broke it. This is very
>>>>> strange, because that commit does not touch any dhcp or network code.
>>>>>
>>>>> ➜ u-boot-main git:(e97eb638de) ✗ git bisect bug
>>>>> e97eb638de0dc8f6e989e20eaeb0342f103cb917 is the first bug commit
>>>>> commit e97eb638de0dc8f6e989e20eaeb0342f103cb917
>>>>> Author: Heinrich Schuchardt <xypron.glpk at gmx.de>
>>>>> Date: Wed Jan 20 22:21:53 2021 +0100
>>>>>
>>>>> fs: fat: consistent error handling for flush_dir()
>>>>>
>>>>> Provide function description for flush_dir().
>>>>> Move all error messages for flush_dir() from the callers to the
>>>>> function.
>>>>> Move mapping of errors to -EIO to the function.
>>>>> Always check return value of flush_dir() (Coverity CID 316362).
>>>>>
>>>>> In fat_unlink() return -EIO if flush_dirty_fat_buffer() fails.
>>>>>
>>>>> Signed-off-by: Heinrich Schuchardt <xypron.glpk at gmx.de>
>>>>>
>>>>> :040000 040000 2281a449f2d134078d7faa1ee735a367b55aad7e
>>>>> 77d188b1c99181fd71f2167fdeee3434a09db209 M fs
>>>>>
>>>>>
>>>>> 184aa6504143b452132e28cd3ebecc7b941cdfa1 is the commit immediately before
>>>>> e97eb638de0dc8f6e989e20eaeb0342f103cb917:
>>>>>
>>>>> * e97eb638de0dc8f6e989e20eaeb0342f103cb917 fs: fat: consistent error
>>>>> handling for flush_dir()
>>>>> * 184aa6504143b452132e28cd3ebecc7b941cdfa1 Merge tag
>>>>> 'u-boot-rockchip-20210121' of
>>>>> https://gitlab.denx.de/u-boot/custodians/u-boot-rockchip
>>>>> |\
>>>>> | * 9ddc0787bd660214366e386ce689dd78299ac9d0 pci: Add Rockchip dwc
>>>>> based PCIe controller driver
>>>>>
>>>>> I checked that 184aa6504143b452132e28cd3ebecc7b941cdfa1 works fine:
>>>>>
>>>>> U-Boot 2021.01-00688-g184aa65041-dirty (Mar 23 2022 - 15:07:56 +0800)
>>>>>
>>>>> CPU : AM335X-GP rev 2.1
>>>>> Model: TI AM335x BeagleBone Black
>>>>> DRAM: 512 MiB
>>>>> WDT: Started with servicing (60s timeout)
>>>>> NAND: 0 MiB
>>>>> MMC: OMAP SD/MMC: 0, OMAP SD/MMC: 1
>>>>> Loading Environment from FAT... <ethaddr> not set. Validating first
>>>>> E-fuse MAC
>>>>> Net: eth2: ethernet at 4a100000, eth3: usb_ether
>>>>> Hit any key to stop autoboot: 0
>>>>> => dhcp
>>>>> ethernet at 4a100000 Waiting for PHY auto negotiation to
>>>>> complete......... TIMEOUT !
>>>>> using musb-hdrc, OUT ep1out IN ep1in STATUS ep2in
>>>>> MAC de:ad:be:ef:00:01
>>>>> HOST MAC de:ad:be:ef:00:00
>>>>> RNDIS ready
>>>>> musb-hdrc: peripheral reset irq lost!
>>>>> high speed config #2: 2 mA, Ethernet Gadget, using RNDIS
>>>>> USB RNDIS network up!
>>>>> BOOTP broadcast 1
>>>>> BOOTP broadcast 2
>>>>> BOOTP broadcast 3
>>>>> DHCP client bound to address 192.168.200.157 (757 ms)
>>>>> Using usb_ether device
>>>>> TFTP from server 192.168.200.1; our IP address is 192.168.200.157
>>>>> Filename 'u-boot.img'.
>>>>> Load address: 0x82000000
>>>>> Loading:
>>>>> #################################################################
>>>>> #################################################################
>>>>> #################################################################
>>>>> #########################
>>>>> 2.5 MiB/s
>>>>> done
>>>>> Bytes transferred = 1123888 (112630 hex)
>>>>> =>
>>>>>
>>> "data abort" messages:
>>>
>>> data abort
>>> pc : [<9ff8196c>] lr : [<9ffa1cd7>]
>>> reloc pc : [<8081496c>] lr : [<80834cd7>]
>>> sp : 9df38e60 ip : 9df38fc8 fp : 00000001
>>> r10: 9df38eac r9 : 9df4ceb0 r8 : 9ffa1b7d
>>> r7 : 9df52fd0 r6 : 9ffdbba8 r5 : 0000000d r4 : 00000018
>>> r3 : 3ff589e0 r2 : 9ffafa11 r1 : 9ffdbbc0 r0 : 9ffdbb00
>>> Flags: Nzcv IRQs off FIQs on Mode SVC_32 (T)
>>> Code: 0303 60ca 4403 6091 (685a) f042
>>> Resetting CPU ...
>>>
>>> According to objdump of u-boot, pc is in malloc and lr is in env_attr_walk:
>>>
>>> unlink(victim, bck, fwd);
>>> 80814966: 60ca str r2, [r1, #12]
>>> set_inuse_bit_at_offset(victim, victim_size);
>>> 80814968: 4403 add r3, r0
>>> unlink(victim, bck, fwd);
>>> 8081496a: 6091 str r1, [r2, #8]
>>> set_inuse_bit_at_offset(victim, victim_size);
>>> 8081496c: 685a ldr r2, [r3, #4]
>>> 8081496e: f042 0201 orr.w r2, r2, #1
>>> 80814972: 605a str r2, [r3, #4]
>>>
>>> r3 is 3ff589e0, which is not a valid RAM address on the AM335x.
>>>
>>>
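The observation about r3 can be checked with a little arithmetic. The sketch below assumes DDR is mapped at 0x80000000 and is 512 MiB in size, as the "DRAM: 512 MiB" line in the boot log suggests. The helper names are hypothetical. Because "add r3, r0" executes before the faulting load, the dumped r3 already contains the sum, so subtracting r0 recovers the value r3 held beforehand.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: DDR starts at 0x80000000 and the board has 512 MiB. */
#define DRAM_BASE 0x80000000u
#define DRAM_SIZE (512u * 1024u * 1024u)

/* True if addr lies inside [DRAM_BASE, DRAM_BASE + DRAM_SIZE). */
bool in_dram(uint32_t addr)
{
	return addr >= DRAM_BASE && addr - DRAM_BASE < DRAM_SIZE;
}

/* Undo "add r3, r0": the subtraction wraps modulo 2^32, just like the
 * addition did on the target. */
uint32_t r3_before_add(uint32_t r3_dumped, uint32_t r0)
{
	return r3_dumped - r0;
}
```

With the values from the second dump, in_dram(0x3ff589e0) is false and r3_before_add(0x3ff589e0, 0x9ffdbb00) yields 0x9ff7cee0. If r0 is the chunk pointer and r3 its size, as the source lines in the listing suggest, that value looks more like a pointer than a plausible chunk size. This would be consistent with overwritten heap metadata, but does not prove it.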
>>
>> I have seen crashes in common/dlmalloc.c before, after a double free() or
>> a free() with an incorrect pointer.
>>
>> The assert() statements in do_check_inuse_chunk() are meant to catch
>> this, but assert() as defined in include/log.h does not stop the code and
>> does not even print anything without _DEBUG=1.
>>
>> You should be able to get the assert output with
>>
>> #include <common.h>
>> #define _DEBUG 1
>> #include <log.h>
>>
>> at the top of common/dlmalloc.c.
>>
>> You should get full malloc debug output with
>
> Hi: I tried adding the DEBUG macro before <log.h>, but no additional malloc
> messages were printed.

assert() checks for _DEBUG. Defining DEBUG after common.h will not
define _DEBUG.

Best regards

Heinrich
>
>>
>> #define DEBUG 1
>> #include <common.h>
>> #include <log.h>
>>
>> Best regards
>>
>> Heinrich
>
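The ordering issue with DEBUG and _DEBUG can be reproduced outside U-Boot. The sketch below is a simplified stand-in, not U-Boot's actual header: DEMO_DEBUG plays the role of _DEBUG, the guarded block plays the role of a header that is already included by the time DEBUG is defined, and demo_assert() is a hypothetical macro that only reports when DEMO_DEBUG is non-zero and never halts.

```c
/* Make the demo deterministic regardless of compiler flags. */
#undef DEBUG

/* Stand-in for a header that is pulled in early and derives its debug
 * switch from DEBUG exactly once. */
#ifndef DEMO_LOG_H
#define DEMO_LOG_H
#ifdef DEBUG
#define DEMO_DEBUG 1
#else
#define DEMO_DEBUG 0
#endif
/* Counts a report only when DEMO_DEBUG is non-zero; never stops the code. */
#define demo_assert(x) \
	do { if (!(x) && DEMO_DEBUG) demo_assert_reports++; } while (0)
#endif

/* Too late: the stand-in header has already fixed DEMO_DEBUG at 0. */
#define DEBUG 1

int demo_assert_reports;

int run_demo(void)
{
	demo_assert(1 == 2); /* fails, but stays silent because DEMO_DEBUG is 0 */
	return demo_assert_reports;
}
```

Moving the "#define DEBUG 1" line above the guarded block, or defining the debug switch itself directly, makes the failed check count. This mirrors the two variants suggested in the thread.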