Boot regression on am335x-hs-evm

Andrew Davis afd at ti.com
Mon Jun 13 16:56:29 CEST 2022


On 6/13/22 9:20 AM, Tom Rini wrote:
> On Mon, Jun 13, 2022 at 02:51:03PM +0200, LABBE Corentin wrote:
>> Le Fri, Jun 10, 2022 at 11:48:34AM -0400, Tom Rini a écrit :
>>> On Fri, Jun 10, 2022 at 05:45:04PM +0200, LABBE Corentin wrote:
>>>> Le Fri, Jun 10, 2022 at 11:01:47AM -0400, Tom Rini a écrit :
>>>>> On Fri, Jun 10, 2022 at 04:51:12PM +0200, LABBE Corentin wrote:
>>>>>> Le Fri, Jun 10, 2022 at 08:16:10AM -0400, Tom Rini a écrit :
>>>>>>> On Fri, Jun 10, 2022 at 11:59:23AM +0200, LABBE Corentin wrote:
>>>>>>>> Hello
>>>>>>>>
>>>>>>>> I hit a boot regression on am335x-hs-evm.
>>>>>>>> On current uboot, the board does not boot at all.
>>>>>>>> This board uses both MLO and u-boot.img and only MLO was the problem.
>>>>>>>>
>>>>>>>> After a bisect, I found that e41651fffda7 ("dm: Support parent devices with of-platdata") was the problem.
>>>>>>>> Reverting this patch lead to a success boot.
>>>>>>>>
>>>>>>>> I cutdown the revert to a minimal fix:
>>>>>>>> --- a/drivers/core/lists.c
>>>>>>>> +++ b/drivers/core/lists.c
>>>>>>>> @@ -120,6 +120,7 @@ int lists_bind_drivers(struct udevice *parent, bool pre_reloc_only)
>>>>>>>>                  int ret;
>>>>>>>>   
>>>>>>>>                  ret = bind_drivers_pass(parent, pre_reloc_only);
>>>>>>>> +               return ret;
>>>>>>>>                  if (!ret)
>>>>>>>>                          break;
>>>>>>>>                  if (ret != -EAGAIN && !result)
>>>>>>>>
>>>>>>>> I cannot debug further since printf() is not working at this stage.
>>>>>>>>
>>>>>>>> Since I wanted to know which error was badly handled, I tried to do this:
>>>>>>>> --- a/arch/arm/mach-omap2/sec-common.c
>>>>>>>> +++ b/arch/arm/mach-omap2/sec-common.c
>>>>>>>> @@ -111,6 +111,8 @@ static u32 find_sig_start(char *image, size_t size)
>>>>>>>>          return 0;
>>>>>>>>   }
>>>>>>>>   
>>>>>>>> +extern int errorcount;
>>>>>>>> +
>>>>>>>>   int secure_boot_verify_image(void **image, size_t *size)
>>>>>>>>   {
>>>>>>>>          int result = 1;
>>>>>>>> @@ -178,6 +180,7 @@ auth_exit:
>>>>>>>>           * via YMODEM. This is done to avoid disturbing the YMODEM serial
>>>>>>>>           * protocol transactions.
>>>>>>>>           */
>>>>>>>> +       printf("ERRORCOUNT %d\n", errorcount);
>>>>>>>>          if (!(IS_ENABLED(CONFIG_SPL_BUILD) &&
>>>>>>>>                IS_ENABLED(CONFIG_SPL_YMODEM_SUPPORT) &&
>>>>>>>>                spl_boot_device() == BOOT_DEVICE_UART))
>>>>>>>> --- a/drivers/core/lists.c
>>>>>>>> +++ b/drivers/core/lists.c
>>>>>>>> @@ -20,6 +20,10 @@
>>>>>>>>   #include <fdtdec.h>
>>>>>>>>   #include <linux/compiler.h>
>>>>>>>>   
>>>>>>>> +static int _errorcount;
>>>>>>>> +int errorlist[1024];
>>>>>>>> +int errorcount;
>>>>>>>> +
>>>>>>>>   struct driver *lists_driver_lookup_name(const char *name)
>>>>>>>>   {
>>>>>>>>          struct driver *drv =
>>>>>>>> @@ -120,8 +124,9 @@ int lists_bind_drivers(struct udevice *parent, bool pre_reloc_only)
>>>>>>>>                  int ret;
>>>>>>>>   
>>>>>>>>                  ret = bind_drivers_pass(parent, pre_reloc_only);
>>>>>>>> -               if (!ret)
>>>>>>>> -                       break;
>>>>>>>> +               errorlist[_errorcount] = ret;
>>>>>>>> +               _errorcount++;
>>>>>>>> +               errorcount = _errorcount;
>>>>>>>>                  if (ret != -EAGAIN && !result)
>>>>>>>>                          result = ret;
>>>>>>>>          }
>>>>>>>>
>>>>>>>> But errorcount is always 0 which is puzzling me since according to my think, lists_bind_drivers() is ran before secure_boot_verify_image().
>>>>>>>>
>>>>>>>> Any idea on how to debug further ?
>>>>>>>
>>>>>>> You should be able to enable DEBUG_UART and get output that way.  But
>>>>>>> it's likely something related to the space constraints of the HS chip
>>>>>>> rather than GP.
>>>>>>>
>>>>>>
>>>>>> Hello
>>>>>>
>>>>>> Thanks for your suggestion, I successfully got futher with:
>>>>>> diff --git a/drivers/core/lists.c b/drivers/core/lists.c
>>>>>> index b23ee3030e..415ba814f1 100644
>>>>>> --- a/drivers/core/lists.c
>>>>>> +++ b/drivers/core/lists.c
>>>>>> @@ -111,6 +111,8 @@ int lists_bind_drivers(struct udevice *parent, bool pre_reloc_only)
>>>>>>          int result = 0;
>>>>>>          int pass;
>>>>>>   
>>>>>> +       debug_uart_init();
>>>>>> +
>>>>>>          /*
>>>>>>           * 10 passes is 10 levels deep in the devicetree, which is plenty. If
>>>>>>           * OF_PLATDATA_PARENT is not enabled, then bind_drivers_pass() will
>>>>>> diff --git a/drivers/serial/Kconfig b/drivers/serial/Kconfig
>>>>>> index b4805a2e4e..7ab059b4ea 100644
>>>>>> --- a/drivers/serial/Kconfig
>>>>>> +++ b/drivers/serial/Kconfig
>>>>>> @@ -158,6 +158,7 @@ config TPL_DM_SERIAL
>>>>>>   
>>>>>>   config DEBUG_UART
>>>>>>          bool "Enable an early debug UART for debugging"
>>>>>> +       default y
>>>>>>          help
>>>>>>            The debug UART is intended for use very early in U-Boot to debug
>>>>>>            problems when an ICE or other debug mechanism is not available.
>>>>>> @@ -185,7 +186,7 @@ config DEBUG_UART
>>>>>>   choice
>>>>>>          prompt "Select which UART will provide the debug UART"
>>>>>>          depends on DEBUG_UART
>>>>>> -       default DEBUG_UART_NS16550
>>>>>> +       default DEBUG_UART_OMAP
>>>>>>   
>>>>>>   config DEBUG_UART_ALTERA_JTAGUART
>>>>>>          bool "Altera JTAG UART"
>>>>>> @@ -406,7 +407,7 @@ endchoice
>>>>>>   config DEBUG_UART_BASE
>>>>>>          hex "Base address of UART"
>>>>>>          depends on DEBUG_UART
>>>>>> -       default 0 if DEBUG_UART_SANDBOX
>>>>>> +       default 0x44e09000
>>>>>>          help
>>>>>>            This is the base address of your UART for memory-mapped UARTs.
>>>>>>   
>>>>>> @@ -416,7 +417,7 @@ config DEBUG_UART_BASE
>>>>>>   config DEBUG_UART_CLOCK
>>>>>>          int "UART input clock"
>>>>>>          depends on DEBUG_UART
>>>>>> -       default 0 if DEBUG_UART_SANDBOX
>>>>>> +       default 48000000
>>>>>>          help
>>>>>>            The UART input clock determines the speed of the internal UART
>>>>>>            circuitry. The baud rate is derived from this by dividing the input
>>>>>> @@ -428,7 +429,7 @@ config DEBUG_UART_CLOCK
>>>>>>   config DEBUG_UART_SHIFT
>>>>>>          int "UART register shift"
>>>>>>          depends on DEBUG_UART
>>>>>> -       default 0 if DEBUG_UART
>>>>>> +       default 2
>>>>>>          help
>>>>>>            Some UARTs (notably ns16550) support different register layouts
>>>>>>            where the registers are spaced either as bytes, words or some other
>>>>>> @@ -449,6 +450,7 @@ config DEBUG_UART_BOARD_INIT
>>>>>>   config DEBUG_UART_ANNOUNCE
>>>>>>          bool "Show a message when the debug UART starts up"
>>>>>>          depends on DEBUG_UART
>>>>>> +       default y
>>>>>>          help
>>>>>>            Enable this option to show a message when the debug UART is ready
>>>>>>            for use. You will see a message like "<debug_uart> " as soon as
>>>>>>
>>>>>> I got:
>>>>>> <debug_uart>-
>>>>>> <debug_uart>
>>>>>> alloc space exhausted
>>>>>
>>>>> Looks like you need to increase SPL_SYS_MALLOC_F_LEN
>>>>
>>>> Thanks
>>>> I have increased it by step of 0x1000, the number of "alloc space exhausted" was reducing, but now I have none of them (with 0x6000 or 0x7000) but still no boot.
>>>
>>> Given the reduced resources of the HS parts, you might need to double
>>> check that everything you're doing will fit in the available space and
>>> perhaps temporarily disable some other features, to ensure things are
>>> small enough.  You should also be able to get more info out of
>>> DEBUG_UART, as that's working.
>>>
>>
>> For a successfull boot, both SPL_SYS_MALLOC_F_LEN should be increased and the number of lists_bind_drivers() pass need to be reduced.
>>
>> I have added some debug and it seems that my problem is eth_cpsw fault.
>> bind_drivers_pass eth_cpsw ret=-2
>> bind_drivers_pass omap_hsmmc ret=0
>> bind_drivers_pass omap_hsmmc ret=0
>> bind_drivers_pass gpio_omap ret=0
>> bind_drivers_pass gpio_omap ret=0
>> bind_drivers_pass gpio_omap ret=0
>> bind_drivers_pass gpio_omap ret=0
>> bind_drivers_pass i2c_omap ret=0
>> bind_drivers_pass i2c_omap ret=0
>> bind_drivers_pass i2c_omap ret=0
>> bind_drivers_pass ns16550_serial ret=0
>> bind_drivers_pass ns16550_serial ret=0
>> bind_drivers_pass ns16550_serial ret=0
>> bind_drivers_pass ns16550_serial ret=0
>> bind_drivers_pass ns16550_serial ret=0
>> bind_drivers_pass ns16550_serial ret=0
>> lists_bind_drivers ret=-2
>>
>> While my first idea was to reduce the number of pass via a CONFIG, finally just removing eth_cpsw from SPL did the trick.
>> diff --git a/board/ti/am335x/board.c b/board/ti/am335x/board.c
>> index 4dc04817dc..5e664d7bbd 100644
>> --- a/board/ti/am335x/board.c
>> +++ b/board/ti/am335x/board.c
>> @@ -945,11 +945,13 @@ struct eth_pdata cpsw_pdata = {
>>   	.priv_pdata = &am335_eth_data,
>>   };
>>   
>> +#ifndef CONFIG_SPL_BUILD
>>   U_BOOT_DEVICE(am335x_eth) = {
>>   	.name = "eth_cpsw",
>>   	.platdata = &cpsw_pdata,
>>   };
>>   #endif
>> +#endif
>>   
>>   #ifdef CONFIG_SPL_LOAD_FIT
>>   int board_fit_config_name_match(const char *name)
>>
>> What do you thinkg about it ?
>> Does it is ok do to that ? Since SPL does not need ethernet anyway.


SPL does need Ethernet in the non-HS cases. As pointed out below
CONFIG_IS_ENABLED is better here as the HS defconfigs do not
define net support, so it will only be removed for HS.

Little background here, HS parts have about half the available SRAM
compared to the already limited non-HS AM335x (the missing SRAM is
used and firewalled off by the SecureROM, that is not present in
the non-HS devices). This makes for one of the most constrained SPLs
we deal with. Net and USB support simply don't fit at all in SPL
(the stack for each alone breaks our SRAM our budget).
So we only support two modes on HS AM335x by default, MMC/SD and
UART (more can be made to work if needed by a customer, but they
are a bit hacky so we don't upstream their support).

am335x_hs_evm_defconfig      - eMMC/SD
am335x_hs_evm_uart_defconfig - UART

So feel free to chop anything that is not related to these two
boot modes from the HS configs to keep the size down. Sometimes things
creep in over the releases and we run-out of space as you have
found here :(

Thanks,
Andrew


> 
> You can make the whole section there depend on:
> CONFIG_IS_ENABLED(NET) && !CONFIG_IS_ENABLED(OF_CONTROL)
> to remove that part from SPL.  I think that's reasonably clean.
> 


More information about the U-Boot mailing list