[PATCH v3 07/18] pxe: Move pxe_utils files

Adam Ford aford173 at gmail.com
Fri Feb 11 17:39:30 CET 2022


On Fri, Feb 11, 2022 at 10:12 AM Tom Rini <trini at konsulko.com> wrote:
>
> On Fri, Feb 11, 2022 at 09:50:32AM -0600, Adam Ford wrote:
> > On Thu, Feb 10, 2022 at 8:57 AM Tom Rini <trini at konsulko.com> wrote:
> > >
> > > On Thu, Feb 10, 2022 at 07:56:52AM -0600, Adam Ford wrote:
> > > > On Wed, Feb 9, 2022 at 11:16 AM Simon Glass <sjg at chromium.org> wrote:
> > > > >
> > > > > Hi,
> > > > >
> > > > > On Wed, 9 Feb 2022 at 05:32, Tom Rini <trini at konsulko.com> wrote:
> > > > > >
> > > > > > On Wed, Feb 09, 2022 at 05:40:03AM -0600, Adam Ford wrote:
> > > > > > > On Thu, Oct 14, 2021 at 1:50 PM Simon Glass <sjg at chromium.org> wrote:
> > > > > > > >
> > > > > > > > Move the header file into the main include/ directory so we can use it
> > > > > > > > from the bootmethod code. Move the C file into boot/ since it relates to
> > > > > > > > booting.
> > > > > > > >
> > > > > > > +cc lokeshvutla at ti.com
> > > > > > >
> > > > > > > Simon,
> > > > > > >
> > > > > > > I can't explain why, but with git bisect, it appears this patch breaks
> > > > > > > my omap3_logic board (DM3730) by making it wrongly think there is 4GB
> > > > > > > of RAM, when in reality there is only 256MB.  We have both 256MB and
> > > > > > > 512MB parts, and the automatic memory detection has always 'just
> > > > > > > worked' in the past.
> > > > > > >
> > > > > > > With this patch now, I see:
> > > > > > > U-Boot 2022.01-rc1-00185-g262cfb5b15 (Feb 09 2022 - 05:23:42 -0600)
> > > > > > >
> > > > > > > OMAP3630/3730-GP ES1.2, CPU-OPP2, L3-200MHz, Max CPU Clock 1 GHz
> > > > > > > Model: LogicPD Zoom DM3730 Torpedo + Wireless Development Kit
> > > > > > > DRAM:  4 GiB
> > > > > > > <hang>
> > > > > > >
> > > > > > > With the previous commit, 8018b9af57b5 ("pxe: Tidy up the is_pxe
> > > > > > > global"), it properly detects the RAM and fully boots.
> > > > > > >
> > > > > > > U-Boot 2022.01-rc1-00184-g8018b9af57 (Feb 09 2022 - 05:21:39 -0600)
> > > > > > >
> > > > > > > OMAP3630/3730-GP ES1.2, CPU-OPP2, L3-200MHz, Max CPU Clock 1 GHz
> > > > > > > Model: LogicPD Zoom DM3730 Torpedo + Wireless Development Kit
> > > > > > > DRAM:  256 MiB
> > > > > > > NAND:  512 MiB
> > > > > > > MMC:   OMAP SD/MMC: 0
> > > > > > > Loading Environment from NAND... OK
> > > > > > > OMAP die ID: 619e00029ff800000168300f1502501f
> > > > > > > Net:   eth0: ethernet at 08000000
> > > > > > > Hit any key to stop autoboot:  0
> > > > > > > OMAP Logic #
> > > > > > >
> > > > > > > I have CONFIG_CMD_BOOTM,  CONFIG_CMD_PXE and CONFIG_CMD_SYSBOOT all
> > > > > > > defined, so I am having a hard time understanding why this would
> > > > > > > change behavior or stomp on the the structure that knows the memory
> > > > > > > size.
> > > > > > >
> > > > > > > If I jump ahead to the current 'master' 531c0089457:("Merge branch
> > > > > > > '2022-02-08-TI-platform-updates')  and revert this patch, my board
> > > > > > > boots correctly again, but I am struggling to understand why.
> > > > + Marek Behún
> > > >
> > > > > > >
> > > > > > > Do you have any suggestions for me to try?
> > > > > >
> > > > > > I would suggest objdump disassemble U-Boot before/after and see what
> > > > > > functions have changed.
> > > > >
> > > > > Keep an eye out for a BSS variable that is used before relocation, perhaps?
> > > >
> > > > I am still investigating, but disabling LTO appears to fix the issue
> > > > for me.  I'd like to keep LTO, so I'm going to attempt to focus on the
> > > > differences in the affected functions and how this patch makes LTO
> > > > behave differently.
> > > >
> > > > The disassembly of U-Boot is large, so it's going to take me a bit of
> > > > time to investigate.  If someone has any LTO-related suggestions that
> > > > I could try, I'd be open to try them too.
> > >
> > > Wait, the disassembly is large, or the differences between the
> > > disassembly, before/after this change alone, are large?  It's feeling
> >
> > I will be the first to admit thatI am not very good with the assembly
> > side of things, but this is what I did:
> >
> > git checkout master
> > make CROSS_COMPILE=arm-linux-gnueabihf- -j8
> > arm-linux-gnueabihf-objdump -S u-boot > broken.dump
> > git revert 262cfb5b15420a1aea465745a821e684b3dfa153
> > make CROSS_COMPILE=arm-linux-gnueabihf- -j8
> > arm-linux-gnueabihf-objdump -S u-boot > working.dump
> >
> > diff --side-by-side --suppress-common-lines broken.dump working.dump
> > > broken-working.diff
> > cat -n broken-working.diff
> >
> > The broken-working.diff file with common lines suppressed is 236256 lines long.
>
> OK, I just use '-d' and not '-S', which might help a little bit.  But
> you're probably going to still need to edit the dumps and just globally
> change all of the addresses to 'XXXXXXXX' so that you'll end up
> hopefully only seeing where functions were optimized differently.  But
> it might well end up being a bit trickier than that.

It looks like none of the object files are showing any content with
objdump when LTO is enabled.  With a little google search, it appears
we need lto-dump.  I have some more meetings, but I'll try to spend
some more time on it this weekend.

>
> > When I disable LTO for just pxe_utils.o and redo the same exercise,
> > the diff file with common-lines removed is 266573 lines long.
> >
> > Maybe I am not using objdump correctly.  I am not all that familiar
> > with this code either, so I am not sure which variables should be in
> > BSS.  I did a search in both working and non-working dumps to look for
> > keyworks like BSS, but from what I can tell,  both have similar
> > functions:
> >
> > gd->mon_len = (ulong)&__bss_end - (ulong)_start;
> > /* TODO: use (ulong)&__bss_end - (ulong)&__text_start; ? */
> > gd->mon_len = (ulong)&__bss_end - CONFIG_SYS_MONITOR_BASE;
> > gd->mon_len = (ulong)&__bss_end - (ulong)_start;
> > * reserve memory for U-Boot code, data & bss
> > 8011051a <clear_bss>:
> > #if defined(CONFIG_SPL_BUILD) && defined(CONFIG_SPL_EARLY_BSS)
> > CLEAR_BSS
> > #if !defined(CONFIG_SPL_BUILD) || !defined(CONFIG_SPL_EARLY_BSS)
> > CLEAR_BSS
> > CLEAR_BSS
> >
> > When I grepped for mon_len, both sets of dumps looked nearly identical:
> >
> > aford at aford-OptiPlex-7050:~/src/u-boot$ grep mon_len working.dump
> > lmb_reserve(lmb, (phys_addr_t)(uintptr_t)_start, gd->mon_len);
> > 80112724 <setup_mon_len>:
> > static int setup_mon_len(void)
> > gd->mon_len = (ulong)&__bss_end - (ulong)_start;
> > 80112726: 4903      ldr r1, [pc, #12] ; (80112734 <setup_mon_len+0x10>)
> > 80112728: 4b03      ldr r3, [pc, #12] ; (80112738 <setup_mon_len+0x14>)
> > gd->mon_len = (ulong)&__bss_end - CONFIG_SYS_MONITOR_BASE;
> > gd->mon_len = (ulong)&__bss_end - (ulong)_start;
> > gd->ram_top = board_get_usable_ram_top(gd->mon_len);
> > gd->relocaddr -= gd->mon_len;
> >       gd->mon_len >> 10, gd->relocaddr);
> >     ip = mon_lengths[yleap];
> >
> >
> > aford at aford-OptiPlex-7050:~/src/u-boot$ grep mon_len broken.dump
> > lmb_reserve(lmb, (phys_addr_t)(uintptr_t)_start, gd->mon_len);
> > 80110398 <setup_mon_len>:
> > static int setup_mon_len(void)
> > gd->mon_len = (ulong)&__bss_end - (ulong)_start;
> > 8011039a: 4903      ldr r1, [pc, #12] ; (801103a8 <setup_mon_len+0x10>)
> > 8011039c: 4b03      ldr r3, [pc, #12] ; (801103ac <setup_mon_len+0x14>)
> > gd->mon_len = (ulong)&__bss_end - CONFIG_SYS_MONITOR_BASE;
> > gd->mon_len = (ulong)&__bss_end - (ulong)_start;
> > gd->ram_top = board_get_usable_ram_top(gd->mon_len);
> > gd->relocaddr -= gd->mon_len;
> >       gd->mon_len >> 10, gd->relocaddr);
> >     ip = mon_lengths[yleap];
> > aford at aford-OptiPlex-7050:~/src/u-boot$
> >
> > Since I think I narrowed it down to the pxe_utils.o file, I thought
> > I'd do an objdump of both the working and non-working versions of
> > pxe_utils.o and this is where it got interesting.
> >
> > With LTO building pxe_utils.o, the dump looks empty:
> >
> > arm-linux-gnueabihf-objdump -S boot/pxe_utils.o > pxe-notworking.dump
> > cat pxe-notworking.dump
> >
> > boot/pxe_utils.o:     file format elf32-littlearm
> >
> > ^--  no actual code dump
> > If I take the working version of this same file without LTO enabled
> > and do a dump, and it's 2291 lines long and full of functions.
> >
> > I tried adding some __used to the non-static function names, but that
> > didn't appear to make any difference to the objdump of pxe_utils.o
>
> I feel like it can't be pxe_utils.o itself but rather how LTO is
> behaving before/after that change and sorting the object files
> differently.  If modifying the dumps like I suggested above doesn't lead

That's what I was thinking too.

> to more clues, and it doesn't seem to matter what toolchain is used (are
> you using the gcc-11 from kernel.org that we use in docker and
> buildman?), I'll try and look as well.

I am using GCC 11, but I'm using the version that come with Ubuntu 21.10:

Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 11.2.0 (Ubuntu 11.2.0-5ubuntu1)



>
> --
> Tom


More information about the U-Boot mailing list