[PATCH v3 07/18] pxe: Move pxe_utils files

Tom Rini trini at konsulko.com
Fri Feb 11 17:44:53 CET 2022


On Fri, Feb 11, 2022 at 10:39:30AM -0600, Adam Ford wrote:
> On Fri, Feb 11, 2022 at 10:12 AM Tom Rini <trini at konsulko.com> wrote:
> >
> > On Fri, Feb 11, 2022 at 09:50:32AM -0600, Adam Ford wrote:
> > > On Thu, Feb 10, 2022 at 8:57 AM Tom Rini <trini at konsulko.com> wrote:
> > > >
> > > > On Thu, Feb 10, 2022 at 07:56:52AM -0600, Adam Ford wrote:
> > > > > On Wed, Feb 9, 2022 at 11:16 AM Simon Glass <sjg at chromium.org> wrote:
> > > > > >
> > > > > > Hi,
> > > > > >
> > > > > > On Wed, 9 Feb 2022 at 05:32, Tom Rini <trini at konsulko.com> wrote:
> > > > > > >
> > > > > > > On Wed, Feb 09, 2022 at 05:40:03AM -0600, Adam Ford wrote:
> > > > > > > > On Thu, Oct 14, 2021 at 1:50 PM Simon Glass <sjg at chromium.org> wrote:
> > > > > > > > >
> > > > > > > > > Move the header file into the main include/ directory so we can use it
> > > > > > > > > from the bootmethod code. Move the C file into boot/ since it relates to
> > > > > > > > > booting.
> > > > > > > > >
> > > > > > > > +cc lokeshvutla at ti.com
> > > > > > > >
> > > > > > > > Simon,
> > > > > > > >
> > > > > > > > I can't explain why, but with git bisect, it appears this patch breaks
> > > > > > > > my omap3_logic board (DM3730) by making it wrongly think there is 4GB
> > > > > > > > of RAM, when in reality there is only 256MB.  We have both 256MB and
> > > > > > > > 512MB parts, and the automatic memory detection has always 'just
> > > > > > > > worked' in the past.
> > > > > > > >
> > > > > > > > With this patch now, I see:
> > > > > > > > U-Boot 2022.01-rc1-00185-g262cfb5b15 (Feb 09 2022 - 05:23:42 -0600)
> > > > > > > >
> > > > > > > > OMAP3630/3730-GP ES1.2, CPU-OPP2, L3-200MHz, Max CPU Clock 1 GHz
> > > > > > > > Model: LogicPD Zoom DM3730 Torpedo + Wireless Development Kit
> > > > > > > > DRAM:  4 GiB
> > > > > > > > <hang>
> > > > > > > >
> > > > > > > > With the previous commit, 8018b9af57b5 ("pxe: Tidy up the is_pxe
> > > > > > > > global"), it properly detects the RAM and fully boots.
> > > > > > > >
> > > > > > > > U-Boot 2022.01-rc1-00184-g8018b9af57 (Feb 09 2022 - 05:21:39 -0600)
> > > > > > > >
> > > > > > > > OMAP3630/3730-GP ES1.2, CPU-OPP2, L3-200MHz, Max CPU Clock 1 GHz
> > > > > > > > Model: LogicPD Zoom DM3730 Torpedo + Wireless Development Kit
> > > > > > > > DRAM:  256 MiB
> > > > > > > > NAND:  512 MiB
> > > > > > > > MMC:   OMAP SD/MMC: 0
> > > > > > > > Loading Environment from NAND... OK
> > > > > > > > OMAP die ID: 619e00029ff800000168300f1502501f
> > > > > > > > Net:   eth0: ethernet at 08000000
> > > > > > > > Hit any key to stop autoboot:  0
> > > > > > > > OMAP Logic #
> > > > > > > >
> > > > > > > > I have CONFIG_CMD_BOOTM,  CONFIG_CMD_PXE and CONFIG_CMD_SYSBOOT all
> > > > > > > > defined, so I am having a hard time understanding why this would
> > > > > > > > change behavior or stomp on the the structure that knows the memory
> > > > > > > > size.
> > > > > > > >
> > > > > > > > If I jump ahead to the current 'master' 531c0089457:("Merge branch
> > > > > > > > '2022-02-08-TI-platform-updates')  and revert this patch, my board
> > > > > > > > boots correctly again, but I am struggling to understand why.
> > > > > + Marek Behún
> > > > >
> > > > > > > >
> > > > > > > > Do you have any suggestions for me to try?
> > > > > > >
> > > > > > > I would suggest objdump disassemble U-Boot before/after and see what
> > > > > > > functions have changed.
> > > > > >
> > > > > > Keep an eye out for a BSS variable that is used before relocation, perhaps?
> > > > >
> > > > > I am still investigating, but disabling LTO appears to fix the issue
> > > > > for me.  I'd like to keep LTO, so I'm going to attempt to focus on the
> > > > > differences in the affected functions and how this patch makes LTO
> > > > > behave differently.
> > > > >
> > > > > The disassembly of U-Boot is large, so it's going to take me a bit of
> > > > > time to investigate.  If someone has any LTO-related suggestions that
> > > > > I could try, I'd be open to try them too.
> > > >
> > > > Wait, the disassembly is large, or the differences between the
> > > > disassembly, before/after this change alone, are large?  It's feeling
> > >
> > > I will be the first to admit thatI am not very good with the assembly
> > > side of things, but this is what I did:
> > >
> > > git checkout master
> > > make CROSS_COMPILE=arm-linux-gnueabihf- -j8
> > > arm-linux-gnueabihf-objdump -S u-boot > broken.dump
> > > git revert 262cfb5b15420a1aea465745a821e684b3dfa153
> > > make CROSS_COMPILE=arm-linux-gnueabihf- -j8
> > > arm-linux-gnueabihf-objdump -S u-boot > working.dump
> > >
> > > diff --side-by-side --suppress-common-lines broken.dump working.dump
> > > > broken-working.diff
> > > cat -n broken-working.diff
> > >
> > > The broken-working.diff file with common lines suppressed is 236256 lines long.
> >
> > OK, I just use '-d' and not '-S', which might help a little bit.  But
> > you're probably going to still need to edit the dumps and just globally
> > change all of the addresses to 'XXXXXXXX' so that you'll end up
> > hopefully only seeing where functions were optimized differently.  But
> > it might well end up being a bit trickier than that.
> 
> It looks like none of the object files are showing any content with
> objdump when LTO is enabled.  With a little google search, it appears
> we need lto-dump.  I have some more meetings, but I'll try to spend
> some more time on it this weekend.
> 
> >
> > > When I disable LTO for just pxe_utils.o and redo the same exercise,
> > > the diff file with common-lines removed is 266573 lines long.
> > >
> > > Maybe I am not using objdump correctly.  I am not all that familiar
> > > with this code either, so I am not sure which variables should be in
> > > BSS.  I did a search in both working and non-working dumps to look for
> > > keyworks like BSS, but from what I can tell,  both have similar
> > > functions:
> > >
> > > gd->mon_len = (ulong)&__bss_end - (ulong)_start;
> > > /* TODO: use (ulong)&__bss_end - (ulong)&__text_start; ? */
> > > gd->mon_len = (ulong)&__bss_end - CONFIG_SYS_MONITOR_BASE;
> > > gd->mon_len = (ulong)&__bss_end - (ulong)_start;
> > > * reserve memory for U-Boot code, data & bss
> > > 8011051a <clear_bss>:
> > > #if defined(CONFIG_SPL_BUILD) && defined(CONFIG_SPL_EARLY_BSS)
> > > CLEAR_BSS
> > > #if !defined(CONFIG_SPL_BUILD) || !defined(CONFIG_SPL_EARLY_BSS)
> > > CLEAR_BSS
> > > CLEAR_BSS
> > >
> > > When I grepped for mon_len, both sets of dumps looked nearly identical:
> > >
> > > aford at aford-OptiPlex-7050:~/src/u-boot$ grep mon_len working.dump
> > > lmb_reserve(lmb, (phys_addr_t)(uintptr_t)_start, gd->mon_len);
> > > 80112724 <setup_mon_len>:
> > > static int setup_mon_len(void)
> > > gd->mon_len = (ulong)&__bss_end - (ulong)_start;
> > > 80112726: 4903      ldr r1, [pc, #12] ; (80112734 <setup_mon_len+0x10>)
> > > 80112728: 4b03      ldr r3, [pc, #12] ; (80112738 <setup_mon_len+0x14>)
> > > gd->mon_len = (ulong)&__bss_end - CONFIG_SYS_MONITOR_BASE;
> > > gd->mon_len = (ulong)&__bss_end - (ulong)_start;
> > > gd->ram_top = board_get_usable_ram_top(gd->mon_len);
> > > gd->relocaddr -= gd->mon_len;
> > >       gd->mon_len >> 10, gd->relocaddr);
> > >     ip = mon_lengths[yleap];
> > >
> > >
> > > aford at aford-OptiPlex-7050:~/src/u-boot$ grep mon_len broken.dump
> > > lmb_reserve(lmb, (phys_addr_t)(uintptr_t)_start, gd->mon_len);
> > > 80110398 <setup_mon_len>:
> > > static int setup_mon_len(void)
> > > gd->mon_len = (ulong)&__bss_end - (ulong)_start;
> > > 8011039a: 4903      ldr r1, [pc, #12] ; (801103a8 <setup_mon_len+0x10>)
> > > 8011039c: 4b03      ldr r3, [pc, #12] ; (801103ac <setup_mon_len+0x14>)
> > > gd->mon_len = (ulong)&__bss_end - CONFIG_SYS_MONITOR_BASE;
> > > gd->mon_len = (ulong)&__bss_end - (ulong)_start;
> > > gd->ram_top = board_get_usable_ram_top(gd->mon_len);
> > > gd->relocaddr -= gd->mon_len;
> > >       gd->mon_len >> 10, gd->relocaddr);
> > >     ip = mon_lengths[yleap];
> > > aford at aford-OptiPlex-7050:~/src/u-boot$
> > >
> > > Since I think I narrowed it down to the pxe_utils.o file, I thought
> > > I'd do an objdump of both the working and non-working versions of
> > > pxe_utils.o and this is where it got interesting.
> > >
> > > With LTO building pxe_utils.o, the dump looks empty:
> > >
> > > arm-linux-gnueabihf-objdump -S boot/pxe_utils.o > pxe-notworking.dump
> > > cat pxe-notworking.dump
> > >
> > > boot/pxe_utils.o:     file format elf32-littlearm
> > >
> > > ^--  no actual code dump
> > > If I take the working version of this same file without LTO enabled
> > > and do a dump, and it's 2291 lines long and full of functions.
> > >
> > > I tried adding some __used to the non-static function names, but that
> > > didn't appear to make any difference to the objdump of pxe_utils.o
> >
> > I feel like it can't be pxe_utils.o itself but rather how LTO is
> > behaving before/after that change and sorting the object files
> > differently.  If modifying the dumps like I suggested above doesn't lead
> 
> That's what I was thinking too.
> 
> > to more clues, and it doesn't seem to matter what toolchain is used (are
> > you using the gcc-11 from kernel.org that we use in docker and
> > buildman?), I'll try and look as well.
> 
> I am using GCC 11, but I'm using the version that come with Ubuntu 21.10:
> 
> Thread model: posix
> Supported LTO compression algorithms: zlib zstd
> gcc version 11.2.0 (Ubuntu 11.2.0-5ubuntu1)

OK.  FWIW, if it's easier to build and test, I would suggest also trying
CFLAGS_REMOVE_xxx.o := $(LTO_CFLAGS)
for all of the obj files under arch/arm/ and board/ and then if that
also works correctly, re-adding the flags a directory, then file, at a
time until you've narrowed it down.

-- 
Tom
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 659 bytes
Desc: not available
URL: <https://lists.denx.de/pipermail/u-boot/attachments/20220211/529c0961/attachment.sig>


More information about the U-Boot mailing list