binman warning/failure when blobs are missing
Simon Glass
sjg at chromium.org
Mon Aug 15 19:37:57 CEST 2022
Hi Tom,
On Sat, 13 Aug 2022 at 06:36, Tom Rini <trini at konsulko.com> wrote:
>
> On Fri, Aug 12, 2022 at 08:21:05PM -0600, Simon Glass wrote:
> > Hi Tom,
> >
> > On Fri, 12 Aug 2022 at 10:11, Tom Rini <trini at konsulko.com> wrote:
> > >
> > > On Fri, Aug 12, 2022 at 09:11:11AM -0600, Simon Glass wrote:
> > > > Hi Tom,
> > > >
> > > > On Thu, 11 Aug 2022 at 19:03, Tom Rini <trini at konsulko.com> wrote:
> > > > >
> > > > > On Thu, Aug 11, 2022 at 06:08:51PM -0600, Simon Glass wrote:
> > > > > > Hi,
> > > > > >
> > > > > > On Thu, 11 Aug 2022 at 11:44, Tom Rini <trini at konsulko.com> wrote:
> > > > > > >
> > > > > > > On Thu, Aug 11, 2022 at 09:59:05AM -0700, Tim Harvey wrote:
> > > > > > > > On Thu, Aug 11, 2022 at 9:48 AM Tom Rini <trini at konsulko.com> wrote:
> > > > > > > > >
> > > > > > > > > On Thu, Aug 11, 2022 at 09:27:44AM -0700, Tim Harvey wrote:
> > > > > > > > >
> > > > > > > > > > Greetings,
> > > > > > > > > >
> > > > > > > > > > After a couple of hours troubleshooting a bad boot image today I
> > > > > > > > > > realized the issue was that I had some 0 byte files for the lpddr4
> > > > > > > > > > training blobs that are part of the imx8mp binman created image.
> > > > > > > > > >
> > > > > > > > > > Digging in I found that if a blob referenced in the binman node is
> > > > > > > > > > missing a warning will be output but the missing files will be
> > > > > > > > > > 'created' as 0 byte files such that the next time you build you will
> > > > > > > > > > get no warning (but will have a non-working image). Additionally the
> > > > > > > > > > error does not cause a non-zero exit code.
> > > > > > > > > >
> > > > > > > > > > I'm not that fluent in python these days, and don't have the time for
> > > > > > > > > > a while to try and fix this but I figured I would at least send this
> > > > > > > > > > email in case someone else does.
> > > > > > > > > >
> > > > > > > > > > Example:
> > > > > > > > > > # rm *lpddr4*.bin # make sure lpddr4*.bin files referenced in binman
> > > > > > > > > > nodes are missing
> > > > > > > > > > # make distclean imx8mp_venice_defconfig flash.bin && echo "build ok"
> > > > > > > > > > ...
> > > > > > > > > > BINMAN flash.bin
> > > > > > > > > > Image 'main-section' is missing external blobs and is non-functional: ddr-1d-ime
> > > > > > > > > > m-fw ddr-1d-dmem-fw ddr-2d-imem-fw ddr-2d-dmem-fw
> > > > > > > > > > Image 'main-section' has faked external blobs and is non-functional: lpddr4_pmu_
> > > > > > > > > > train_1d_imem_202006.bin lpddr4_pmu_train_1d_dmem_202006.bin lpddr4_pmu_train_2d
> > > > > > > > > > _imem_202006.bin lpddr4_pmu_train_2d_dmem_202006.bin
> > > > > > > > > >
> > > > > > > > > > Some images are invalid
> > > > > > > > > > ^^^ excellent warning
> > > > > > > > > > build ok
> > > > > > > > > > ^^^ not so great that there is a successful exit code
> > > > > > > > > > # make flash.bin && echo "build ok"
> > > > > > > > > > ...
> > > > > > > > > > BINMAN flash.bin
> > > > > > > > > > build ok
> > > > > > > > > > ^^^ absolutely horrible that 0 byte files were created and thus
> > > > > > > > > > everything looks good this time around!
> > > > > > > > > > # stat -c "%s %n" lpddr4*.bin
> > > > > > > > > > 0 lpddr4_pmu_train_1d_dmem_202006.bin
> > > > > > > > > > 0 lpddr4_pmu_train_1d_imem_202006.bin
> > > > > > > > > > 0 lpddr4_pmu_train_2d_dmem_202006.bin
> > > > > > > > > > 0 lpddr4_pmu_train_2d_imem_202006.bin
> > > > > > > > >
> > > > > > > > > So, this isn't the first time someone has had this problem. On the one
> > > > > > > > > hand, we need CI to pass, and not require fetching of arbitrary further
> > > > > > > > > images to assemble the binary. On the other hand, we don't want users
> > > > > > > > > spending a bunch of time because something didn't work and the normal
> > > > > > > > > way of conveying THIS WON'T WORK is a non-zero exit status. Can we
> > > > > > > > > easily make some flag for buildman or binman that we do set in CI but
> > > > > > > > > won't be set by users?
> > > > > > > > >
> > > > > > > >
> > > > > > > > Tom,
> > > > > > > >
> > > > > > > > Ok - that makes sense as far as the exit status goes. It would be MUCH
> > > > > > > > easier to catch an error like this if binman didn't create 0 byte
> > > > > > > > files for missing files so that you at least get the (ascii colored)
> > > > > > > > message indicating the image is wrong.
> > > > > >
> > > > > > Please see this:
> > > > > >
> > > > > > https://patchwork.ozlabs.org/project/uboot/patch/20220807154708.1418967-2-sjg@chromium.org/
> > > > > >
> > > > > > > >
> > > > > > > > I do love the idea of a flag for CI also!
> > > > > > >
> > > > > > > So, that's part of the root of the problem here. We're passing
> > > > > > > --fake-ext-blobs to binman so that it will create those 0 byte files
> > > > > > > and it in turn complain. I think we didn't figure out a good way to
> > > > > > > tell buildman to pass that to binman tho?
> > > > > >
> > > > > > I hope the above patch will fix this part of the problem.
> > > > >
> > > > > I'm not sure. What I really want to see is something like buildman
> > > > > --fake-ext-blobs, which in turn does 'make BINMANFLAGS=--fake-ext-blobs'
> > > > > or 'make BINMAN_FAKE_EXT_BLOBS=1' and that is what sets --fake-ext-blobs
> > > > > to be passed inside of cmd_binman, so that we can tell CI to do that
> > > > > (just like we do -E today, to ensure -Werror) but regular users get
> > > > > build failures.
> > > >
> > > > IMO we should add my patch as well as what you are saying here. They
> > > > are handling slightly different issues.
> > >
> > > Yes, the above is useful, to fix a different issue (and should fix some
> > > issues in my before/after world builds).
> >
> > OK good. Yes, a different issue as you say.
> >
> > >
> > > > As to this one, I can make binman return an error result if any images
> > > > are invalid, controlled with a -W flag, perhaps, as with buildman.
> > > >
> > > > It is not just faked blobs - we need to raise a warning/error for
> > > > missing ones too.
> > > >
> > > > Then we could pass -W from CI using an env var picked up by the
> > > > Makefile as you say. It could control whether --allow-missing and
> > > > --fake-ext-blobs are passed to binman.
> > > >
> > > > Then a normal build will produce errors.
> > > >
> > > > We still need buildman to work locally though, for build testing. I
> > > > think returning exit code 101 could be good enough for that.
> > > >
> > > > Anyway, as I say, these are things for future improvement in addition
> > > > to my patch, IMO.
> > >
> > > I really think this needs to be a fails by default situation, and a new
> > > flag to buildman to say allow for CI and similar test builds to
> > > complete (that in turn passes the right flag(s) to binman), and complain
> > > without non-zero exit status. But the default should be visible /
> > > descriptive failure with a normal non-zero exit status.
> >
> > I've been thinking about this. In practice there is no way that a
> > 'buildman -b' build can have the right binary blobs in place. Where
> > would they come from? So it would always fail. I don't think that is
> > really what we want? I use buildman in this way all the time. I also
> > hate binary blobs and don't want them to interrupt my workflow. As you
> > know the LTO issue has more than doubled my incremental build times
> > for sandbox and some other boards and I've been in this situation for
> > a long time (please can you apply that fix? :-)
> >
> > I agree that the default should be a visible / descriptive failure,
> > but perhaps we can start with sorting out Makefile, so that by default
> > it fails and you have to pass an arg to get it to handle missing or
> > fake blobs. I should have done that at the start. That should fix most
> > people. This is your env var idea, and I agree with that. I can do a
> > patch.
> >
> > Then we get to the people who use buildman for their builds instead of
> > make. That has recently included me on some machines, now we have the
> > --ide and -w flags, but I'm not sure how common it is. We don't
> > document buildman as a way of doing development. Anyway, we need this
> > to fail too, by default. In this case it is normally building a single
> > board and using the -w flag.
> >
> > So perhaps we could have a different rule for -b or multiple boards
> > than for a 'current source' or '-w' build? I am not sure how we could
> > do this cleanly, but we could perhaps have a flag like --blobs with
> > several options:
> >
> > --blobs fail | allow | fake
> >
> > where the default value (if no --blobs flag is provided) varies
> > depending on other flags?
> >
> > As I read this I suppose you will think this is overkill. But I am
> > really thinking about how this would affect development and there are
> > so many things already in the way.
>
> I think we're almost on the same page here. What I want is for:
> tools/buildman/buildman --fake-ext-blobs
> to do 'make BINMAN_FAKE_BLOBS=1' and so this then works:
> diff --git a/Makefile b/Makefile
> index 1a66f69a4b14..6c406735564c 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -1346,8 +1346,8 @@ cmd_binman = $(srctree)/tools/binman/binman $(if $(BINMAN_DEBUG),-D) \
> $(foreach f,$(BINMAN_TOOLPATHS),--toolpath $(f)) \
> --toolpath $(objtree)/tools \
> $(if $(BINMAN_VERBOSE),-v$(BINMAN_VERBOSE)) \
> - build -u -d u-boot.dtb -O . -m --allow-missing \
> - --fake-ext-blobs \
> + build -u -d u-boot.dtb -O . -m \
> + $(if $(BINMAN_FAKE_BLOBS),--allow-missing --fake-ext-blobs) \
> -I . -I $(srctree) -I $(srctree)/board/$(BOARDDIR) \
> -I arch/$(ARCH)/dts -a of-list=$(CONFIG_OF_LIST) \
> $(foreach f,$(BINMAN_INDIRS),-I $(f)) \
>
>
> So that then you're either using buildman and can pass the new flag to
> get binman to act like before, or you're going to see a more visible
> failure when using make and not do what needs to be done (which isn't
> always consistent, sigh) to say where to find what blobs.
Yes that seems OK to me, except that I think buildman should default
to faking blobs when building a branch, since otherwise it will
(almost) never work. Did you see my thoughts about the policy stuff?
So in my view we need at least three changes:
- the 'fake files in a subdir' patch I already sent
- the Makefile patch as above
- some policy arg to buildman so people don't have to manually enable
fake blobs when building multiple boards
Regards,
Simon
More information about the U-Boot
mailing list