[PATCH 2/2] schemas: Add a schema for binman

Thu Aug 3 21:29:47 CEST 2023

(I think I've strayed far away from schema/dtb side of things towards
binman-the-program side, so I'm dropping Rob and devicetree@ from Cc).

On 2023-07-20 22:55 +03:00, Simon Glass wrote:
> On Thu, 20 Jul 2023 at 09:23, Alper Nebi Yasak <alpernebiyasak at gmail.com> wrote:
>> There's also the issue of binman having multiple different device-trees:
>> its input, the ones in fdtmaps per image, the ones injected to U-Boot
>> dtb files per image. I'd say each has different needs, and those
>> differences have to be figured out before specified upstream. I can only
>> guess this is about the one that'll be in a u-boot.dtb.
> 
> Well, there is really only one. The fdtmap is actually the same
> schema, except that it mentions only the image that it is embedded in,
> i.e. if the fdtmap is for the SPI image, then the fdtmap in SPI flash
> only contains the definition for the SPI image, not the MMC image
> which is in a different device. The input is the same schema, albeit
> that things like 'offset' may be filled in by Binman automatically
> when it packs things.

I was thinking of things like generator nodes and templates and
"filename"s in blob entries that (IMO) only make sense as binman inputs
and never should be in a fdtmap. Trying to highlight a difference
between "how to build this image" and "what this image contains".

But I guess it's OK to call them the same schema unless something has
two conflicting meanings in input-domain and output-domain, and I can't
think of any counter-examples now.

>> I want to go through binman more thoroughly, but a lot of changes
>> happened since I last looked at it and I'm a bit slow at writing things,
>> so won't exactly be soon. That being said, here's some ideas off the top
>> of my head, for inspiration on both this schema and binman itself.
> 
> Do you mean the code? There are definitely some abstractions that are
> stretching a bit, but it is mostly holding together for now.

I mean both the implementation and the design. There's a lot more on my
mind, some more examples:

- Precise structures for data instead of thinking of them as black boxes
- Heuristically parsing arbitrary images/data for e.g. binman extract
- Deconstructing input files to reuse their pieces for building images
- Explicit dependency tracking and resolution for data and layout
- Making things act more like native Python objects
- In fact, making it entirely operable with just Python code
- Decoupling internals from the control dtbs ("entry._node" etc.)
- Ensuring reproducible builds and testing for it
- Fuzzing as a replacement for most tests
- ...

I think it has the beginnings of a very nice declarative-style tool, but
has to embrace that style to get there. (And I guess I'll have to do the
work to properly express myself on some points...)

> [...]

>> Or maybe even better, I think we should make it like FIT: allow a "data"
>> property that has it inside the dtb, or a pair of "data-offset/position"
>> and "data-size" properties if it's outside.
> 
> Eh? The point of the entry is to declare the position of actual data.
> If the data is elsewhere, then the entry will be too.

Sorry, I'm jumping into a different contexts here without explaining it.
I'm seeing a similarity between images that start with a fdtmap and the
images built by "mkimage -T fit -E". Then I'm extrapolating to the
non-"-E" case. Then extending to make it possible for a "fdtmap + data"
binman description to result in something almost backwards-compatible
with FIT, which could replace most uses of a fit etype.

(Of course, backwards compatible only if you add config nodes, and
flatten or don't nest entries, and whatnot. And fit etype would still stay.)

>> Inlining data inside the dtb
>> could help us do nice things in binman, like hashes/signatures as entry
>> types instead of special-casing them.
> 
> We already do that though, right? See testHash() for example. This is
> a powerful feature of a firmware-layout description / Binman, IMO,
> since all the metadata you need is right there.

Having that functionality in state.py and having to work around it for
mkimage/fit etypes (because we want to embed them into the data instead
of the binman dtb) is what I mean by special-casing.

If we had them as etypes, and could embed arbitrary entries as "data" in
binman dtb, I think we would have the best of both worlds -- avoid
calling mkimage just for hash/signature embedding, have it still
possible to put those in binman metadata, and enable a foundation for
other sign-verify flows (leaning towards replacing custom signing
tools/scripts with binman).

>> In fact it could be possible to turn binman images into a FIT 2.0 if we
>> do some more work on top of that, instead of nesting a FIT inside a
>> binman image.
> 
> Well binman needs to produce things which are not FITs. Bear in mind
> that the output can be a binary file in any format. It doesn't
> necessarily have to have any metadata in it at all, certainly not a
> FIT header.

I know, I don't mean it as the only mode of operation, I hope I have
explained better this time.

> Thanks again for your encouraging comments. I'll do a v2 with the above changes.

I'm glad to hear that you appreciate my comments, because trying to
review your work always feels quite hubristic to me, haha.