[U-Boot] [PATCH] Revert "fdt: Fix fdtdec_get_addr_size() for 64-bit"

Simon Glass sjg at chromium.org
Sun Aug 9 17:08:03 CEST 2015


Hi Stephen,

On 6 August 2015 at 13:03, Stephen Warren <swarren at wwwdotorg.org> wrote:
> On 08/05/2015 05:45 PM, Simon Glass wrote:
>>
>> Hi Stephen,
>>
>> On 5 August 2015 at 12:22, Stephen Warren <swarren at wwwdotorg.org> wrote:
>>>
>>> On 08/04/2015 10:08 PM, Simon Glass wrote:
>>>>
>>>>
>>>> Hi Stephen,
>>>>
>>>> On 3 August 2015 at 12:20, Stephen Warren <swarren at wwwdotorg.org> wrote:
>>>>>
>>>>>
>>>>> On 08/03/2015 09:52 AM, Simon Glass wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> Hi Stephen,
>>>>>>
>>>>>> On 3 August 2015 at 09:12, Stephen Warren <swarren at wwwdotorg.org>
>>>>>> wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 08/02/2015 06:13 PM, Simon Glass wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> This reverts commit 5b34436035fc862b5e8d0d2c3eab74ba36f1a7f4.
>>>>>>>>
>>>>>>>> This function has a few problems. It calls fdt_parent_offset() which
>>>>>>>> as
>>>>>>>> mentioned in code review is very slow.
>>>>>>>>
>>>>>>>> https://patchwork.ozlabs.org/patch/499482/
>>>>>>>> https://patchwork.ozlabs.org/patch/452604/
>>>>>>>>
>>>>>>>> It also happens to break SPI flash on Minnowboard max which is how I
>>>>>>>> noticed
>>>>>>>> that this was applied. I can send a patch to tidy that up, but in
>>>>>>>> any
>>>>>>>> case
>>>>>>>> I think we should consider a revert until the function is better
>>>>>>>> implemented.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> The fact that the function needs to perform a slow operation is not a
>>>>>>> good
>>>>>>> reason for a revert. The slowness of the operation is just a matter
>>>>>>> of
>>>>>>> fact
>>>>>>> with DT not having uplinks in its data structure, and U-Boot using
>>>>>>> those
>>>>>>> data structures directly.
>>>>>>>
>>>>>>> You'd requested during review that I additionally implement a faster
>>>>>>> version
>>>>>>> of the function in the case where the parent node is already known,
>>>>>>> and
>>>>>>> said
>>>>>>> it was fine if that happened in a later patch. I have this on my TODO
>>>>>>> list,
>>>>>>> but it's only been a couple of days.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> I didn't expect this to go to mainline before your new patch.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> I didn't get that message from the thread; you wrote:
>>>>>
>>>>> Simon Glass wrote:
>>>>>>
>>>>>>
>>>>>> Stephen Warren wrote:
>>>>>>>
>>>>>>>
>>>>>>> Simon Glass wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> Also, how about (in addition) a
>>>>>>>> version of this function that works for devices? Like:
>>>>>>>>
>>>>>>>> device_get_addr_size(struct udevice *dev, ...)
>>>>>>>>
>>>>>>>> so that it can handle this for you.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> That sounds like a separate patch?
>>>>>>
>>>>>>
>>>>>>
>>>>>> Yes, but I think we should get it in there so that people don't start
>>>>>> using this (wildly inefficient) function when they don't need to. I
>>>>>> think by passing in the parent node we force people to think about the
>>>>>> cost.
>>>>>>
>>>>>> Yes the driver model function can be a separate patch, but let's get
>>>>>> it in at about the same time. We have dev_get_addr() so could have
>>>>>> dev_get_addr_size().
>>>>>
>>>>>
>>>>>
>>>>> That sounds to like you'd like a followup patch soon after this patch
>>>>> (my
>>>>> assumption would be within a few days or weeks) to solve your comments,
>>>>> not
>>>>> that you wanted a replacement patch.
>>>>
>>>>
>>>>
>>>> I will take that feedback to be a little more forceful in future, sorry.
>>>>
>>>>>
>>>>>>> Why not just fix the bug since you said you could? That seems far
>>>>>>> better
>>>>>>> than breaking the newly added Tegra210 support.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> I do have a patch but I'm not 100% comfortable with the approach. I
>>>>>> don't see a good reason to move away from the idea of fdt_addr_t and
>>>>>> fdt_addr_t being set correctly for the platform. Or maybe I
>>>>>> misunderstand the problem the patch was trying to fix. As I said it
>>>>>> did not have a commit message, so who knows :-)
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> The size of fdt_addr_t isn't relevant *at all* when parsing DT. The
>>>>> only
>>>>> thing that matters is #address-cells/#size-cells. Those properties are
>>>>> what
>>>>> tell the parsing code how many bytes to read from the reg property.
>>>>> Whether
>>>>> the resultant value fits into the code's internal representation is an
>>>>> internal detail of the code, not part of the semantics of DT itself or
>>>>> how
>>>>> to parse it.
>>>>>
>>>>> If code assumes that #address-cells == sizeof(fdt_addr_t), it is indeed
>>>>> quite likely that everything will just happen to work most of the time.
>>>>> However, this is purely an accident and not something that anything
>>>>> should
>>>>> rely upon.
>>>>>
>>>>> (I think Tegra210 support still has CONFIG_PHYS_64BIT undefined which
>>>>> is
>>>>> admittedly a little /unexpected/ for a 64-bit U-Boot build, but in
>>>>> practice
>>>>> is perfectly /legal/ and will work out just fine since, except perhaps
>>>>> for
>>>>> RAM sizes, I don't believe any value in DT will actually require more
>>>>> than
>>>>> 32-bits to represent)
>>>>
>>>>
>>>>
>>>> I would like to have the types match the platform where possible. At
>>>> least the types should not be smaller than the platform - e.g. if
>>>> CONFIG_PHYS_64BIT is not defined we should not support 64-bit
>>>
>>>
>>>
>>> In general, there's no "should not" here; we "cannot". If there's a
>>> 64-bit
>>> value in the DT (with bits above bit 31 set), then it can't be stored in
>>> a
>>> 32-bit variable.
>>>
>>> That said, if a DT has #address-cells=<2>, but the actual values stored
>>> in
>>> all reg values have 0 for the MS word, that'll actually work just fine.
>>> Note
>>> that it's 100% legal to set #address-cells=<100> and just write a bunch
>>> of
>>> extra zeros into the property. Silly perhaps, but perfectly legal. Since
>>> the
>>> function should adapt to whatever #address-cells value is in the DT,
>>> supporting that case isn't any more work, so there's no reason we
>>> shouldn't.
>>>
>>>> address/size in the device tree. This is for efficiency. We don't want
>>>> to force all the U-Boot code to 64-bit suddenly. This will bloat
>>>> things for no benefit.
>>>
>>>
>>>
>>> We could and likely should set CONFIG_PHYS_64BIT for Tegra210. However,
>>> that's unrelated to using the correct algorithm to parse DT.
>>>
>>>>>>> P.S. What is the issue with SPI flash? The commit description doesn't
>>>>>>> mention this at all.
>>>>>>
>>>>>>
>>>>>>
>>>>>> It calls that function expecting it to pick up an address and size
>>>>>> from two consecutive cells. With this patch, that fails (unless the
>>>>>> property happens to be "reg").
>>>
>>>
>>> ...
>>>>>
>>>>>
>>>>> I think this is all stemming from drivers/mtd/spi/sf_probe.c
>>>>> spi_flash_decode_fdt()'s call to fdtdec_get_addr_size() for property
>>>>> "memory-map"? If so, then looking at arch/x86/dts/minnowmax.dts's /spi
>>>>> node,
>>>>> the code and DT content are clearly inconsistent; For this node
>>>>> #address-cells=<1>, #size-cells=<0> which makes sense given that the
>>>>> address
>>>>> is a chip-select and hence has no size. So, the code should not assume
>>>>> that
>>>>> the memory-map property can be parsed in the same way as a reg
>>>>> property.
>>>>>
>>>>> I note that the memory-map property doesn't exist in the Linux kernel's
>>>>> DT
>>>>> binding documentation database, or code, hence hasn't been reviewed by
>>>>> the
>>>>> DT binding maintainers.
>>>>
>>>>
>>>>
>>>> The function comment says:
>>>>
>>>>    * Look up an address property in a node and return it as an address.
>>>>    * The property must hold one address with a length. This is only
>>>> tested
>>>>    * on 32-bit machines.
>>>
>>>
>>>
>>> And Thierry fixed it for systems with #address-cells > 1. Perhaps that
>>> part
>>> of the function comment should have been removed in the commit.
>>>
>>>> My intention was that this would be an efficient way to decode an
>>>> address and size from a device tree. To some extent regmap may take
>>>> over this role (IMO we should turn to drop fdtdec one day one we have
>>>> more infrastructure). But I'd like it to work efficiently for 32-bit
>>>> machines. The new function could hardly be less efficient.
>>>
>>>
>>>
>>> Again, there's no way in general to make it more efficient. The
>>> efficiency
>>> issue is directly implied by the DT data structures.
>>>
>>> In the special case where the parent node is already known, we can
>>> certainly
>>> introduce an alternate function that is more efficient. You've already
>>> asked
>>> for that, and as I said, it's in my TODO list.
>>>
>>>> I think we should consider the case where the physical address size of
>>>> U-Boot and the device tree do not match as a corner case. I certainly
>>>> don't want device tree to add loads of pointless code for 'normal'
>>>> platforms.
>>>
>>>
>>>
>>> It's not a corner case. It's a fundamental part of the DT schema. If
>>> U-Boot
>>> is going to use DT, it should actually use *DT*, not some-very
>>> similar-but-not-quite-DT format with all kinds of implicit and unstated
>>> exceptions, limitations, and assumptions. This implies fully honoring all
>>> aspects of how DT works when parsing it, not picking and choosing
>>> features
>>> because some are inconvenient or annoying. If U-Boot doesn't want to
>>> correctly implement DT support, it should just drop it completely.
>>>
>>> As an aside, when instantiating devices, I hope one day that U-Boot will
>>> parse DT top-down in a hierarchical way, rather than simply searching for
>>> any node anywhere in the DT without regard for whether any parent node is
>>> enabled, has a driver, has had the driver initialize, etc. U-Boot ignores
>>> this right now, and is only getting away with this accidentally. Without
>>> hacky workarounds in drivers, this won't continue to work for all Tegra
>>> HW
>>> (e.g. host1x graphics/display sub-modules, AHUB's audio-related
>>> sub-modules), since parent drivers must initialize before child drivers
>>> in
>>> order to enable various register buses, including clocks/resets affecting
>>> those buses etc.
>>>
>>> Again, if it's simply too inconvenient or bloated to implement DT
>>> properly
>>> in U-Boot, let's just drop it entirely. A halfway solution is the worst
>>> of
>>> both worlds. I'm not convinced the full implications of how to (and the
>>> need
>>> to) correctly and fully support DT have were well thought through before
>>> (control) DT support was added to U-Boot.
>>>
>>>> Re memory-map, yes it doesn't seem to be possible to do what it is
>>>> trying to do (and Thierry says the same below). It is quite weird to
>>>> have a SPI peripheral which is also memory mapped.
>>>>
>>>> Here's my question - if you fix the CONFIG_PHYS_64BIT problem with
>>>> 64-bit Tegra, what is actually wrong with the way the function was?
>>>
>>>
>>>
>>> With the DT files now checked into U-Boot, I think it would accidentally
>>> work, since we just happen to have set #address-cells=<2>,
>>> #size-cells=<2>,
>>> and that would just happen to match sizeof(fdt_addr_t).
>>>
>>> However note this is an accident on a couple of levels:
>>>
>>> a) This is because the code assumes that sizeof(fdt_addr_t) ==
>>> #address-cells * 4. This is an invalid assumption since it does not
>>> correctly honor the DT schema. It hard-codes the size of values whereas
>>> DT
>>> schema says the size is defined by the #xxx-cells properties.
>>>
>>> b) The original Tegra210 DTs that TomW posted had #address-cells=<1>,
>>> #size-cells=<1>. I asked him to change that to match what I expected to
>>> be
>>> in the Linux kernel's Tegra210 DTs. However, if he'd rejected my request
>>> or
>>> I hadn't noticed that, then with CONFIG_PHYS_64BIT set,
>>> fdtdec_get_addr_size() would have attempted to read twice as many bytes
>>> as
>>> it should have from the property. It's entirely plausible that someone
>>> could
>>> have come along later and realized CONFIG_PHYS_64BIT was set incorrectly
>>> and
>>> enabled it.
>>>
>>>> This is a boot loader so we should be willing to make some
>>>> simplifications.
>>>
>>>
>>>
>>> When dealing with internal bootloader details, sure assumptions,
>>> simplifications, etc. can be made.
>>>
>>> However, DT is an externally defined standard. The content of DT must be
>>> identical across all OSs (SW stacks, bootloader) and not influenced by
>>> requirements/... of any specific individual OS's (SW stack, bootloader)
>>> quirks. We can't just pick and choose which parts of it we care about.
>>> Well,
>>> perhaps if we stop calling it DT we could.
>>
>>
>> So I think in summary:
>>
>> - 64-bit machines should have CONFIG_PHYS_64BIT set correctly
>
>
> It turns out that arch/arm/include/asm/config.h already enables this for all
> ARM64 platforms.
>
> As such, we can in fact go ahead with reverting this patch, and U-Boot will
> still function on Tegra210 boards.
>
> In the short term, I think that means TomR should just apply this revert
> patch, and we don't need to send any additional patches. In the slightly
> longer term, we should add some comments to fdtdec_get_addr_size()
> describing its problems, and slightly longer term, add back Thierry's patch,
> but in a way that lets callers specify whether #address-cells/#size-cells
> should be used, or whether caller-supplied hard-coded values should be used.

OK great. But I see your new patch so I think we can apply both at the
same time.

>
> I apologize for not noticing this earlier; I'd assumed that since none of
> the Tegra-specific files in include/configs/ set this flag, nor any of the
> Tegra-specific Kconfig files, it wasn't set, and hence a revert of the patch
> would break Tegra210 support.

No problem - I assumed it would also.

>
>> - then fdtdec_get_addr_size() would work as expected
>
>
> I take issue with *works* as expected", since I would expect the function to
> implement DT parsing rules completely.
>
> However, it's certainly true to say that it will generate the desired
> results, even if it doesn't /work/ (isn't implemented) as expected.
>
> (I suppose this depends on whether you're talking about "works" w.r.t to
> correctness of the returned results or side-effects, or w.r.t. how the
> internal implementation works.)
>
>> - I want to make this cases as efficient as possible since it will be
>> called in SPL
>> - You are concerned that making assumptions like this violates the DT spec
>>
>> One option is to split the functions into two - one that works in SPL
>> and makes assumptions, and one that does not and does things the hard
>> way.
>
>
> Why should SPL be any different? U-Boot should either parse DT correctly or
> not at all. SPL shouldn't be a special case. Admittedly SPL on Tegra does
> very little (and isn't even present on Tegra210), but in general, can't
> someone use storage drivers, filesystems, etc. in SPL to choose the next
> stage to load, read GPIOs or other data sources to select between different
> boot paths, perhaps even interact with a user? If so, then assuming that SPL
> can somehow implements a reduced set of features, and hence can make
> assumptions that non-SPL can't, seems quite dangerous. We could only do that
> if we put some active checks in the U-Boot makefiles to ensure that nobody
> enabled anything in the SPL besides a set of strictly audited features.

Yes we could do that and it would avoid pain later. I suppose SPL on
Tegra is a bit of a special case since there is really no size limit.
For some chips the limits are pretty severe so I am quite sensitive to
code size even at the expense of extra debug time when something
breaks.

>
>> I suppose we could also add checks/warnings that the DT is
>>
>> 'well-formed' and that the address size matches the machine it is
>> running on.
>
>
> Yes, we certainly should do that.
>
>> In any case we do need to get rid of the parent lookup in this
>> function. So can either you or Thierry submit a patch to do this? The
>> parent should instead be a parameter to the function. You may need to
>> create a stub function which looks it up before calling
>> fdtdec_get_addr_size().
>
>
> Yes, we can't remove the parent lookup in all cases. However, we can avoid
> it in the cases where the caller can supply it. I think that's most, but not
> quite all.

My main concern is dev_get_addr() since that is the official way of
reading a device address now. See also regmap_init_mem() which does
things its own way.

>
>
>> I'll see how things look in SPL.
>>
>> Regards,
>> Simon
>>
>

Regards,
Simon


More information about the U-Boot mailing list