[U-Boot] [PATCH] Revert "fdt: Fix fdtdec_get_addr_size() for 64-bit"

Thu Aug 6 21:03:40 CEST 2015

On 08/05/2015 05:45 PM, Simon Glass wrote:
> Hi Stephen,
>
> On 5 August 2015 at 12:22, Stephen Warren <swarren at wwwdotorg.org> wrote:
>> On 08/04/2015 10:08 PM, Simon Glass wrote:
>>>
>>> Hi Stephen,
>>>
>>> On 3 August 2015 at 12:20, Stephen Warren <swarren at wwwdotorg.org> wrote:
>>>>
>>>> On 08/03/2015 09:52 AM, Simon Glass wrote:
>>>>>
>>>>>
>>>>> Hi Stephen,
>>>>>
>>>>> On 3 August 2015 at 09:12, Stephen Warren <swarren at wwwdotorg.org> wrote:
>>>>>>
>>>>>>
>>>>>> On 08/02/2015 06:13 PM, Simon Glass wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> This reverts commit 5b34436035fc862b5e8d0d2c3eab74ba36f1a7f4.
>>>>>>>
>>>>>>> This function has a few problems. It calls fdt_parent_offset() which
>>>>>>> as
>>>>>>> mentioned in code review is very slow.
>>>>>>>
>>>>>>> https://patchwork.ozlabs.org/patch/499482/
>>>>>>> https://patchwork.ozlabs.org/patch/452604/
>>>>>>>
>>>>>>> It also happens to break SPI flash on Minnowboard max which is how I
>>>>>>> noticed
>>>>>>> that this was applied. I can send a patch to tidy that up, but in any
>>>>>>> case
>>>>>>> I think we should consider a revert until the function is better
>>>>>>> implemented.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> The fact that the function needs to perform a slow operation is not a
>>>>>> good
>>>>>> reason for a revert. The slowness of the operation is just a matter of
>>>>>> fact
>>>>>> with DT not having uplinks in its data structure, and U-Boot using
>>>>>> those
>>>>>> data structures directly.
>>>>>>
>>>>>> You'd requested during review that I additionally implement a faster
>>>>>> version
>>>>>> of the function in the case where the parent node is already known, and
>>>>>> said
>>>>>> it was fine if that happened in a later patch. I have this on my TODO
>>>>>> list,
>>>>>> but it's only been a couple of days.
>>>>>
>>>>>
>>>>>
>>>>> I didn't expect this to go to mainline before your new patch.
>>>>
>>>>
>>>>
>>>> I didn't get that message from the thread; you wrote:
>>>>
>>>> Simon Glass wrote:
>>>>>
>>>>> Stephen Warren wrote:
>>>>>>
>>>>>> Simon Glass wrote:
>>>>>>>
>>>>>>> Also, how about (in addition) a
>>>>>>> version of this function that works for devices? Like:
>>>>>>>
>>>>>>> device_get_addr_size(struct udevice *dev, ...)
>>>>>>>
>>>>>>> so that it can handle this for you.
>>>>>>
>>>>>>
>>>>>> That sounds like a separate patch?
>>>>>
>>>>>
>>>>> Yes, but I think we should get it in there so that people don't start
>>>>> using this (wildly inefficient) function when they don't need to. I
>>>>> think by passing in the parent node we force people to think about the
>>>>> cost.
>>>>>
>>>>> Yes the driver model function can be a separate patch, but let's get
>>>>> it in at about the same time. We have dev_get_addr() so could have
>>>>> dev_get_addr_size().
>>>>
>>>>
>>>> That sounds to like you'd like a followup patch soon after this patch (my
>>>> assumption would be within a few days or weeks) to solve your comments,
>>>> not
>>>> that you wanted a replacement patch.
>>>
>>>
>>> I will take that feedback to be a little more forceful in future, sorry.
>>>
>>>>
>>>>>> Why not just fix the bug since you said you could? That seems far
>>>>>> better
>>>>>> than breaking the newly added Tegra210 support.
>>>>>
>>>>>
>>>>>
>>>>> I do have a patch but I'm not 100% comfortable with the approach. I
>>>>> don't see a good reason to move away from the idea of fdt_addr_t and
>>>>> fdt_addr_t being set correctly for the platform. Or maybe I
>>>>> misunderstand the problem the patch was trying to fix. As I said it
>>>>> did not have a commit message, so who knows :-)
>>>>
>>>>
>>>>
>>>> The size of fdt_addr_t isn't relevant *at all* when parsing DT. The only
>>>> thing that matters is #address-cells/#size-cells. Those properties are
>>>> what
>>>> tell the parsing code how many bytes to read from the reg property.
>>>> Whether
>>>> the resultant value fits into the code's internal representation is an
>>>> internal detail of the code, not part of the semantics of DT itself or
>>>> how
>>>> to parse it.
>>>>
>>>> If code assumes that #address-cells == sizeof(fdt_addr_t), it is indeed
>>>> quite likely that everything will just happen to work most of the time.
>>>> However, this is purely an accident and not something that anything
>>>> should
>>>> rely upon.
>>>>
>>>> (I think Tegra210 support still has CONFIG_PHYS_64BIT undefined which is
>>>> admittedly a little /unexpected/ for a 64-bit U-Boot build, but in
>>>> practice
>>>> is perfectly /legal/ and will work out just fine since, except perhaps
>>>> for
>>>> RAM sizes, I don't believe any value in DT will actually require more
>>>> than
>>>> 32-bits to represent)
>>>
>>>
>>> I would like to have the types match the platform where possible. At
>>> least the types should not be smaller than the platform - e.g. if
>>> CONFIG_PHYS_64BIT is not defined we should not support 64-bit
>>
>>
>> In general, there's no "should not" here; we "cannot". If there's a 64-bit
>> value in the DT (with bits above bit 31 set), then it can't be stored in a
>> 32-bit variable.
>>
>> That said, if a DT has #address-cells=<2>, but the actual values stored in
>> all reg values have 0 for the MS word, that'll actually work just fine. Note
>> that it's 100% legal to set #address-cells=<100> and just write a bunch of
>> extra zeros into the property. Silly perhaps, but perfectly legal. Since the
>> function should adapt to whatever #address-cells value is in the DT,
>> supporting that case isn't any more work, so there's no reason we shouldn't.
>>
>>> address/size in the device tree. This is for efficiency. We don't want
>>> to force all the U-Boot code to 64-bit suddenly. This will bloat
>>> things for no benefit.
>>
>>
>> We could and likely should set CONFIG_PHYS_64BIT for Tegra210. However,
>> that's unrelated to using the correct algorithm to parse DT.
>>
>>>>>> P.S. What is the issue with SPI flash? The commit description doesn't
>>>>>> mention this at all.
>>>>>
>>>>>
>>>>> It calls that function expecting it to pick up an address and size
>>>>> from two consecutive cells. With this patch, that fails (unless the
>>>>> property happens to be "reg").
>>
>> ...
>>>>
>>>> I think this is all stemming from drivers/mtd/spi/sf_probe.c
>>>> spi_flash_decode_fdt()'s call to fdtdec_get_addr_size() for property
>>>> "memory-map"? If so, then looking at arch/x86/dts/minnowmax.dts's /spi
>>>> node,
>>>> the code and DT content are clearly inconsistent; For this node
>>>> #address-cells=<1>, #size-cells=<0> which makes sense given that the
>>>> address
>>>> is a chip-select and hence has no size. So, the code should not assume
>>>> that
>>>> the memory-map property can be parsed in the same way as a reg property.
>>>>
>>>> I note that the memory-map property doesn't exist in the Linux kernel's
>>>> DT
>>>> binding documentation database, or code, hence hasn't been reviewed by
>>>> the
>>>> DT binding maintainers.
>>>
>>>
>>> The function comment says:
>>>
>>>    * Look up an address property in a node and return it as an address.
>>>    * The property must hold one address with a length. This is only tested
>>>    * on 32-bit machines.
>>
>>
>> And Thierry fixed it for systems with #address-cells > 1. Perhaps that part
>> of the function comment should have been removed in the commit.
>>
>>> My intention was that this would be an efficient way to decode an
>>> address and size from a device tree. To some extent regmap may take
>>> over this role (IMO we should turn to drop fdtdec one day one we have
>>> more infrastructure). But I'd like it to work efficiently for 32-bit
>>> machines. The new function could hardly be less efficient.
>>
>>
>> Again, there's no way in general to make it more efficient. The efficiency
>> issue is directly implied by the DT data structures.
>>
>> In the special case where the parent node is already known, we can certainly
>> introduce an alternate function that is more efficient. You've already asked
>> for that, and as I said, it's in my TODO list.
>>
>>> I think we should consider the case where the physical address size of
>>> U-Boot and the device tree do not match as a corner case. I certainly
>>> don't want device tree to add loads of pointless code for 'normal'
>>> platforms.
>>
>>
>> It's not a corner case. It's a fundamental part of the DT schema. If U-Boot
>> is going to use DT, it should actually use *DT*, not some-very
>> similar-but-not-quite-DT format with all kinds of implicit and unstated
>> exceptions, limitations, and assumptions. This implies fully honoring all
>> aspects of how DT works when parsing it, not picking and choosing features
>> because some are inconvenient or annoying. If U-Boot doesn't want to
>> correctly implement DT support, it should just drop it completely.
>>
>> As an aside, when instantiating devices, I hope one day that U-Boot will
>> parse DT top-down in a hierarchical way, rather than simply searching for
>> any node anywhere in the DT without regard for whether any parent node is
>> enabled, has a driver, has had the driver initialize, etc. U-Boot ignores
>> this right now, and is only getting away with this accidentally. Without
>> hacky workarounds in drivers, this won't continue to work for all Tegra HW
>> (e.g. host1x graphics/display sub-modules, AHUB's audio-related
>> sub-modules), since parent drivers must initialize before child drivers in
>> order to enable various register buses, including clocks/resets affecting
>> those buses etc.
>>
>> Again, if it's simply too inconvenient or bloated to implement DT properly
>> in U-Boot, let's just drop it entirely. A halfway solution is the worst of
>> both worlds. I'm not convinced the full implications of how to (and the need
>> to) correctly and fully support DT have were well thought through before
>> (control) DT support was added to U-Boot.
>>
>>> Re memory-map, yes it doesn't seem to be possible to do what it is
>>> trying to do (and Thierry says the same below). It is quite weird to
>>> have a SPI peripheral which is also memory mapped.
>>>
>>> Here's my question - if you fix the CONFIG_PHYS_64BIT problem with
>>> 64-bit Tegra, what is actually wrong with the way the function was?
>>
>>
>> With the DT files now checked into U-Boot, I think it would accidentally
>> work, since we just happen to have set #address-cells=<2>, #size-cells=<2>,
>> and that would just happen to match sizeof(fdt_addr_t).
>>
>> However note this is an accident on a couple of levels:
>>
>> a) This is because the code assumes that sizeof(fdt_addr_t) ==
>> #address-cells * 4. This is an invalid assumption since it does not
>> correctly honor the DT schema. It hard-codes the size of values whereas DT
>> schema says the size is defined by the #xxx-cells properties.
>>
>> b) The original Tegra210 DTs that TomW posted had #address-cells=<1>,
>> #size-cells=<1>. I asked him to change that to match what I expected to be
>> in the Linux kernel's Tegra210 DTs. However, if he'd rejected my request or
>> I hadn't noticed that, then with CONFIG_PHYS_64BIT set,
>> fdtdec_get_addr_size() would have attempted to read twice as many bytes as
>> it should have from the property. It's entirely plausible that someone could
>> have come along later and realized CONFIG_PHYS_64BIT was set incorrectly and
>> enabled it.
>>
>>> This is a boot loader so we should be willing to make some
>>> simplifications.
>>
>>
>> When dealing with internal bootloader details, sure assumptions,
>> simplifications, etc. can be made.
>>
>> However, DT is an externally defined standard. The content of DT must be
>> identical across all OSs (SW stacks, bootloader) and not influenced by
>> requirements/... of any specific individual OS's (SW stack, bootloader)
>> quirks. We can't just pick and choose which parts of it we care about. Well,
>> perhaps if we stop calling it DT we could.
>
> So I think in summary:
>
> - 64-bit machines should have CONFIG_PHYS_64BIT set correctly

It turns out that arch/arm/include/asm/config.h already enables this for 
all ARM64 platforms.

As such, we can in fact go ahead with reverting this patch, and U-Boot 
will still function on Tegra210 boards.

In the short term, I think that means TomR should just apply this revert 
patch, and we don't need to send any additional patches. In the slightly 
longer term, we should add some comments to fdtdec_get_addr_size() 
describing its problems, and slightly longer term, add back Thierry's 
patch, but in a way that lets callers specify whether 
#address-cells/#size-cells should be used, or whether caller-supplied 
hard-coded values should be used.

I apologize for not noticing this earlier; I'd assumed that since none 
of the Tegra-specific files in include/configs/ set this flag, nor any 
of the Tegra-specific Kconfig files, it wasn't set, and hence a revert 
of the patch would break Tegra210 support.

> - then fdtdec_get_addr_size() would work as expected

I take issue with *works* as expected", since I would expect the 
function to implement DT parsing rules completely.

However, it's certainly true to say that it will generate the desired 
results, even if it doesn't /work/ (isn't implemented) as expected.

(I suppose this depends on whether you're talking about "works" w.r.t to 
correctness of the returned results or side-effects, or w.r.t. how the 
internal implementation works.)

> - I want to make this cases as efficient as possible since it will be
> called in SPL
> - You are concerned that making assumptions like this violates the DT spec
>
> One option is to split the functions into two - one that works in SPL
> and makes assumptions, and one that does not and does things the hard
> way.

Why should SPL be any different? U-Boot should either parse DT correctly 
or not at all. SPL shouldn't be a special case. Admittedly SPL on Tegra 
does very little (and isn't even present on Tegra210), but in general, 
can't someone use storage drivers, filesystems, etc. in SPL to choose 
the next stage to load, read GPIOs or other data sources to select 
between different boot paths, perhaps even interact with a user? If so, 
then assuming that SPL can somehow implements a reduced set of features, 
and hence can make assumptions that non-SPL can't, seems quite 
dangerous. We could only do that if we put some active checks in the 
U-Boot makefiles to ensure that nobody enabled anything in the SPL 
besides a set of strictly audited features.

 > I suppose we could also add checks/warnings that the DT is
> 'well-formed' and that the address size matches the machine it is
> running on.

Yes, we certainly should do that.

> In any case we do need to get rid of the parent lookup in this
> function. So can either you or Thierry submit a patch to do this? The
> parent should instead be a parameter to the function. You may need to
> create a stub function which looks it up before calling
> fdtdec_get_addr_size().

Yes, we can't remove the parent lookup in all cases. However, we can 
avoid it in the cases where the caller can supply it. I think that's 
most, but not quite all.

> I'll see how things look in SPL.
>
> Regards,
> Simon
>