[U-Boot] [PATCH v0 21/20] efi_loader: hack for archs that cannot do unaligned accesses

Ard Biesheuvel ard.biesheuvel at linaro.org
Tue Aug 8 08:11:14 UTC 2017


On 8 August 2017 at 07:52, Alexander Graf <agraf at suse.de> wrote:
>
>
>> Am 07.08.2017 um 23:18 schrieb Rob Clark <robdclark at gmail.com>:
>>
>> On Mon, Aug 7, 2017 at 5:14 PM, Mark Kettenis <mark.kettenis at xs4all.nl> wrote:
>>>> From: Alexander Graf <agraf at suse.de>
>>>> Date: Mon, 7 Aug 2017 21:19:37 +0100
>>>>
>>>>> On 05.08.17 21:31, Rob Clark wrote:
>>>>>> On Sat, Aug 5, 2017 at 4:05 PM, Heinrich Schuchardt <xypron.glpk at gmx.de> wrote:
>>>>>>> On 08/05/2017 08:43 PM, Rob Clark wrote:
>>>>>>>> On Sat, Aug 5, 2017 at 1:06 PM, Rob Clark <robdclark at gmail.com> wrote:
>>>>>>>>> On Sat, Aug 5, 2017 at 12:52 PM, Heinrich Schuchardt <xypron.glpk at gmx.de> wrote:
>>>>>>>>>> On 08/05/2017 06:16 PM, Rob Clark wrote:
>>>>>>>>>>> On Sat, Aug 5, 2017 at 12:12 PM, Heinrich Schuchardt <xypron.glpk at gmx.de> wrote:
>>>>>>>>>>>> On 08/05/2017 05:58 PM, Rob Clark wrote:
>>>>>>>>>>>> Some arch's have trouble with unaligned accesses.  Technically
>>>>>>>>>>>> EFI device-path structs should be byte aligned, and the next node
>>>>>>>>>>>> in the path starts immediately after the previous.  Meaning that
>>>>>>>>>>>> a pointer to an 'struct efi_device_path' is not necessarily word
>>>>>>>>>>>> aligned.  See section 10.3.1 in v2.7 of UEFI spec.
>>>>>>>>>>>>
>>>>>>>>>>>> This causes problems not just for u-boot, but also most/all EFI
>>>>>>>>>>>> payloads loaded by u-boot on these archs.  Fortunately the common
>>>>>>>>>>>> practice for traversing a device path is to rely on the length
>>>>>>>>>>>> field in the header, rather than the specified length of the
>>>>>>>>>>>> particular device path type+subtype.  So the EFI_DP_PAD() macro
>>>>>>>>>>>> will add the specified number of bytes to the tail of device path
>>>>>>>>>>>> structs to pad them to word alignment.
>>>>>>>>>>>>
>>>>>>>>>>>> Technically this is non-compliant, BROKEN_UNALIGNED should *only*
>>>>>>>>>>>> be defined on archs that cannot do unaligned accesses.
>>>>>>>>>>>>
>>>>>>>>>>>> Signed-off-by: Rob Clark <robdclark at gmail.com>
>>>>>>>>>>>> ---
>>>>>>>>>>>> I'm not sure if there are other arch's that need -DBROKEN_UNALIGNED
>>>>>>>>>>>>
>>>>>>>>>>>> Mark, this is untested but I think it should solve your crash on the
>>>>>>>>>>>> Banana Pi.  Could you give it a try when you get a chance?
>>>>>>>>>>>>
>>>>>>>>>>>>  arch/arm/config.mk               |  2 +-
>>>>>>>>>>>>  include/efi_api.h                | 30 ++++++++++++++++++++++++++++++
>>>>>>>>>>>>  lib/efi_loader/efi_device_path.c |  3 +++
>>>>>>>>>>>>  3 files changed, 34 insertions(+), 1 deletion(-)
>>>>>>>>>>>>
>>>>>>>>>>>> diff --git a/arch/arm/config.mk b/arch/arm/config.mk
>>>>>>>>>>>> index 1a77779db4..067dc93a9d 100644
>>>>>>>>>>>> --- a/arch/arm/config.mk
>>>>>>>>>>>> +++ b/arch/arm/config.mk
>>>>>>>>>>>> @@ -28,7 +28,7 @@ LLVMS_RELFLAGS              := $(call cc-option,-mllvm,) \
>>>>>>>>>>>>                       $(call cc-option,-arm-use-movt=0,)
>>>>>>>>>>>>  PLATFORM_RELFLAGS    += $(LLVM_RELFLAGS)
>>>>>>>>>>>>
>>>>>>>>>>>> -PLATFORM_CPPFLAGS += -D__ARM__
>>>>>>>>>>>> +PLATFORM_CPPFLAGS += -D__ARM__ -DBROKEN_UNALIGNED
>>>>>>>>>>>
>>>>>>>>>>> NAK
>>>>>>>>>>>
>>>>>>>>>>> We have more then ARM. And other architectures also create exceptions
>>>>>>>>>>> for unaligned access.
>>>>>>>>>>>
>>>>>>>>>>> I hate platform specific code. It should not be used outside /arch.
>>>>>>>>>>>
>>>>>>>>>>> To play it save you should not use _packed at all!
>>>>>>>>>>> Use memcpy to transfer between aligned and unaligned memory.
>>>>>>>>>>
>>>>>>>>>> except for reasons I explained in the thread on the patch that added
>>>>>>>>>> the __packed in the first place.  Sorry, this is ugly but we have to
>>>>>>>>>> do it.
>>>>>>>>>>
>>>>>>>>>> BR,
>>>>>>>>>> -R
>>>>>>>>>
>>>>>>>>> According to the UEFI standard the nodes in paths are not to be assumed
>>>>>>>>> to be aligned.
>>>>>>>>>
>>>>>>>>> So even if you use padding bytes in paths that you pass to the EFI
>>>>>>>>> application you should not expect that the EFI application does the
>>>>>>>>> same. Expect the EFI application either to remove them or send new nodes
>>>>>>>>> without padding.
>>>>>>>>>
>>>>>>>>> To the idea of padding bytes and __packed does not make sense.
>>>>>>>>
>>>>>>>> Ok, to be fair, you are right about device-paths passed too u-boot.
>>>>>>>> On BROKEN_UNALIGNED archs, we should sanitise with a copy to an
>>>>>>>> aligned device-path in *addition* to what I proposed.  I can make a
>>>>>>>> patch to add a helper to do this a bit later.
>>>>>>>
>>>>>>> so thinking about this a bit, I have two options in mind:
>>>>>>>
>>>>>>>  + efi_dp_sanitize() + efi_dp_free() helpers that are no-ops on
>>>>>>>    archs that can do unaligned access, but efi_dp_sanitize() always
>>>>>>>    allocates and copies on BROKEN_UNALIGNED archs and efi_dp_free()
>>>>>>>    always free's on BROKEN_UNALIGNED archs, even if the dp passed
>>>>>>>    from efi payload doesn't require it.
>>>>>>>
>>>>>>>  + efi_dp_sanitize() that is no-op on archs that can do unaligned
>>>>>>>    access but only allocates/copies when passed a device path that
>>>>>>>    would result in unaligned access, plus hook some mechanism to
>>>>>>>    auto-free on EFI_EXIT().. which is slightly tricky w/ nesting of
>>>>>>>    efi calls, but not impossible and at least avoids the problem
>>>>>>>    of missing calls to free the dup'd device-path.. which is the
>>>>>>>    sort of leak I might miss since I'm not using EFI_LOADER on any
>>>>>>>    BROKEN_UNALIGNED arch
>>>>>>>
>>>>>>> anyone who cares about armv7 + efi have a preference between always
>>>>>>> doing an extra alloc+copy+free vs EFI_EXIT() magic?
>>>>>>>
>>>>>>
>>>>>> Please, go for the simplest code.
>>>>>
>>>>> well, the source of my question was there are two definitions of
>>>>> "simplest" here, ie "simplest to use" and "simplest helpers".. I'm
>>>>> mostly worried about future patches that use the helpers introducing
>>>>> leaks.  But actually I think a reasonable middle ground is "simplest
>>>>> helpers plus always make the helpers not no-ops for DEBUG builds"..
>>>>>
>>>>>> I cannot imagine that copying takes more than 10ms for starting grub on
>>>>>> any architecture. So the user will not notice it anyway.
>>>>>> Assume every architecture requiring alignment for which you do not have
>>>>>> proof of the contrary.
>>>>>>
>>>>>> Even on ARM64 there are op codes that fail without alignment.
>>>>>
>>>>> iirc the restrictions where mostly about device memory, and atomics.
>>>>> Which should not apply here.  (And we have aarch64 servers in
>>>>> production with grub, which doesn't take any protections about
>>>>> unaligned device-path nodes, using non u-boot EFI implementations.  So
>>>>> maybe that counts as proof to the contrary ;-))
>>>>
>>>> The UEFI spec mandates that unaligned access are fixed up properly
>>>> (usually by hardware). We violate the spec on 32bit ARM, but do fulfill
>>>> it on AArch64.
>>>
>>> Actually for AArch32 the spec says:
>>>
>>>  Unaligned access should be enabled if supported; Alignment faults
>>>  are enabled otherwise.
>>
>> It is a bit strange to me that alignment faults are disabled by
>> default, but presumably is disabled out of reset..  I don't totally
>> understand the connection to having mmu disabled, unless all memory is
>> treated like device memory without mmu enabled.  (Which, also,
>> wouldn't that be super-slow?)  But anyway, I guess that is a bit
>> beside the point.
>
> It is treated as device memory with mmu disabled, yes.
>

Yes, and this is really the only sane thing that can be done, given
that the page tables not only contain VA/PA translations (which I
guess U-boot, like UEFI, does not use), but also the memory types of
the various regions. So without page tables, the CPU has no clue which
accesses it can cache (to memory) and which accesses it cannot (to
devices).

Actually, my recommendation is to always run non-trivial code (i.e., C
code) with caching enabled. In UEFI, we do rely on -mno-unaligned
access (or -mstrict-align) to ensure the C code does not get optimized
to use uncached accesses, but as Alex pointed out, this is not a
complete solution.

>>
>>> So UEFI payload that wants to support pre-ARMv7 hardware should make
>>> sure it doesn't do any unaligned access.
>>>
>>>> The reason things worked before really was just that both U-Boot and
>>>> grub were compiled with -mno-unaligned-access, so they never issued
>>>> unaligned accesses.
>>>
>>> That in itself is not enough to prevent unaligned access.  If you
>>> explicitly cast an unaligned pointer to say (uint32_t *) and then
>>> dereference it, you'll create an unaligned access.
>>
>> This is problematic around file nodes in the device path.  Adding the
>> padding bytes to the end of each device-path struct would "solve"
>> that, and if pre-aarch64 we are aiming at "good enough to work", I
>> kinda think that this the approach we should go for.  Other than
>> file-path nodes, the rest of the issues in u-boot should be solved by
>> addition of missing __packed on 'struct efi_device_path' (which I've
>> added locally and will be in the next revision of the patchset).
>
> Let's ask Leif and Ard if that is actually correct. If I remember correctly, edk2 some times leverages automatic padding from the compiler on structs.
>

I guess that that might work, if U-boot is the only agent
instantiating device path structures. But what do you mean by
'correct' in this context?

>>
>>>> Which system broke for Mark? Banana pi mostly has 32bit SBCs, so I
>>>> assume it's one of those? In that case, we would need to either
>>>>
>>>>   1) hack up the openbsd loader to also get compiled with
>>>> -mno-unaligned-access
>>>>
>>>> or
>>>>
>>>>   2) rework the 32bit arm mmu code to enable the MMU plus hardware
>>>> unaligned fixups on all systems. (that's a *lot* of work)
>>>
>>> The system that broke was Cortex-A7 based, so ARMv7.
>>>
>>> *However*, the OpenBSD bootloader does not perform any unaligned
>>> access.  *All* cases of unaligned access that I encountered while
>>> testing/debugging Rob's code were in U-Boot itself.  This was mostly a
>>> consequence from a diff in the series that added __packed to the
>>> device path node structs and then passed members of those structs to
>>> functions that accessed the contents of tose (now byte aligned
>>> structs) through (uint16_t *) or (uint32_t *) pointers.
>>
>> Yeah.. the padding byte approach I suggested would solve this.  Not
>> *really* compliant, but should work for all the payload code I've come
>> across.  We definitely shouldn't enable the padding bytes on aarch64,
>> but I think padding bytes would be a pragmatic solution for armv7 and
>> earlier.  (And hopefully someday someone enables mmu on armv7 so we
>> can turn off the hack there too.)
>
> The problem really is inconsistency. Some 32bit arm platforms enable the mmu, some don't.
>
>>
>>> There was initially some confusion because I used the pc reported by
>>> U-Boot to look up the faulting instruction whereas I should have been
>>> using the reloc_pc.
>>>
>>> I think Rob is worried about grub/shim/fallback.efi doing unaligned
>>> access of device path nodes on armv7.  But I'm not sure he has any
>>> evidence beyond looking at code.
>>
>> It will definitely be an issue (if grub/shim/fallback were not
>> compiled with -mno-unaligned-access) in some of the debug code that
>> prints out device-paths.  I haven't checked the compiler flags used w/
>> shim/fallback.
>>
>> Parsing MEDIA_DEVICE:HARD_DRIVE nodes would be an issue regardless
>> without -mno-unaligned-access, although we might be getting lucky by
>> just not hitting debug paths.  Some of the other issues would be
>> hidden by padding bytes at the end of the DP structs which would keep
>> the file-path nodes word aligned.
>>
>>>> To me personally, UEFI in 32bit ARM land is a nice to have standardized
>>>> platform, but not something where we really need to be fully standards
>>>> compliant in every aspect. I think "making things work" is good enough
>>>> there.
>>>
>>> Right.  For us UEFI us just a convenient way to re-use the U-Boot
>>> device driver code to load a kernel from a UFS/FFS filesystem.
>>>
>>>> For AArch64 things are different. There we should strive for full UEFI
>>>> compliance as it's "the future" :).
>>>
>>> Even there just making things work would be good enough for me ;).
>>> Our AArach64 bootloader is almost identical to the AArch32 one and
>>> works on a real UEFI implementation as well (SoftIron Overdrive 1000).
>>>
>>
>> I think we should make the aarch64 implementation fully compliant (ie.
>> addition of missing __packed's and no extra padding bytes).  I won't
>> loose sleep if some efi payloads don't work on pre-aarch64 (we should
>> be able to keep things that were working before working).  If adding
>> missing __packed breaks things on platforms that can't do unaligned
>> access, the solution is not to remove __packed, but to conditionally
>> add padding bytes at the end of the struct.  That way we keep things
>> working as before on the old platforms, but make things work correctly
>> on aarch64.
>>
>> I'll add back my patch for EFI_DP_PAD() to this patchset, since this
>> seems like the sensible way forward.
>
> Let's make sure we get an ack from Leif/Ard for that approach :)
>
>
> Alex
>
>


More information about the U-Boot mailing list