[U-Boot] [PATCH v2 10/11] efi_loader: Pass address to fs_read()

Thu Jun 21 02:02:27 UTC 2018

Hi Alex,

On 20 June 2018 at 03:31, Alexander Graf <agraf at suse.de> wrote:
> On 06/20/2018 12:03 AM, Simon Glass wrote:
>>
>> Hi Alex,
>>
>> On 14 June 2018 at 15:55, Alexander Graf <agraf at suse.de> wrote:
>>>
>>>
>>> On 14.06.18 23:35, Simon Glass wrote:
>>>>
>>>> Hi Alex,
>>>>
>>>> On 14 June 2018 at 13:51, Alexander Graf <agraf at suse.de> wrote:
>>>>>
>>>>>
>>>>> On 14.06.18 21:01, Simon Glass wrote:
>>>>>>
>>>>>> On 14 June 2018 at 12:22, Alexander Graf <agraf at suse.de> wrote:
>>>>>>>
>>>>>>> The fs_read() function wants to get an address rather than the
>>>>>>> pointer to a buffer.
>>>>>>>
>>>>>>> So let's convert the passed buffer from pointer back a the address
>>>>>>> to make efi_loader on sandbox happier.
>>>>>>>
>>>>>>> Signed-off-by: Alexander Graf <agraf at suse.de>
>>>>>>>
>>>>>>> ---
>>>>>>>
>>>>>>> v1 -> v2:
>>>>>>>
>>>>>>>    - Clarify address vs pointer
>>>>>>>    - include mapmem.h
>>>>>>> ---
>>>>>>>   lib/efi_loader/efi_file.c | 5 ++++-
>>>>>>>   1 file changed, 4 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> Reviewed-by: Simon Glass <sjg at chromium.org>
>>>>>>
>>>>> I actually think that this patch tackles the problem the wrong way
>>>>> around. I've cooked up another one that converts fs_read() and
>>>>> fs_write() to instead take a pointer - which really is what most users
>>>>> of the API want in the first place:
>>>>>
>>>>>
>>>>>
>>>>> https://github.com/agraf/u-boot/commit/eb89f036a42cea8d7aaa6d83b8ecd9d202814b0f
>>>>
>>>> Actually this is a pretty good exposition of why we don't want to use
>>>> pointers everywhere. U-Boot uses addresses all over the place. Even a
>>>> search for something as specific as "ulong addr" gets over 200
>>>> matches. If we go down this path we will end up changing a pretty
>>>> fundamental part of U-Boot, and I don't think it is a good idea to do
>>>> that. Addresses are easy to understand, can be added/subtracted
>>>> without pointer arithmetic, can be easily converted to pointers as
>>>> needed, can be 32-bits long on sandbox, etc.
>>>
>>> I tend to disagree with this. Most code in U-Boot actually consumes
>>> pointers rather than addresses.
>>
>> That might be true within individual components, but addresses are the
>> common currency in U-Boot. See for example bootm, all the
>> image-handling stuff including FIT, commands, device tree address
>> properties. Pointers are more often used at the lowest level of code.
>
>
> Yes, but all counterparts of fs_read and APIs on the same level use
> pointers. And please take a few minutes to look at its users. 90% of them
> are using pointers too.

Do you mean the callers of fs_read()? That's not the point. Of course
there will be conversion from addresses to pointers, but I am trying
to keep those at the 'edge', away from the core U-Boot code. Your
patch is basically designed to avoid a pointer conversion (which you
can't support) in the filesystem layer. Next you will be doing this in
networking, display, etc. etc. There is just no need for this change.

>
> As I mentioned elsewhere already, going from address to pointer is something
> reasonable. Going from address to pointer usually a sign of bad design ;).
> If most users of fs_read() are already using pointers, it means that either
> all of its users are wrong or fs_read() is wrong. I think the latter is the
> case.

By that logic you would have U-Boot not use addresses at all, and have
pointers everywhere. It just isn't the way the code base is today.

>
>
>>
>>>> So I think we should step back from the abyss here and stick with the
>>>> way sandbox currently deals with addresses. It works well, is pretty
>>>> easy to understand, allows debugging which is just as easy on sandbox
>>>> as other archs (since they both uses addresses until basically the
>>>> final access), the addresses are typically small values that start at
>>>> 0 which much is easier to read than (e.g.) 00007f1b58c8b008.
>>>>
>>>> Here for example is the output from starting U-Boot with debugging in
>>>> board_f.c (something I have turned on a lot when refactoring and
>>>> debugging the init sequence):
>>>>
>>>> U-Boot 2018.07-rc1-00142-g134ea86c7f-dirty (Jun 14 2018 - 15:04:04
>>>> -0600)
>>>>
>>>> DRAM:  Monitor len: 00395AB0
>>>> Ram size: 08000000
>>>> Ram top: 08000000
>>>> Reserving 3670k for U-Boot at: 07c6a000
>>>> Reserving 32776k for malloc() at: 05c68000
>>>> Reserving 120 Bytes for Board Info at: 05c67f88
>>>> Reserving 472 Bytes for Global Data at: 05c67db0
>>>> Reserving 4352 Bytes for FDT at: 05c66cb0
>>>> Reserving 0x3c8 Bytes for bootstage at: 05c668e8
>>>>
>>>> RAM Configuration:
>>>> Bank #0: 0 128 MiB
>>>>
>>>> DRAM:  128 MiB
>>>> New Stack Pointer is: 05c668d0
>>>> Copying bootstage from 00007fdb0056e038 to 00007fdb061c48f0, size 3c8
>>>> Relocation Offset is: 07c6a000
>>>> Relocating to 07c6a000, new gd at 05c67db0, sp at 05c668d0
>>>>
>>>>
>>>> If we use pointers we get things like this:
>>>>
>>>> Reserving 3670k for U-Boot at: 00007f1b58c8b008
>>>>
>>>> I just don't want to deal with 64-bit addresses when debugging things,
>>>> and there really is no point. The map_sysmem() function provides a
>>>> nice and easy way to cast an address to a pointer, and it works on all
>>>> architectures.
>>>
>>> Ok, circling back to square 1. The main issue we're facing here is that
>>> most code in U-Boot does not really understand the difference between
>>> virtual and physical addresses - it just simply assumes a 1:1 map.
>>
>> Traditionally that was true, but over the past 5 years quite a bit of
>> code has been converted to work correctly with sandbox. Also, almost
>> all *tested* code supports sandbox.
>
>
> I think the difference is between infrastructure and device code. Most
> device code is still assuming 1:1 maps, but just doesn't get executed in
> sandbox.

If you mean the drivers, then yes I agree. The drivers typically get
an address and map it to a pointer which they use from then on.

I do have some ideas about executing driver code in sandbox. We
already build a lot of drivers in sandbox, but execute only a very
few. But sandbox could absolutely provide a way to test driver code,
with suitable plumbing. Of course, we need address mapping...:-)

>
>>
>>> With sandbox, we do not have a 1:1 map anymore - any address is
>>> somewhere else than its pointer.
>>>
>>> We have a few options:
>>>
>>>    1) Deal with pointers at random addresses throughout the code
>>>
>>> I can see why you don't want this. I find ASLR generated addresses quite
>>> cumbersome to read as well.
>>
>> I'm not sure what this means.
>
>
> Address Space Layout Randomization. It's what causes these weird looking
> addresses ;).

No, I mean, I am not sure what your comment means. I know what ASLR
is, but I just don't understand your comment as a whole. What problem
are you trying to solve?

>
>>
>>>    2) Explicitly map RAM at VA 0x10000000, then do 1:1 map
>>>
>>> This would be the best of all worlds still IMHO. That way we will have
>>> easily readable addresses (that are identical in 32bit and 64bit), but
>>> can still do a 1:1 map. The only thing we need to do is to introduce a
>>> linker section at 0x10000000.
>>
>> Linker section or mmap() is essentially the same thing. The former
>> will presumably just fail to run (bus error?) if the address is not
>> available. The latter will select a similar, available address.
>
>
> No, they're inherently different things. When a process starts, its virtual
> address space is basically empty (apart from the vdso and a tiny chunk of
> ld). The linker is then responsible to ensure that all application defined
> memory regions are mapped. Only after that happened, it will map linked
> libraries (such as glibc, SDL, whatever) at random places in address space.
>
> So with a linker section, we're basically guaranteed to always have memory
> live at the same spot. With mmap() there is a good chance we're overlapping
> with something that happened to get mapped there in between.

I don't think so. The only calls to mmap() in U-Boot are in os.c so we
can control use of this <4GB address. From my limited testing with
64-bit machines I always get a huge address for things where I don't
specify a hint address.

>
>>
>>>    3) Keep converting addresses to pointers
>>>
>>> I'm afraid if we follow this path, we will always have arguments on
>>> where the correct boundaries are. If you start to enable debugging in
>>> core dm code and print out pointers to your dm objects, you will get
>>> pointer values today as well. Sooner or later we will always end up with
>>> pointers.
>>
>> The addresses should generally be converted just before use. Most APIs
>> should use addresses rather than pointers, particularly those related
>> to images, loaded data in memory, hardware peripheral addresses and
>> booting.
>
>
> I don't think that's a reasonable definition for the API requirements to be
> honest. I think addresses make a lot of sense as long as you don't want to
> touch any data inside. You basically use them like a token. Once you've gone
> past that point - so you've run map_sysmem() on that address - we should
> stay in pointer land IMHO.

Again, by that logic we would never uses addresses in U-Boot.
Everything would be a pointer. Addresses have a number of advantages
over pointers:

- easy to do arithmetic
- don't need casts
- easy to printf() without getting massive 16-digit hex strings all the time
- can be 32-bits wide in sandbox even on a 64-bit machine

Probably some others...you yourself said you cannot do pointer
arithmetic without casting to a uintptr_t first. I don't think that is
true, but it is indicative of the messiness of dealing with pointers.

Regards,
Simon