[U-Boot] [RFC PATCH 03/11] SPL: FIT: factor out spl_load_fit_image()

Kever Yang kever.yang at rock-chips.com
Mon Jan 23 03:37:36 CET 2017


Hi Andre,

On 01/22/2017 06:42 PM, André Przywara wrote:
> On 22/01/17 07:16, Kever Yang wrote:
>> Hi Andre,
>>
>> On 01/20/2017 09:53 AM, Andre Przywara wrote:
>>> At the moment we load two images from a FIT image: the actual U-Boot
>>> image and the DTB. Both times we have very similar code to deal with
>>> alignment requirement the media we load from imposes upon us.
>>> Factor out this code into a new function, which we just call twice.
>>>
>>> Signed-off-by: Andre Przywara <andre.przywara at arm.com>
>>> ---
>>>    common/spl/spl_fit.c | 122
>>> +++++++++++++++++++++------------------------------
>>>    1 file changed, 51 insertions(+), 71 deletions(-)
>>>
>>> diff --git a/common/spl/spl_fit.c b/common/spl/spl_fit.c
>>> index 381ed1f..d4149c5 100644
>>> --- a/common/spl/spl_fit.c
>>> +++ b/common/spl/spl_fit.c
>>> @@ -138,19 +138,58 @@ static int get_aligned_image_size(struct
>>> spl_load_info *info, int data_size,
>>>        return (data_size + info->bl_len - 1) / info->bl_len;
>>>    }
>>>    +static int spl_load_fit_image(struct spl_load_info *info, ulong
>>> sector,
>>> +                  void *fit, ulong base_offset, int node,
>>> +                  struct spl_image_info *image_info)
>>> +{
>>> +    ulong offset;
>>> +    size_t length;
>>> +    ulong load, entry;
>>> +    void *src;
>>> +    ulong overhead;
>>> +    int nr_sectors;
>>> +
>>> +    offset = fdt_getprop_u32(fit, node, "data-offset") + base_offset;
>>> +    length = fdt_getprop_u32(fit, node, "data-size");
>>> +    load = fdt_getprop_u32(fit, node, "load");
>>> +    if (load == -1U && image_info)
>>> +        load = image_info->load_addr;
> 	    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>> +    entry = fdt_getprop_u32(fit, node, "entry");
>>> +
>>> +    overhead = get_aligned_image_overhead(info, offset);
>>> +    nr_sectors = get_aligned_image_size(info, overhead + length,
>>> offset);
>>> +
>>> +    if (info->read(info, sector + get_aligned_image_offset(info,
>>> offset),
>>> +               nr_sectors, (void*)load) != nr_sectors)
>>> +        return -EIO;
>>> +    debug("image: dst=%lx, offset=%lx, size=%lx\n", load, offset,
>>> +          (unsigned long)length);
>>> +
>>> +    src = (void *)load + overhead;
>>> +#ifdef CONFIG_SPL_FIT_IMAGE_POST_PROCESS
>>> +    board_fit_image_post_process(&src, &length);
>>> +#endif
>>> +
>>> +    memcpy((void*)load, src, length);
>> For FIT image, we read the first block for header, and then we know the
>> offset
>> and size for images, is it possible to have an option that we can read
>> the image
>> data to load address directly without memcpy(), this helps speed up the
>> boot time.
> We do read directly into the load address (see the marked place above).
> But we have to fix up the alignment:
> a) The load address may not be cache line aligned. I am not totally
> convinced we really need this, but the code takes care of this.
> Shouldn't be an issue in practice since most load addresses are already
> aligned.
> b) More importantly we need to observe the block size of the device we
> read from. mkimage packs the images as tightly as possible into the
> image, so ATF may start at say offset 0x456 within the FIT image. Given
> that the FIT image itself starts at a sector boundary, that may be "in
> the middle" of a sector, which is the smallest addressable unit for
> block devices. For MMC based storage for instance this is typically 512
> bytes. So we read the whole sector and later discard the overhead by
> copying the image back in memory.
>
> Questions to you:
> 1) Is the memcpy really an issue? I take it you use bl31.bin only, which
> is probably between 32KB and 64KB. So does the memcpy really contribute
> to boot time here?
> 2) What device do you usually read from? What is the alignment there?
> For instance SPI flash has typically an alignment of 1 Byte, so no
> adjustment is actually needed. The memcpy routine bails out if dst==src,
> so in this case there are no cycles spend on this.
> 3) Can you tweak mkimage to observe alignment? As I said above, in the
> moment it packs it tightly, but I don't see why we couldn't align
> everything in the image at the cost of loosing 511 Bytes per embedded
> image in the worst case, for instance. This would eventually fold into
> the same point as question 2): If everything is aligned already, memcpy
> is a NOP.
>
> So one relatively easy and clean solution would be to introduce an
> alignment command line option to mkimage. One could optionally specify a
> value, to which mkimage would align each embedded images into, leaving
> gaps in between. We would still go over all the calculations in the SPL
> code, but all the offsets and corrections would end up being 0, so no
> copying would be necessary anymore.

This looks good to me, I didn't notice that there is no copy if dst==src.

Reviewed-by: Kever Yang<kever.yang at rock-chips.com>

Thanks,
- Kever
>
> Does this sound like a plan?
>
> Cheers,
> Andre.
>
>>> +
>>> +    if (image_info) {
>>> +        image_info->load_addr = load;
>>> +        image_info->size = length;
>>> +        image_info->entry_point = entry;
>>> +    }
>>> +
>>> +    return 0;
>>> +}
>>> +
>>>    int spl_load_simple_fit(struct spl_image_info *spl_image,
>>>                struct spl_load_info *info, ulong sector, void *fit)
>>>    {
>>>        int sectors;
>>> -    ulong size, load;
>>> +    ulong size;
>>>        unsigned long count;
>>> +    struct spl_image_info image_info;
>>>        int node, images;
>>> -    void *load_ptr;
>>> -    int fdt_offset, fdt_len;
>>> -    int data_offset, data_size;
>>>        int base_offset, align_len = ARCH_DMA_MINALIGN - 1;
>>> -    int src_sector;
>>> -    void *dst, *src;
>>>          /*
>>>         * Figure out where the external images start. This is the base
>>> for the
>>> @@ -202,82 +241,23 @@ int spl_load_simple_fit(struct spl_image_info
>>> *spl_image,
>>>            return -1;
>>>        }
>>>    -    /* Get its information and set up the spl_image structure */
>>> -    data_offset = fdt_getprop_u32(fit, node, "data-offset");
>>> -    data_size = fdt_getprop_u32(fit, node, "data-size");
>>> -    load = fdt_getprop_u32(fit, node, "load");
>>> -    debug("data_offset=%x, data_size=%x\n", data_offset, data_size);
>>> -    spl_image->load_addr = load;
>>> -    spl_image->entry_point = load;
>>> +    /* Load the image and set up the spl_image structure */
>>> +    spl_load_fit_image(info, sector, fit, base_offset, node, spl_image);
>>>        spl_image->os = IH_OS_U_BOOT;
>>>    -    /*
>>> -     * Work out where to place the image. We read it so that the first
>>> -     * byte will be at 'load'. This may mean we need to load it starting
>>> -     * before then, since we can only read whole blocks.
>>> -     */
>>> -    data_offset += base_offset;
>>> -    sectors = get_aligned_image_size(info, data_size, data_offset);
>>> -    load_ptr = (void *)load;
>>> -    debug("U-Boot size %x, data %p\n", data_size, load_ptr);
>>> -    dst = load_ptr;
>>> -
>>> -    /* Read the image */
>>> -    src_sector = sector + get_aligned_image_offset(info, data_offset);
>>> -    debug("Aligned image read: dst=%p, src_sector=%x, sectors=%x\n",
>>> -          dst, src_sector, sectors);
>>> -    count = info->read(info, src_sector, sectors, dst);
>>> -    if (count != sectors)
>>> -        return -EIO;
>>> -    debug("image: dst=%p, data_offset=%x, size=%x\n", dst, data_offset,
>>> -          data_size);
>>> -    src = dst + get_aligned_image_overhead(info, data_offset);
>>> -
>>> -#ifdef CONFIG_SPL_FIT_IMAGE_POST_PROCESS
>>> -    board_fit_image_post_process((void **)&src, (size_t *)&data_size);
>>> -#endif
>>> -
>>> -    memcpy(dst, src, data_size);
>>> -
>>>        /* Figure out which device tree the board wants to use */
>>>        node = spl_fit_get_image_node(fit, images, FIT_FDT_PROP, 0);
>>>        if (node < 0) {
>>>            debug("%s: cannot find FDT node\n", __func__);
>>>            return node;
>>>        }
>>> -    fdt_offset = fdt_getprop_u32(fit, node, "data-offset");
>>> -    fdt_len = fdt_getprop_u32(fit, node, "data-size");
>>>          /*
>>> -     * Read the device tree and place it after the image. There may be
>>> -     * some extra data before it since we can only read entire blocks.
>>> -     * And also align the destination address to ARCH_DMA_MINALIGN.
>>> +     * Read the device tree and place it after the image.
>>> +     * Align the destination address to ARCH_DMA_MINALIGN.
>>>         */
>>> -    dst = (void *)((load + data_size + align_len) & ~align_len);
>>> -    fdt_offset += base_offset;
>>> -    sectors = get_aligned_image_size(info, fdt_len, fdt_offset);
>>> -    src_sector = sector + get_aligned_image_offset(info, fdt_offset);
>>> -    count = info->read(info, src_sector, sectors, dst);
>>> -    debug("Aligned fdt read: dst %p, src_sector = %x, sectors %x\n",
>>> -          dst, src_sector, sectors);
>>> -    if (count != sectors)
>>> -        return -EIO;
>>> -
>>> -    /*
>>> -     * Copy the device tree so that it starts immediately after the
>>> image.
>>> -     * After this we will have the U-Boot image and its device tree
>>> ready
>>> -     * for us to start.
>>> -     */
>>> -    debug("fdt: dst=%p, data_offset=%x, size=%x\n", dst, fdt_offset,
>>> -          fdt_len);
>>> -    src = dst + get_aligned_image_overhead(info, fdt_offset);
>>> -    dst = load_ptr + data_size;
>>> -
>>> -#ifdef CONFIG_SPL_FIT_IMAGE_POST_PROCESS
>>> -    board_fit_image_post_process((void **)&src, (size_t *)&fdt_len);
>>> -#endif
>>> -
>>> -    memcpy(dst, src, fdt_len);
>>> +    image_info.load_addr = spl_image->load_addr + spl_image->size;
>>> +    spl_load_fit_image(info, sector, fit, base_offset, node,
>>> &image_info);
>>>          return 0;
>>>    }
>>
>
>
>




More information about the U-Boot mailing list