[U-Boot] [PATCH V3 1/3] drivers: block: add block device cache

Eric Nelson eric at nelint.com
Sat Apr 2 01:16:42 CEST 2016


Hi Stephen,

On 04/01/2016 03:57 PM, Stephen Warren wrote:
> On 03/31/2016 02:24 PM, Eric Nelson wrote:
>> On 03/30/2016 02:57 PM, Stephen Warren wrote:
>>> On 03/30/2016 11:34 AM, Eric Nelson wrote:
>>>> On 03/30/2016 07:36 AM, Stephen Warren wrote:
>>>>> On 03/28/2016 11:05 AM, Eric Nelson wrote:

<snip>

>>>
>>> We could allocate the data storage for the block cache at the top of RAM
>>> before relocation, like many other things are allocated, and hence not
>>> use malloc() for that.
>>
>> Hmmm. We seem to have gone from a discussion about data structures to
>> type of allocation.
>>
>> I'm interested in seeing how that works. Can you provide hints about
>> what's doing this now?
> 
> Something like common/board_f.c:reserve_mmu() and many other functions
> there. relocaddr starts at approximately the top of RAM, continually
> gets adjusted down as many static allocations are reserved, and
> eventually becomes the address that U-Boot is relocated to. Simply
> adding another entry into init_sequence_f[] for the disk cache might work.
> 

Thanks for the pointer. I'll review that when time permits.

This would remove the opportunity to re-configure the cache though, right?

I'm not sure how important this feature is, and I think
only time and use will tell.

I'd prefer to keep that ability at least for a cycle or two so that
I and others can test.
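
For reference, the reservation pattern Stephen describes can be sketched
roughly as below. This is a standalone model, not code from board_f.c:
struct fake_gd, reserve_top(), reserve_blkcache(), blkcache_addr and the
sizes are all hypothetical names and values chosen for illustration. Only
the "subtract, then align down" treatment of relocaddr follows the
description above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant part of U-Boot's global data. */
struct fake_gd {
	uintptr_t relocaddr;     /* running top-of-RAM pointer */
	uintptr_t blkcache_addr; /* hypothetical: where cache storage lands */
};

/* Carve 'size' bytes off the top and round the result down to 'align'. */
static uintptr_t reserve_top(struct fake_gd *gd, uintptr_t size,
			     uintptr_t align)
{
	/* align must be a non-zero power of two */
	assert(align && (align & (align - 1)) == 0);
	gd->relocaddr -= size;
	gd->relocaddr &= ~(align - 1);
	return gd->relocaddr;
}

/* Hypothetical init_sequence_f[]-style hook; returns 0 on success. */
static int reserve_blkcache(struct fake_gd *gd)
{
	/* Assumed fixed geometry: 4 entries of 8 blocks of 512 bytes. */
	const uintptr_t entries = 4, blocks_per_entry = 8, blksz = 512;

	gd->blkcache_addr = reserve_top(gd,
					entries * blocks_per_entry * blksz,
					4096);
	return 0;
}
```

Because the size is fixed at the time of the reservation, a later
"blkcache configure" could only shrink the cache within that region, not
grow it.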

>>>> While re-working the code, I also thought more about using an array and
>>>> still don't see how the implementation doesn't get more complex.
>>>>
>>>> The key bit is that the list is implemented in MRU order so
>>>> invalidating the oldest is trivial.
>>>
>>> Yes, the MRU logic would make it more complex. Is that particularly
>>> useful, i.e. is it an intrinsic part of the speedup?
>>
>> It's not a question of speed with small numbers of entries. The code
>> to handle eviction would just be more complex.
> 
> My thought was that if the eviction algorithm wasn't important (i.e.
> most of the speedup comes from having some (any) kind of cache, but the
> eviction algorithm makes little difference to the gain from having the
> cache), we could just drop MRU completely. If that's not possible, then
> indeed a list would make implementing MRU easier.
> 

How would we decide which block to discard? I haven't traced enough
to know what algorithm(s) might be best, but I can say that there's
a preponderance of repeated accesses to the last-accessed block,
especially in ext4.

> You could still do a list with a statically allocated set of list nodes,
> especially since the length of the list is bounded.
> 

Sure. A pooled allocator (pool of free nodes) works well with
array-based allocation.

Having a fixed upper limit on the number of blocks would require
additional checking unless we just sized it for (max entries * max
blocks/entry).
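
A minimal sketch of that combination, a statically allocated pool of
nodes linked into an MRU-ordered list, might look like the following.
It is illustrative only and is not the code from the patch: struct node,
struct cache, cache_access() and MAX_ENTRIES are hypothetical, and each
node tracks just a starting block number rather than real block data.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ENTRIES 4	/* assumed upper bound on cached entries */

struct node {
	unsigned long start;		/* first block held by this entry */
	struct node *prev, *next;
};

struct cache {
	struct node pool[MAX_ENTRIES];	/* static storage, no malloc() */
	struct node *head, *tail;	/* head = most recent, tail = oldest */
	int used;			/* nodes handed out from the pool */
};

static void unlink_node(struct cache *c, struct node *n)
{
	if (n->prev)
		n->prev->next = n->next;
	else
		c->head = n->next;
	if (n->next)
		n->next->prev = n->prev;
	else
		c->tail = n->prev;
}

static void push_front(struct cache *c, struct node *n)
{
	n->prev = NULL;
	n->next = c->head;
	if (c->head)
		c->head->prev = n;
	else
		c->tail = n;
	c->head = n;
}

/* Returns 1 on a hit, 0 on a miss; either way 'start' ends up at head. */
static int cache_access(struct cache *c, unsigned long start)
{
	struct node *n;

	for (n = c->head; n; n = n->next) {
		if (n->start == start) {
			unlink_node(c, n);	/* move the hit to the front */
			push_front(c, n);
			return 1;
		}
	}
	if (c->used < MAX_ENTRIES) {
		n = &c->pool[c->used++];	/* take a fresh pool node */
	} else {
		n = c->tail;			/* pool exhausted: evict oldest */
		assert(n != NULL);
		unlink_node(c, n);
	}
	n->start = start;
	push_front(c, n);
	return 0;
}
```

The cache must start out zeroed (for example "struct cache c = {0};").
Repeated accesses to the last-accessed block are found at the head on the
first comparison, and eviction is just reuse of the tail node.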

>> Given that the command "blkcache configure 0 0" will discard all
>> cache and since both dfu and ums should properly have the cache
>> disabled, I'd like to proceed as-is with the list and heap approach.
>
> I don't understand "since both dfu and ums should properly have the
> cache disabled"; I didn't see anything that did that. Perhaps you're
> referring to the fact that writes invalidate the cache?
> 

Yes, but also that the host will cache blocks in the ums case, so
having the cache enabled there will only slow things down slightly,
through lots of memcpy()s into cache blocks that never get reused.

I think I was a bit flippant by including dfu in this statement,
since I haven't used it to access anything except SPI-NOR.

> Eventually it seems better to keep the cache enabled for at least DFU to
> a filesystem (rather than raw block device) since presumably parsing the
> directory structure to write to a file for DFU would benefit from the
> cache just like anything else.

I'm not in a position to comment about dfu.

Regards,


Eric

