[U-Boot] [PATCH V3 1/3] drivers: block: add block device cache
Stephen Warren
swarren at wwwdotorg.org
Sat Apr 2 04:07:23 CEST 2016
On 04/01/2016 05:16 PM, Eric Nelson wrote:
> Hi Stephen,
>
> On 04/01/2016 03:57 PM, Stephen Warren wrote:
>> On 03/31/2016 02:24 PM, Eric Nelson wrote:
>>> On 03/30/2016 02:57 PM, Stephen Warren wrote:
>>>> On 03/30/2016 11:34 AM, Eric Nelson wrote:
>>>>> On 03/30/2016 07:36 AM, Stephen Warren wrote:
>>>>>> On 03/28/2016 11:05 AM, Eric Nelson wrote:
>
> <snip>
>
>>>>
>>>> We could allocate the data storage for the block cache at the top of RAM
>>>> before relocation, like many other things are allocated, and hence not
>>>> use malloc() for that.
>>>
>>> Hmmm. We seem to have gone from a discussion about data structures to
>>> type of allocation.
>>>
>>> I'm interested in seeing how that works. Can you provide hints about
>>> what's doing this now?
>>
>> Something like common/board_f.c:reserve_mmu() and many other functions
>> there. relocaddr starts at approximately the top of RAM, continually
>> gets adjusted down as many static allocations are reserved, and
>> eventually becomes the address that U-Boot is relocated to. Simply
>> adding another entry into init_sequence_f[] for the disk cache might work.
>>
>
> Thanks for the pointer. I'll review that when time permits.
>
> This would remove the opportunity to re-configure the cache though, right?
Well, it would make it impossible to use less RAM. One could use more by
having a mix of the initial static allocation plus some additional
dynamic allocation, but that might get a bit painful to manage.
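
To make the static reservation idea concrete, here is a minimal sketch of what such a step might look like. It is not taken from common/board_f.c: struct fake_gd, reserve_block_cache(), blkcache_addr and the BLKCACHE_* sizes are all hypothetical names and values chosen for illustration. Only the pattern (move relocaddr down by the size needed, then align it down) mirrors what the existing reserve functions do.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the global-data fields this sketch needs. */
struct fake_gd {
	uintptr_t relocaddr;     /* running "top of free RAM" pointer */
	uintptr_t blkcache_addr; /* hypothetical: start of the cache storage */
};

/* Assumed sizes, purely for illustration. */
#define BLKCACHE_ENTRIES 8u
#define BLKCACHE_BLKSZ   512u
#define BLKCACHE_ALIGN   4096u

/*
 * Carve the cache storage out of the top of RAM: lower relocaddr by the
 * size needed, align it down, and remember where the storage starts.
 * Returns 0 on success, as an init-sequence step would.
 */
static int reserve_block_cache(struct fake_gd *gd)
{
	uintptr_t size = (uintptr_t)BLKCACHE_ENTRIES * BLKCACHE_BLKSZ;

	/* The mask trick below only works for power-of-two alignments. */
	assert((BLKCACHE_ALIGN & (BLKCACHE_ALIGN - 1u)) == 0);

	gd->relocaddr -= size;
	gd->relocaddr &= ~((uintptr_t)BLKCACHE_ALIGN - 1u);
	gd->blkcache_addr = gd->relocaddr;

	return 0;
}
```

Since the size is fixed before relocation, the cache could not shrink later, which is the trade-off described above.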
It might be interesting to use the MMU more and allow de-fragmentation
of VA space. That is, assuming there's much more VA space than RAM, such
as is true on current 64-bit architectures. Then I wouldn't dislike
dynamic allocation so much:-)
> I'm not sure how important this feature is, and I think
> only time and use will tell.
>
> I'd prefer to keep that ability at least for a cycle or two so that
> I and others can test.
>
>>>>> While re-working the code, I also thought more about using an array and
>>>>> still don't see how the implementation doesn't get more complex.
>>>>>
>>>>> The key bit is that the list is implemented in MRU order so
>>>>> invalidating the oldest is trivial.
>>>>
>>>> Yes, the MRU logic would make it more complex. Is that particularly
>>>> useful, i.e. is it an intrinsic part of the speedup?
>>>
>>> It's not a question of speed with small numbers of entries. The code
>>> to handle eviction would just be more complex.
>>
>> My thought was that if the eviction algorithm wasn't important (i.e.
>> most of the speedup comes from having some (any) kind of cache, but the
>> eviction algorithm makes little difference to the gain from having the
>> cache), we could just drop MRU completely. If that's not possible, then
>> indeed a list would make implementing MRU easier.
>>
>
> How would we decide which block to discard? I haven't traced enough
> to know what algorithm(s) might be best, but I can say that there's
> a preponderance of repeated accesses to the last-accessed block,
> especially in ext4.
Perhaps just keep an index into the array, write each new cache entry at
that index, and then increment it (wrapping at the end of the array).
That's simple round-robin replacement; probably not anywhere near as
optimal as MRU/LRU though.
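
Roughly what I have in mind, as a self-contained sketch rather than a proposal for the actual patch. The rr_* names, the entry count and the block size are all made up for illustration, and it caches one block per entry to keep things short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed sizes, purely for illustration. */
#define RR_ENTRIES 4u
#define RR_BLKSZ   512u

struct rr_entry {
	bool valid;
	uint64_t lba;            /* block number held in this slot */
	uint8_t data[RR_BLKSZ];
};

struct rr_cache {
	struct rr_entry e[RR_ENTRIES];
	unsigned int next;       /* slot that the next fill overwrites */
};

/* Linear search; returns the cached data, or NULL on a miss. */
static const uint8_t *rr_lookup(const struct rr_cache *c, uint64_t lba)
{
	for (unsigned int i = 0; i < RR_ENTRIES; i++)
		if (c->e[i].valid && c->e[i].lba == lba)
			return c->e[i].data;
	return NULL;
}

/*
 * Store a block after a miss. Whatever sits in slot 'next' is evicted,
 * regardless of how recently it was used. The caller is expected to
 * call rr_lookup() first so the same block is not cached twice.
 */
static void rr_fill(struct rr_cache *c, uint64_t lba, const uint8_t *data)
{
	struct rr_entry *e;

	assert(c != NULL && data != NULL);

	e = &c->e[c->next];
	e->valid = true;
	e->lba = lba;
	memcpy(e->data, data, RR_BLKSZ);
	c->next = (c->next + 1u) % RR_ENTRIES;
}
```

With four slots, filling blocks 0 through 4 in order evicts block 0, even if block 0 was the one read most recently. That is where this loses to MRU ordering for the repeated-access pattern you describe.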