[U-Boot] [RFC V2 PATCH 1/3] drivers: block: add block device cache

Eric Nelson eric at nelint.com
Wed Mar 23 18:43:50 CET 2016


Hi Stephen,

Thanks again for the detailed review.

On 03/23/2016 10:22 AM, Stephen Warren wrote:
> On 03/20/2016 07:45 PM, Eric Nelson wrote:
>> Add a block device cache to speed up repeated reads of block devices by
>> various filesystems.
>>
>> This small amount of cache can dramatically speed up filesystem
>> operations by skipping repeated reads of common areas of a block
>> device (typically directory structures).
>>
>> This has shown to have some benefit on FAT filesystem operations of
>> loading a kernel and RAM disk, but more dramatic benefits on ext4
>> filesystems when the kernel and/or RAM disk are spread across
>> multiple extent header structures as described in commit fc0fc50.
>>
>> The cache is implemented through a minimal list (block_cache) maintained
>> in most-recently-used order and count of the current number of entries
>> (cache_count). It uses a maximum block count setting to prevent copies
>> of large block reads and an upper bound on the number of cached areas.
>>
>> The maximum number of entries in the cache defaults to 4 and the maximum
>> number of blocks per cache entry has a default of 1, which matches the
>> current implementation of at least FAT and ext4 that only read a single
>> block at a time.
> 
> If the maximum size of the cache is fixed and small (which judging by
> the description it is), why not use an array rather than a linked list.
> That would be far simpler to manage.
> 

You seem to agree with Marek, and I must be dain-bramaged because I
just don't see it.

>> The 'blkcache' command (enabled through CONFIG_CMD_BLOCK_CACHE) allows
>> changing these values and can be used to tune for a particular filesystem
>> layout.
> 
> Even with this dynamic adjustment, using an array still feels simpler,
> although I have't looked at the code yet.
> 
>> diff --git a/drivers/block/cache_block.c b/drivers/block/cache_block.c
> 
>> +/*
>> + * search for a set of blocks in the cache
>> + *
>> + * remove and return the node if found so it can be
>> + * added back at the start of the list and maintain MRU order)
>> + */
> 
> Splitting the remove/add feels a bit dangerous, since forgetting to
> re-do the add (or e.g. some error causing that to be skipped) could
> cause cache_count to become inconsistent.
> 
> The function name "find" is a bit misleading. cache_find_and_remove()
> would make its semantics more obvious. Or, simply put the list_del()
> somewhere else; it doesn't feel like part of a "find" operation to me.
> Or, put the entire move-to-front operation inside this function so it
> isn't split up - if so, cache_find_and_move_to_head().
>

You're right. I'll look at that for a V3 (hopefully non-RFC)
patch set.

>> +static struct block_cache_node *cache_find
>> +    (int iftype, int devnum,
> 
> Nit: the ( and first n arguments shouldn't be wrapped.
> 
>> +int cache_block_read(int iftype, int devnum,
>> +             lbaint_t start, lbaint_t blkcnt,
>> +             unsigned long blksz, void *buffer)
> 
>> +        memcpy(buffer, src, blksz*blkcnt);
> 
> Nit: Space around operators. checkpatch should catch this.
> 

Thanks.

>> +        if (trace)
>> +            printf("hit: start " LBAF ", count " LBAFU "\n",
>> +                   start, blkcnt);
> 
> I guess I'll find that trace can be adjusted at run-time somewhere later
> in this patch, but it's more typical to use debug() without the if
> statement for this. It would be awesome if debug() could be adjusted at
> run-time...
> 

I wanted to keep this as a run-time thing because it's useful in
tuning things and I'm not yet certain if/when users might need
to update the sizes.

I could wrap these with some #ifdef stuff but I'm not certain
which is the right thing to do.

>> +void cache_block_fill(int iftype, int devnum,
> ...
>> +    if (node->cache == 0) {
>> +        node->cache = malloc(bytes);
>> +        if (!node->cache) {
> 
> Nit: may as well be consistent with those NULL checks.
> 

Yep.

>> +void cache_block_invalidate(int iftype, int devnum)
> ...
>> +void cache_block_invalidate_all(void)
> 
> There's no invalidate_blocks(); I imagine that writing-to/erasing (some
> blocks of) a device must call cache_block_invalidate() which will wipe
> out even data that wasn't written.
> 

Right. I figured that this wasn't worth extra code.

The 99.99% use case for U-Boot is read-only, so optimizing this
just seems like bloat.

>> +static void dump_block_cache(void)
> ...
>> +    list_for_each_safe(entry, n, &block_cache) {
> 
> Nit: This doesn't need to be _safe since the list is not being modified.
> 

Good catch.

>> +static int do_blkcache(cmd_tbl_t *cmdtp, int flag,
>> +               int argc, char * const argv[])
>> +{
>> +    if ((argc == 1) || !strcmp("show", argv[1])) {
>> +        printf("block cache:\n"
>> +               "%u\thits\n"
>> +               "%u\tmisses\n"
>> +               "%u\tentries in cache\n"
>> +               "trace %s\n"
>> +               "max blocks/entry %lu\n"
>> +               "max entries %lu\n",
>> +               block_cache_hits, block_cache_misses, cache_count,
>> +               trace ? "on" : "off",
>> +               max_blocks_per_entry, max_cache_entries);
> 
> Nit: Let's print the value and "label" in a consistent order. I would
> suggest everything being formatted as:
> 
> "    description: %u"
> 
> so that it's indented, has a delimiter between description and value,
> and so that irrespective of the length of the converted value, the
> indentation of the other parts of the string don't change (\t doesn't
> guarantee this after a certain number length).
> 

Thanks.

> I would rather have expected "show" to dump the entries too, but I
> suppose it's fine that they're separate.
> 

While testing, I found myself mostly using 'show' to see results.

I added dump as a debug tool and almost removed it, but figured this
was still an RFC, so I left it in.

>> +    } else if ((argc == 2) && ('c' == argv[1][0])) {
> 
> Nit: the value of 'c' is always 'c', so it's pointless to test whether
> it's equal to something. The value being compared is arv[1][0], so the
> parameters to == should be swapped. I'm aware that == is mathematically
> commutative, but typical English reading of the operator is "is equal
> to" which has non-commutative implications re: what is being tested. I'm
> also aware that this construct is used to trigger compiler warnings if
> someone types = instead of ==. However, (a) you didn't and (b) compilers
> warn about this these days, so there's no need to write unusual code to
> get that feature anymore.
> 

:) muscle-memory again.

It's usually Marek that catches my yoda-expressions, which are usually
tests against zero.

> I worry that being imprecise in command-line parsing (i.e. treating both
> "crud" and "clear" as the same thing) will lead to problems expanding
> the command-line format in the future; we won't be able to ever add
> another option starting with "c" and maintain the same abbreviation
> semantics. I'd suggest requiring the full option name always.
> 
>> +    } else if ((argc == 3) && isdigit(argv[1][0]) &&
>> isdigit(argv[2][0])) {
> 
> Let's make this "blkcache set" or "blkcache configure" so that other
> sub-commands can take numeric parameters in the future, and have
> consistent command-line syntax.
> 

Sounds good to me.

>> +U_BOOT_CMD(
>> +    blkcache,    3,    0,    do_blkcache,
>> +    "block cache control/statistics",
>> +    "[show] - show statistics\n"
>> +    "blkcache c[lear] - clear statistics\n"
>> +    "blkcache d[ump] - dump cache entries\n"
>> +    "blkcache i[nvalidate] - invalidate cache\n"
>> +    "blkcache #maxblocks #maxentries\n"
>> +    "\tset maximum blocks per cache entry, maximum entries\n"
>> +    "blkcache t[race] [off] - enable (disable) tracing"
>> +);
> 
> BTW, isn't there some support in U-Boot for sub-commands already, so you
> can write a separate function per sub-command and skip writing all the
> strcmp logic to select between them? I'm pretty sure I saw that somewhere.
> 

There is, but I haven't (yet) used it. I'll take a look at this in V3.

>> diff --git a/include/part.h b/include/part.h
> 
>> +static inline void cache_block_invalidate_all(void){}
> 
> Is that useful outside of the debug commands?
> 

Not at the moment, and it probably won't be needed when I figure
out how this glues to the block layer and/or drivers in a cleaner
way.

Regards,


Eric


More information about the U-Boot mailing list