[U-Boot] [RFC V2 PATCH 1/3] drivers: block: add block device cache
Eric Nelson
eric at nelint.com
Wed Mar 23 18:43:50 CET 2016
Hi Stephen,
Thanks again for the detailed review.
On 03/23/2016 10:22 AM, Stephen Warren wrote:
> On 03/20/2016 07:45 PM, Eric Nelson wrote:
>> Add a block device cache to speed up repeated reads of block devices by
>> various filesystems.
>>
>> This small amount of cache can dramatically speed up filesystem
>> operations by skipping repeated reads of common areas of a block
>> device (typically directory structures).
>>
>> This has shown to have some benefit on FAT filesystem operations of
>> loading a kernel and RAM disk, but more dramatic benefits on ext4
>> filesystems when the kernel and/or RAM disk are spread across
>> multiple extent header structures as described in commit fc0fc50.
>>
>> The cache is implemented through a minimal list (block_cache) maintained
>> in most-recently-used order and count of the current number of entries
>> (cache_count). It uses a maximum block count setting to prevent copies
>> of large block reads and an upper bound on the number of cached areas.
>>
>> The maximum number of entries in the cache defaults to 4 and the maximum
>> number of blocks per cache entry has a default of 1, which matches the
>> current implementation of at least FAT and ext4 that only read a single
>> block at a time.
>
> If the maximum size of the cache is fixed and small (which judging by
> the description it is), why not use an array rather than a linked list.
> That would be far simpler to manage.
>
You seem to agree with Marek, and I must be dain-bramaged because I
just don't see it.
>> The 'blkcache' command (enabled through CONFIG_CMD_BLOCK_CACHE) allows
>> changing these values and can be used to tune for a particular filesystem
>> layout.
>
> Even with this dynamic adjustment, using an array still feels simpler,
> although I have't looked at the code yet.
>
>> diff --git a/drivers/block/cache_block.c b/drivers/block/cache_block.c
>
>> +/*
>> + * search for a set of blocks in the cache
>> + *
>> + * remove and return the node if found so it can be
>> + * added back at the start of the list and maintain MRU order)
>> + */
>
> Splitting the remove/add feels a bit dangerous, since forgetting to
> re-do the add (or e.g. some error causing that to be skipped) could
> cause cache_count to become inconsistent.
>
> The function name "find" is a bit misleading. cache_find_and_remove()
> would make its semantics more obvious. Or, simply put the list_del()
> somewhere else; it doesn't feel like part of a "find" operation to me.
> Or, put the entire move-to-front operation inside this function so it
> isn't split up - if so, cache_find_and_move_to_head().
>
You're right. I'll look at that for a V3 (hopefully non-RFC)
patch set.
>> +static struct block_cache_node *cache_find
>> + (int iftype, int devnum,
>
> Nit: the ( and first n arguments shouldn't be wrapped.
>
>> +int cache_block_read(int iftype, int devnum,
>> + lbaint_t start, lbaint_t blkcnt,
>> + unsigned long blksz, void *buffer)
>
>> + memcpy(buffer, src, blksz*blkcnt);
>
> Nit: Space around operators. checkpatch should catch this.
>
Thanks.
>> + if (trace)
>> + printf("hit: start " LBAF ", count " LBAFU "\n",
>> + start, blkcnt);
>
> I guess I'll find that trace can be adjusted at run-time somewhere later
> in this patch, but it's more typical to use debug() without the if
> statement for this. It would be awesome if debug() could be adjusted at
> run-time...
>
I wanted to keep this as a run-time thing because it's useful in
tuning things and I'm not yet certain if/when users might need
to update the sizes.
I could wrap these with some #ifdef stuff but I'm not certain
which is the right thing to do.
>> +void cache_block_fill(int iftype, int devnum,
> ...
>> + if (node->cache == 0) {
>> + node->cache = malloc(bytes);
>> + if (!node->cache) {
>
> Nit: may as well be consistent with those NULL checks.
>
Yep.
>> +void cache_block_invalidate(int iftype, int devnum)
> ...
>> +void cache_block_invalidate_all(void)
>
> There's no invalidate_blocks(); I imagine that writing-to/erasing (some
> blocks of) a device must call cache_block_invalidate() which will wipe
> out even data that wasn't written.
>
Right. I figured that this wasn't worth extra code.
The 99.99% use case for U-Boot is read-only, so optimizing this
just seems like bloat.
>> +static void dump_block_cache(void)
> ...
>> + list_for_each_safe(entry, n, &block_cache) {
>
> Nit: This doesn't need to be _safe since the list is not being modified.
>
Good catch.
>> +static int do_blkcache(cmd_tbl_t *cmdtp, int flag,
>> + int argc, char * const argv[])
>> +{
>> + if ((argc == 1) || !strcmp("show", argv[1])) {
>> + printf("block cache:\n"
>> + "%u\thits\n"
>> + "%u\tmisses\n"
>> + "%u\tentries in cache\n"
>> + "trace %s\n"
>> + "max blocks/entry %lu\n"
>> + "max entries %lu\n",
>> + block_cache_hits, block_cache_misses, cache_count,
>> + trace ? "on" : "off",
>> + max_blocks_per_entry, max_cache_entries);
>
> Nit: Let's print the value and "label" in a consistent order. I would
> suggest everything being formatted as:
>
> " description: %u"
>
> so that it's indented, has a delimiter between description and value,
> and so that irrespective of the length of the converted value, the
> indentation of the other parts of the string don't change (\t doesn't
> guarantee this after a certain number length).
>
Thanks.
> I would rather have expected "show" to dump the entries too, but I
> suppose it's fine that they're separate.
>
While testing, I found myself mostly using 'show' to see results.
I added dump as a debug tool and almost removed it, but figured this
was still an RFC, so I left it in.
>> + } else if ((argc == 2) && ('c' == argv[1][0])) {
>
> Nit: the value of 'c' is always 'c', so it's pointless to test whether
> it's equal to something. The value being compared is arv[1][0], so the
> parameters to == should be swapped. I'm aware that == is mathematically
> commutative, but typical English reading of the operator is "is equal
> to" which has non-commutative implications re: what is being tested. I'm
> also aware that this construct is used to trigger compiler warnings if
> someone types = instead of ==. However, (a) you didn't and (b) compilers
> warn about this these days, so there's no need to write unusual code to
> get that feature anymore.
>
:) muscle-memory again.
It's usually Marek that catches my yoda-expressions, which are usually
tests against zero.
> I worry that being imprecise in command-line parsing (i.e. treating both
> "crud" and "clear" as the same thing) will lead to problems expanding
> the command-line format in the future; we won't be able to ever add
> another option starting with "c" and maintain the same abbreviation
> semantics. I'd suggest requiring the full option name always.
>
>> + } else if ((argc == 3) && isdigit(argv[1][0]) &&
>> isdigit(argv[2][0])) {
>
> Let's make this "blkcache set" or "blkcache configure" so that other
> sub-commands can take numeric parameters in the future, and have
> consistent command-line syntax.
>
Sounds good to me.
>> +U_BOOT_CMD(
>> + blkcache, 3, 0, do_blkcache,
>> + "block cache control/statistics",
>> + "[show] - show statistics\n"
>> + "blkcache c[lear] - clear statistics\n"
>> + "blkcache d[ump] - dump cache entries\n"
>> + "blkcache i[nvalidate] - invalidate cache\n"
>> + "blkcache #maxblocks #maxentries\n"
>> + "\tset maximum blocks per cache entry, maximum entries\n"
>> + "blkcache t[race] [off] - enable (disable) tracing"
>> +);
>
> BTW, isn't there some support in U-Boot for sub-commands already, so you
> can write a separate function per sub-command and skip writing all the
> strcmp logic to select between them? I'm pretty sure I saw that somewhere.
>
There is, but I haven't (yet) used it. I'll take a look at this in V3.
>> diff --git a/include/part.h b/include/part.h
>
>> +static inline void cache_block_invalidate_all(void){}
>
> Is that useful outside of the debug commands?
>
Not at the moment, and it probably won't be needed when I figure
out how this glues to the block layer and/or drivers in a cleaner
way.
Regards,
Eric
More information about the U-Boot
mailing list