[U-Boot] [PATCH] blk: Invalidate block cache when switching hwpart

Weijie Gao weijie.gao at mediatek.com
Mon Aug 26 08:19:50 UTC 2019


On Thu, 2019-08-22 at 15:58 +0200, Felix Brack wrote:
> On 11.07.19 09:10, Weijie Gao wrote:
> 
> > Some storage devices have multiple hw partitions and both address from
> > zero, for example eMMC.
> > However currently block cache invalidation only applies to block
> > write/erase.
> > This can cause a problem that data of current hw partition is cached
> > before switching to another hw partition. And the following read
> > operation of the latter hw partition will get wrong data when reading
> > from the addresses that have been cached previously.
> > 
> > To solve this problem, invalidate block cache after a successful
> > select_hwpart operation.
> > 
> > Signed-off-by: Weijie Gao <weijie.gao at mediatek.com>
> > ---
> This patch breaks correct operation of PDU001 board.
> 
> >  drivers/block/blk-uclass.c | 14 ++++++++++++--
> >  1 file changed, 12 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/block/blk-uclass.c b/drivers/block/blk-uclass.c
> > index baaf431e5e0..c23b6682a6c 100644
> > --- a/drivers/block/blk-uclass.c
> > +++ b/drivers/block/blk-uclass.c
> > @@ -208,7 +208,11 @@ int blk_select_hwpart_devnum(enum if_type if_type, int devnum, int hwpart)
> >  	if (ret)
> >  		return ret;
> >  
> > -	return blk_select_hwpart(dev, hwpart);
> > +	ret = blk_select_hwpart(dev, hwpart);
> > +	if (!ret)
> > +		blkcache_invalidate(if_type, devnum);
> > +
> > +	return ret;
> >  }
> >  
> >  int blk_list_part(enum if_type if_type)
> > @@ -348,7 +352,13 @@ int blk_select_hwpart(struct udevice *dev, int hwpart)
> >  
> >  int blk_dselect_hwpart(struct blk_desc *desc, int hwpart)
> >  {
> > -	return blk_select_hwpart(desc->bdev, hwpart);
> > +	int ret;
> > +
> > +	ret = blk_select_hwpart(desc->bdev, hwpart);
> > +	if (!ret)
> > +		blkcache_invalidate(desc->if_type, desc->devnum);
> > +
> Commenting the invalidation of the block cache on this line results in
> proper working of the board.
> 
> > +	return ret;
> >  }
> >  
> >  int blk_first_device(int if_type, struct udevice **devp)
> > 
> With the patch active, files from the boot FAT partition of the SD-card
> (mmc device 1) do not load anymore, hence booting fails.
> Using the eMMC (mmc device 0) instead works fine, files can bee loaded
> and the board boots.
> 
> To isolate the problem I have modified the configuration to only load a
> 10k test file (test.bin) instead of loading the DTB and zImage files
> followed by booting LINUX. Furthermore I have enabled debugging for some
> parts of the code.
> 
> When I boot from the SD-card (which fails) I get the following logged:
> ===
> CPU  : AM335X-GP rev 1.0
> ===
> 
> This time the file test.bin gets loaded correctly.
> 
> To be honest I don't really see why things are going wrong. It looks
> like the block cache gets filled but then again invalidated before it's
> data is used. I also feel that this problem is related to the SD-card
> not being the first (number 0) but the second device (number 1).
> Can anybody shed some light on this and give me a hint?
> 
> many thanks, Felix

Hi Felix,

I've found the root cause.

There is a bug in the FAT driver. In function file_fat_read_at(),
malloc_cache_aligned() allocates memory for the itr. The itr is a
pointer to struct fat_itr. fat_itr has a member block[MAX_CLUSTSIZE]
used for storing blocks read from MMC.

The FAT driver resolves the file table and stores the required dir entry
in the member block. Then the pointer of the dir entry is passed to
get_contents() to get the contents of the file.

However the itr is freed BEFORE the call to get_contents(). This means
the contents dir entry points to is no longer valid. And it results to
your situation.

But this only happens when CONFIG_BLOCK_CACHE=y and the patch applied.
I've also found its cause.

The MMC driver calls blk_dselect_hwpart() unconditionally in
mmc_bread(). This causes block cache being invalidated and refilled many
times during the FAT load operation.

The frequent block cache invalidation and refilling finally results in
frequent memory allocation and freeing. There is a mechanism in the
dlmalloc.c: if the total free space reaches a threshold, the
malloc_trim() will be called. malloc_trim() calls sbrk(), which
fills the memory being freed with zeros and of course the data of itr is
damaged, otherwise the original data remains unchanged.

So this is the reason the bug only appears when block cache is enabled
and cache invalidation is called after every select_hwpart.

I have made a patch, could you please have a try? I've tested it worked.
If it works for you too, I will submit it to u-boot. And I will submit
another patch to optimize the cache invalidation for select_hwpart.

--- a/fs/fat/fat.c
+++ b/fs/fat/fat.c
@@ -1174,10 +1174,6 @@ int file_fat_read_at(const char *filename, loff_t
pos, void *buffer,
 	/* For saving default max clustersize memory allocated to malloc pool
*/
 	dir_entry *dentptr = itr->dent;
 
-	free(itr);
-
-	itr = NULL;
-
 	ret = get_contents(&fsdata, dentptr, pos, buffer, maxsize, actread);
 
 out_free_both:


BTW, I reproduced this failure on a MT7623 board, which also has an eMMC
and a SD. And this patch is indeed used for MT7623 to solve the bug on
switching hwpart on eMMC.


Best Regards,

Weijie



More information about the U-Boot mailing list