[U-Boot] [PATCH] blk: Invalidate block cache when switching hwpart
Weijie Gao
weijie.gao at mediatek.com
Tue Aug 27 03:07:04 UTC 2019
On Mon, 2019-08-26 at 14:43 +0200, Felix Brack wrote:
> Hello Weijie,
>
> On 26.08.19 10:19, Weijie Gao wrote:
> > On Thu, 2019-08-22 at 15:58 +0200, Felix Brack wrote:
> >> On 11.07.19 09:10, Weijie Gao wrote:
> >>
> >>> Some storage devices have multiple hw partitions and both address from
> >>> zero, for example eMMC.
> >>> However currently block cache invalidation only applies to block
> >>> write/erase.
> >>> This can cause a problem that data of current hw partition is cached
> >>> before switching to another hw partition. And the following read
> >>> operation of the latter hw partition will get wrong data when reading
> >>> from the addresses that have been cached previously.
> >>>
> >>> To solve this problem, invalidate block cache after a successful
> >>> select_hwpart operation.
> >>>
> >>> Signed-off-by: Weijie Gao <weijie.gao at mediatek.com>
> >>> ---
> >> This patch breaks correct operation of PDU001 board.
> >>
> >>> drivers/block/blk-uclass.c | 14 ++++++++++++--
> >>> 1 file changed, 12 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/drivers/block/blk-uclass.c b/drivers/block/blk-uclass.c
> >>> index baaf431e5e0..c23b6682a6c 100644
> >>> --- a/drivers/block/blk-uclass.c
> >>> +++ b/drivers/block/blk-uclass.c
> >>> @@ -208,7 +208,11 @@ int blk_select_hwpart_devnum(enum if_type if_type, int devnum, int hwpart)
> >>> if (ret)
> >>> return ret;
> >>>
> >>> - return blk_select_hwpart(dev, hwpart);
> >>> + ret = blk_select_hwpart(dev, hwpart);
> >>> + if (!ret)
> >>> + blkcache_invalidate(if_type, devnum);
> >>> +
> >>> + return ret;
> >>> }
> >>>
> >>> int blk_list_part(enum if_type if_type)
> >>> @@ -348,7 +352,13 @@ int blk_select_hwpart(struct udevice *dev, int hwpart)
> >>>
> >>> int blk_dselect_hwpart(struct blk_desc *desc, int hwpart)
> >>> {
> >>> - return blk_select_hwpart(desc->bdev, hwpart);
> >>> + int ret;
> >>> +
> >>> + ret = blk_select_hwpart(desc->bdev, hwpart);
> >>> + if (!ret)
> >>> + blkcache_invalidate(desc->if_type, desc->devnum);
> >>> +
> >> Commenting the invalidation of the block cache on this line results in
> >> proper working of the board.
> >>
> >>> + return ret;
> >>> }
> >>>
> >>> int blk_first_device(int if_type, struct udevice **devp)
> >>>
> >> With the patch active, files from the boot FAT partition of the SD-card
> >> (mmc device 1) do not load anymore, hence booting fails.
> >> Using the eMMC (mmc device 0) instead works fine, files can bee loaded
> >> and the board boots.
> >>
> >> To isolate the problem I have modified the configuration to only load a
> >> 10k test file (test.bin) instead of loading the DTB and zImage files
> >> followed by booting LINUX. Furthermore I have enabled debugging for some
> >> parts of the code.
> >>
> >> When I boot from the SD-card (which fails) I get the following logged:
> >> ===
> >> CPU : AM335X-GP rev 1.0
> >> ===
> >>
> >> This time the file test.bin gets loaded correctly.
> >>
> >> To be honest I don't really see why things are going wrong. It looks
> >> like the block cache gets filled but then again invalidated before it's
> >> data is used. I also feel that this problem is related to the SD-card
> >> not being the first (number 0) but the second device (number 1).
> >> Can anybody shed some light on this and give me a hint?
> >>
> >> many thanks, Felix
> >
> > Hi Felix,
> >
> > I've found the root cause.
> >
> Many thanks for looking into this!
>
> > There is a bug in the FAT driver. In function file_fat_read_at(),
> > malloc_cache_aligned() allocates memory for the itr. The itr is a
> > pointer to struct fat_itr. fat_itr has a member block[MAX_CLUSTSIZE]
> > used for storing blocks read from MMC.
> >
> > The FAT driver resolves the file table and stores the required dir entry
> > in the member block. Then the pointer of the dir entry is passed to
> > get_contents() to get the contents of the file.
> >
> > However the itr is freed BEFORE the call to get_contents(). This means
> > the contents dir entry points to is no longer valid. And it results to
> > your situation.
> >
> Indeed this is the root of all trouble. The freeing of itr of course
> also invalidates the assignment made earlier:
>
> dir_entry *dentptr = itr->dent;
>
> As dentptr is used in the call to get_contents() the evil takes its
> course...
>
> Again many thanks for analyzing and explaining this as detailed as you did.
>
> > But this only happens when CONFIG_BLOCK_CACHE=y and the patch applied.
> > I've also found its cause.
> >
> > The MMC driver calls blk_dselect_hwpart() unconditionally in
> > mmc_bread(). This causes block cache being invalidated and refilled many
> > times during the FAT load operation.
> >
> Yes I have noticed this frequent invalidations of the cache due to calls
> to blk_dselect_hwpart(). Some optimization, if possible, would be great.
>
> > The frequent block cache invalidation and refilling finally results in
> > frequent memory allocation and freeing. There is a mechanism in the
> > dlmalloc.c: if the total free space reaches a threshold, the
> > malloc_trim() will be called. malloc_trim() calls sbrk(), which
> > fills the memory being freed with zeros and of course the data of itr is
> > damaged, otherwise the original data remains unchanged.
> >
> > So this is the reason the bug only appears when block cache is enabled
> > and cache invalidation is called after every select_hwpart.
> >
> I see and now understand the different behavior of eMMC and SD-card.
>
> > I have made a patch, could you please have a try? I've tested it worked.
> > If it works for you too, I will submit it to u-boot. And I will submit
> > another patch to optimize the cache invalidation for select_hwpart.
> >
> > --- a/fs/fat/fat.c
> > +++ b/fs/fat/fat.c
> > @@ -1174,10 +1174,6 @@ int file_fat_read_at(const char *filename, loff_t
> > pos, void *buffer,
> > /* For saving default max clustersize memory allocated to malloc pool
> > */
> > dir_entry *dentptr = itr->dent;
> >
> > - free(itr);
> > -
> > - itr = NULL;
> > -
> > ret = get_contents(&fsdata, dentptr, pos, buffer, maxsize, actread);
> >
> > out_free_both:
> >
> I can confirm that your patch works perfect, thanks!
>
> Please CC me when submitting the two patches you mentioned above. It
> will be my pleasure to test them on my hardware.
>
> >
> > BTW, I reproduced this failure on a MT7623 board, which also has an eMMC
> > and a SD. And this patch is indeed used for MT7623 to solve the bug on
> > switching hwpart on eMMC.
> >
> >
> > Best Regards,
> >
> > Weijie
> >
>
> regard, Felix
Hi Felix,
I found there's already a commit merged 6 hours ago (mailed on Aug 20)
that fixed this issue: d7af2a863017be1f5fd1b65a858ddc7e87d7b876
So there's no need to send patch again. What you need is to pull the
latest master branch.
I will still send a patch to optimize the invalidation in select_hwpart.
Best Regards,
Weijie
More information about the U-Boot
mailing list