[U-Boot] UBIFS mount bug when mounting from multiple MTD partitions

Heiko Schocher hs at denx.de
Wed Apr 10 13:59:44 UTC 2019


Hello Frieder,

Am 10.04.2019 um 15:31 schrieb Schrempf Frieder:
> Hi Heiko,
> 
> On 10.04.19 14:44, Heiko Schocher wrote:
>> Hello Frieder,
>>
>> Am 10.04.2019 um 12:49 schrieb Schrempf Frieder:
>>> Hi,
>>>
>>> I have a customer who has a NAND device with two MTD partitions and
>>> each > of the partitions contains one UBI volume with a UBIFS filesystem.
>>
>> Bad idea ... why?
>>
>> You may loose lifetime of the board, as UBI cannot use PEBs between the
>> 2 MTD partitions on the nand ... better you would have one big MTD
>> Partition
>> with n ubi volumes in it ...
>>
>> But ... okay... this must work also.
> 
> Yeah, I only recently learned about the disadvantages of this setup.
> Maybe we can change it in the future, but for now they are using
> separate partitions.

Ok.

>>> Now U-Boot can mount the UBIFS from the first partition just fine, but
>>> if the UBIFS from the second partition is mounted afterwards this fails
>>> in some cases.
>>
>> :-(
>>
>>> I can reproduce the error and tracked it down to uboot_ubifs_mount() in
>>> fs/ubifs/super.c. If this function is run for the second mount, the
>>> struct ubifs_fs_type is reused and it contains a list fs_supers, that
>>> still holds one entry for the first mount.
>>
>> Sure?
>>
>> fs_supers in struct file_system_type seems used only in none
>> U-Boot code...
> 
> Right, I had a closer look and fs_supers seems to be unused indeed, but
> somehow it causes corruption in my case. When I apply 5a08cfee3967
> (ubifs: remove useless code) the problem disappears.

Ah, 5a08cfee3967 is in mainline ... good ;-)

> Without this patch there still is hlist_add_head(&s->s_instances,
> &type->fs_supers) and this line somehow seems to cause the error.
> 
> I applied this debug diff:
> 
> --- a/fs/ubifs/super.c
> +++ b/fs/ubifs/super.c
> @@ -2428,7 +2428,15 @@ retry:
>    #else
>           strncpy(s->s_id, type->name, sizeof(s->s_id));
>    #endif
> +       printf("%s:%d: ubi_num: %d, vol_id: %d\n",
> +              __func__, __LINE__,
> +              ((struct ubifs_info *)s->s_fs_info)->vi.ubi_num,
> +              ((struct ubifs_info *)s->s_fs_info)->vi.vol_id);
>           hlist_add_head(&s->s_instances, &type->fs_supers);
> +       printf("%s:%d: ubi_num: %d, vol_id: %d\n",
> +              __func__, __LINE__,
> +              ((struct ubifs_info *)s->s_fs_info)->vi.ubi_num,
> +              ((struct ubifs_info *)s->s_fs_info)->vi.vol_id);
>    #ifndef __UBOOT__
>           spin_unlock(&sb_lock);
>           get_filesystem(type);
> 
> And I'm getting this for the first mount:
> 
> sget:2431: ubi_num: 0, vol_id: 0
> sget:2433: ubi_num: 0, vol_id: 0
> 
> And this for the second mount:
> 
> sget:2431: ubi_num: 0, vol_id: 0
> sget:2433: ubi_num: -1678121552, vol_id: -1678120656

Hmm...

>>> I guess, that if the second mount would happen on a volume that is on
>>> the same MTD partition as the first volume, than this will work. The
>>> second entry is added to ubifs_fs_type.fs_supers.
>>
>> I cannot see this from looking into code ... so hard to say, but
>> I only looked into mainline code ...
> 
> Yeah, I was probably wrong with these first wild guesses...

;-)

>>> In my case however, the second entry being added to
>>> ubifs_fs_type.fs_supers is invalid and causes the mount error.
>>>
>>> Reinitializing the list in uboot_ubifs_mount() before each mount, solves
>>> the problem, but I guess that it will cause failures in other setups,
>>> where there are actually multiple volumes on one MTD device.
>>>
>>> So how can I solve this properly? Do we need one instance of struct
>>> ubifs_fs_type for each MTD device?
>>
>> Hmm.. without digging into it, it is difficult to say...
>>
>>> I tested this on an old version (2017.03), but looking at the current
>>> code, it looks like the same problem applies to current mainline.
>>
>> Is there any chance to try it with current mainline ?
> 
> The problem is a bit strange and this what I'm actually worried about.
> It is persistent in a certain environment: U-Boot loaded from SPI NOR,
> environment set to certain values, data written to UBIFS partition in
> Linux and then power-cut.
> 
> If one of these conditions changes, the error usually disappears, for
> example if I use the exact same setup, but load the Bootloader from MMC
> or RAM. Or if no write access with power-cut happens.
> 
> So I wonder if there's some memory corruption somewhere else. Though,
> the error happens always at the same place. Debug prints or other code
> changes have no influence.
> 
> I really would like to understand what's going on so I can make sure
> that 5a08cfee3967 actually solves the real issue or just hides it.

Please keep me informed ... I try to reproduce it here on a hw I
have access to, but need some "free" time ...

bye,
Heiko
-- 
DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: +49-8142-66989-52   Fax: +49-8142-66989-80   Email: hs at denx.de


More information about the U-Boot mailing list