ext4: invalid extent block on imx7

Jan Kiszka jan.kiszka at siemens.com
Wed Mar 25 21:17:39 CET 2020

On 25.03.20 21:01, Stephen Warren wrote:
> On 3/25/20 1:11 PM, Jan Kiszka wrote:
>> On 25.03.20 16:00, Tom Rini wrote:
>>> On Wed, Mar 25, 2020 at 07:32:30AM +0100, Jan Kiszka wrote:
>>>> On 20.03.20 19:21, Tom Rini wrote:
>>>>> On Mon, Mar 16, 2020 at 08:09:53PM +0100, Jan Kiszka wrote:
>>>>>> Hi all,
>>>>>> => ls mmc 0:1 /usr/lib/linux-image-4.9.11-1.3.0-dirty
>>>>>> CACHE: Misaligned operation at range [bdfff998, bdfffd98]
>>>>>> CACHE: Misaligned operation at range [bdfff998, bdfffd98]
>>>>>> CACHE: Misaligned operation at range [bdfff998, bdfffd98]
>>>>>> CACHE: Misaligned operation at range [bdfff998, bdfffd98]
>>>>>> invalid extent block
>>>>>> I'm using master (50be9f0e1ccc) on the MCIMX7SABRE, defconfig.
>>>>>> What could this be? The filesystem is fine from Linux POV.
>>>>> Use tune2fs -l and see if there's any new'ish features enabled that we
>>>>> need some sort of check-and-reject for would be my first guess.
>>>> Here are the reported feature flags:
>>>> has_journal ext_attr resize_inode dir_index filetype extent 64bit flex_bg
>>>> sparse_super large_file huge_file dir_nlink extra_isize metadata_csum
>>> Of that, only metadata_csum means that you can't write to that image,
>>> but you're just trying to read and that should be fine.  Can you go back
>>> in time a little and see if this problem persists or if it's been
>>> introduced of late?  Or recreate it on other platforms/SoCs?  Thanks!
>> Bisected, regression of d5aee659f217 ("fs: ext4: cache extent data").
>> Reverting this commit over master resolves the issue.
>> Any idea what could be wrong? What I noticed is that the extent has a
>> zeroed magic when things go wrong, so maybe it is falsely considered to
>> be cached?
> This is puzzling. I took another look at that patch and I don't see
> anything wrong. My guess would be:
> - Some unrelated memory corruption bug was exposed simply because this
> patch uses dynamic memory or stack slightly differently than before.
> - Something writes to the cached block, whereas the cache code assumes
> the buffer is read-only.
> The cache metadata exists on the stack and so only lasts for the
> duration of read_allocated_block() or ext4fs_read_file(), so there's no
> issue with re-using the cache across different devices, or persisting
> across an ext4 write operation or anything like that. Is this easy to
> reproduce; is there a small disk image that shows the problem?

Found it: it is an alignment issue, apparently surfaced by your change
when it switched the buffer allocation from zalloc (which, I think,
aligns to the cache line) to malloc. Is this sensitivity maybe
SoC-specific?
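
To illustrate the kind of misalignment the CACHE warnings point at, here
is a minimal sketch in plain C11, not U-Boot code. The 64-byte cache line
size and both helper names are assumptions made for this example only.
With that line size, both ends of the logged range [bdfff998, bdfffd98]
sit 0x18 bytes past a line boundary, so a cache maintenance operation on
that range cannot cover whole lines.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumed cache line size; the real value depends on the SoC. */
#define CACHE_LINE 64u

/* Hypothetical check: true if both ends of [start, end) fall on
 * cache line boundaries. */
static bool range_is_cache_aligned(uintptr_t start, uintptr_t end)
{
    return (start % CACHE_LINE) == 0 && (end % CACHE_LINE) == 0;
}

/* Hypothetical helper: return a zeroed buffer that starts on a cache
 * line boundary and whose size is rounded up to a whole number of
 * lines. Plain malloc() only guarantees fundamental alignment. */
static void *zalloc_cache_aligned(size_t size)
{
    size_t rounded = (size + CACHE_LINE - 1) / CACHE_LINE * CACHE_LINE;
    void *buf = aligned_alloc(CACHE_LINE, rounded);

    if (buf)
        memset(buf, 0, rounded);
    return buf;
}
```

Passing the logged addresses to range_is_cache_aligned() returns false,
while a buffer obtained from zalloc_cache_aligned(1024) passes the same
check. Whether a misaligned buffer actually corrupts data, or only
triggers a warning, would depend on how the platform's cache code handles
partial lines.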


Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux
