[PATCH v2] armv8: mmu: support unmapping regions with set_one_region()
Casey Connolly
casey.connolly at linaro.org
Thu May 7 21:55:33 CEST 2026
On 07/05/2026 07:36, Sumit Garg wrote:
> On Wed, May 06, 2026 at 10:13:24AM +0300, Ilias Apalodimas wrote:
>> Hi Casey,
>>
>> On Mon, 4 May 2026 at 21:37, Casey Connolly <casey.connolly at linaro.org> wrote:
>>>
>>> Currently set_one_region() implicitly assumes that we want to map a
>>> region and aggressively splits blocks into tables to do this, but when
>>> called with PTE_TYPE_FAULT to unmap a currently mapped region it may
>>> try to unnecessarily split blocks which doesn't make sense if the entire
>>> block should actually be unmapped. In the end it then has to walk every
>>> single page and create a bunch of empty tables.
>>>
>>> Introduce a check for this kind of behaviour and optimise with a fast
>>> path, if we're unmapping a region >= the size of this entry then we can
>>> just unmap the entire PTE and whatever it contains.
>>>
>>> This fixes some bogus empty tables being left behind when carving out
>>> reserved memory regions on Qualcomm, and should improve the performance
>>> of the break-before-make in mmu_change_region_attr().
>>>
>>> Signed-off-by: Casey Connolly <casey.connolly at linaro.org>
>>> ---
>>> Changes in v2:
>>> - Update pte_type() to correctly check the PTE_TYPE_VALID bit
>>> - Explicitly unset both bits of PTE_TYPE_MASK to be extra safe
>>> - V1: https://lore.kernel.org/u-boot/20260504175511.585797-1-casey.connolly@linaro.org/
>>> ---
>>
>> [...]
>>
>>>
>>> + /*
>>> + * If we're trying to unmap a region then check if it's already unmapped or if it's bigger
>>> + * than the PTE we're looking at right now, in the first case we can do nothing and in the
>>> + * second case we just need to unmap this page/block.
>>> + * Otherwise we will needlessly create new tables until we have traversed every single page
>>> + * in the region.
>>> + */
>>> + if (attrs == PTE_TYPE_FAULT && (pte_type(pte) == PTE_TYPE_FAULT || size >= levelsize)) {
>>
>> You need an alignment check (with is_aligned) here instead of just the
>> size and levelsize. But then you have to measure if that makes a
>> meaningful difference to the time it takes to run.
>>
>> For example, imagine you have two 2MB block mappings and the user
>> requests to unmap 2MB starting at a 1MB offset. Something like this:
>> Mapped memory:
>> 0x0-0x200000
>> 0x200000-0x400000
>>
>> and the user requests to unmap 0x100000-0x300000.
>> The size >= levelsize check will succeed in this case, but if I am reading
>> the code right you'll end up unmapping 0-0x200000.
>
> I think this is the reason why we need to ensure that unmap calls from
> Qcom platform code should all be block mapping aligned (2MB). That will
> ensure we don't run into creating 4K page tables just for the sake of
> un-mapping reserved memory regions.
That isn't what the code on the qcom side does. I did look into this
previously, but it's quite complicated to do without the risk of treading
on a region we actually need, and since we're dcache-off at that point it
didn't seem worth adding that complexity.
Anyhow, I have an improved version which is unfortunately slower but is
at least actually correct XD. On sm8650 it now takes ~200ms. I think this
is the best we can hope for until U-Boot gets proper support for dynamic
memory mapping and we can get dcache enabled much earlier.
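
To make the alignment concern from the review concrete, here is a minimal
standalone sketch. It is not the code from the patch or from the improved
version mentioned above; covers_whole_entry() and size_only_check() are
hypothetical helpers written only for illustration, and they assume level
sizes are powers of two.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of U-Boot: returns true only if the range
 * [start, start + size) covers the whole level entry that contains start.
 */
static bool covers_whole_entry(uint64_t start, uint64_t size, uint64_t levelsize)
{
	/* Assumption: level sizes are powers of two (4 KiB, 2 MiB, 1 GiB, ...). */
	assert(levelsize != 0 && (levelsize & (levelsize - 1)) == 0);

	/* start must sit on an entry boundary and the range must span the entry. */
	return (start & (levelsize - 1)) == 0 && size >= levelsize;
}

/* The size-only test from the quoted hunk, kept here for comparison. */
static bool size_only_check(uint64_t size, uint64_t levelsize)
{
	return size >= levelsize;
}
```

With the example from the review (unmap 0x100000-0x300000 over 2MB blocks),
size_only_check(0x200000, 0x200000) is true, while
covers_whole_entry(0x100000, 0x200000, 0x200000) is false, so a walk using
the stricter check would descend a level rather than drop all of
0x0-0x200000.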
>
> -Sumit
--
// Casey (she/her)