[U-Boot] Virtual addresses, u-boot, and the MMU

Wed Sep 2 00:01:21 CEST 2009

Wolfgang Denk wrote:
> Dear "J. William Campbell",
>
> In message <4A9D5EF2.4030307 at comcast.net> you wrote:
>   
>>       I have followed the recent discussions about problems in the CFI 
>> driver caused by the need to change the attributes of the address at 
>> which the flash is mapped. This discussion has raised some questions in 
>> my mind regarding the assumptions u-boot makes regarding the behavior of 
>> the addresses used in the code. I am writing this message for comments 
>> that may correct any misunderstandings I have. First, the addresses 
>> used by the u-boot program to reference memory and code are all 
>> "virtual" addresses (VA) because they go through the MMU to become a 
>> physical address (PA). If there is no MMU, the virtual address and the 
>> physical address are identical.
>>     
>
> Even if there is an MMU, we keep it "switched off" on most systems, or
> otherwise set it up in such a way that there is still a 1:1 mapping,
> i.e. the virtual address and physical address are identical, too.
>
> There have been several discussions concerning this topic on IRC
> (#u-boot at freenode); I'll try to summarize these here. I am not
> sure that I haven't missed anything, so please feel free to add what
> might be missing.
>
>   
<snip>
> Becky then posted the summary of this discussion here:
>
> http://thread.gmane.org/gmane.comp.boot-loaders.u-boot/50705
>
>
> Note that there was a general agreement among those who raised their
> voices.
>
>   
In quick summary, for the next few years, we will require that all 
"important" physical addresses have corresponding virtual addresses. 
Some limited support for mapping in "other resources" may be provided at 
an operator interface level, but it will be quite limited. OK, seems 
reasonable to me.
>   
>>         The "normal", or legacy, policy for u-boot is to arrange a memory 
>> map such that all physical addresses are mapped to some virtual address. 
>> Originally, the mappings were such that the VA was actually == the PA, 
>> but today on some CPUs, this is not possible. When the size of the 
>> physical address space exceeds the size of the virtual address space, 
>> the VA may not == the PA numerically, but there is a one-to-one 
>> correspondence. It MAY also be acceptable to map the same PA so that it 
>> appears more than once in the address space (VA), but if this is done, 
>>     
>
> This may or may not be possible. It may even make sense or be needed
> on some systems, and it may be impossible to do on others.
>
> In any case, I think we should be careful not to mix things: what we
> are discussing here are address mappings. What we are not discussing
> is specific memory properties like being cached/uncached,
> guarded/non-guarded, etc.
>
> Such properties are important, too, but they need to get handled
> through a separate interface.
>   
Here is where I am quite sure you are going to have a problem. In very 
many CPUs, cache control and memory management are joined at the hip. 
Some systems have no easy way to enable and disable (D,I) cache 
globally; it is only doable on a page or segment basis. The PPC hardware 
has a relatively low cost way to do so, but not all architectures do.
>   
>>       Becky Bruce "re-wrote the driver to use VAs instead of PAs." I am 
>> not exactly sure what this means, but I assume it meant allowing the VA 
>> referencing the flash to be distinct from the PA where the flash 
>> "lives" (and may require 36 
>> bits on a PPC system to represent the PA). Does the driver re-map 
>>     
>
> I think the information provided above sheds more light on this.
>   
Yes, it did.
>   
>> portions of the flash in order to access them? If the flash is really 
>> large, I can certainly see a need to do so. However, I assume on "medium 
>> size" flashes, it does not need to remap. In that case, don't all 
>> references just go through the MMU and get translated? The VA != PA, but 
>> from the point of view of u-boot, the VA is the only address that 
>> matters. The AVR32 certainly does not map flash dynamically, so it would 
>> not matter on that CPU.
>>     
>
> OK.
>
>   
>>       The issue with the CFI driver on the AVR32 is that it needs to 
>> disable cache on the address space of the flash memory when it is 
>> writing to the flash. This apparently is not trivial to do, but there is 
>>     
>
> Actually this is not specific to the AVR32, and so far most systems
> simply do not enable caches at all on the flash memory regions. I
> understand why the AVR32 solution is interesting, and I think, when
> we try to find a solution for this we should use this chance to find
> a solution that also allows other systems to turn on caches on the
> flash memory - things like loading the Linux kernel or ramdisk images
> etc. will benefit from that.
>   
Full ACK.

<snip>

> While there are specific routines to "write" to the flash (init,
> erase, write), there is no specific code to "read" from flash.
> Reading is allowed everywhere by just performing load instructions
> from this memory area. Neither the CFI driver nor any other flash
> driver is needed or involved to do that. That's the whole big
> advantage of NOR flash (which makes it _memory_) over storage devices
> like NAND etc. (which are _not_ memory).
>
> You would have to add this macro everywhere - on anything that
> accesses memory. All commands that take memory addresses either
> directly or indirectly, each and every load instruction. That's not
> practical. [OK, you could probably set up the MMU to trap on read
> accesses on such a memory region, but that would not exactly be
> simpler either.]
>   
Very true. I did forget about the read being just a memory reference. So 
if we desire the flash to be cached, it would have to "normally" be 
cached for reads to take advantage of the operation.
<snip>
>
> The only thing you overlooked in my opinion is that read accesses to
> NOR flash are plain memory accesses that are not handled by the CFI
> or any other driver.
>   
Thanks for looking at this. It therefore seems to me that adding an 
"uncache(virtual address)" operation (that may return a substitute 
address for the actual write to the flash) followed by a 
"restore_cache()" operation inside the flash driver write routine should 
work. The uncache routine would do nothing if the flash is not cached to 
begin with, would globally turn off data cache if that is easy to do, or 
would provide an alternate virtual address to be used in the write. That 
alternate address would either be obtained from a statically mapped copy 
of the flash memory with cache disabled in that virtual address region, 
or a dynamic map added to the MMU that will cause references via the 
returned virtual address to the physical flash memory to be un-cached. 
Choose the approach that fits your hardware best. In any case, the D 
cache and/or I cache may need to be flushed on a write if the flash is 
mapped in cache anywhere. The restore_cache routine would do these 
operations if necessary, and also un-do whatever was done in the uncache 
routine.
    This way the flash can be mapped into regular memory as cacheable, 
so we can get the speed advantages in normal operation. Writing to flash 
will require somewhat different tricks depending on the CPU.
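
To make the proposal above concrete, here is a rough sketch in C of how 
uncache() and restore_cache() could bracket the write inside the flash 
driver. It is only an illustration, not existing u-boot code: the window 
addresses, the uncache_mode enum, the uncache_ctx struct, 
flash_write_byte() and the simulated "MMU" in translate() are all 
made-up names so that the example can be compiled and run on a host.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FLASH_SIZE    64u          /* bytes of simulated flash */
#define CACHED_BASE   0x10000000u  /* hypothetical cacheable window */
#define UNCACHED_BASE 0xB0000000u  /* hypothetical uncached alias */

typedef uint32_t vaddr_t;

/* The three cases from the text: nothing to do, global off, or alias. */
enum uncache_mode { UNCACHE_NOP, UNCACHE_GLOBAL_OFF, UNCACHE_ALIAS };

/* Simulated board state; on real hardware this lives in MMU/cache regs. */
static uint8_t flash_phys[FLASH_SIZE];
static bool flash_cacheable = true;   /* is the normal window cacheable? */
static bool dcache_on = true;
static unsigned flush_count;          /* counts simulated cache flushes */
static enum uncache_mode board_mode = UNCACHE_ALIAS;

struct uncache_ctx { enum uncache_mode mode; bool dcache_was_on; };

/* Simulated MMU: both windows translate to the same physical flash. */
static uint8_t *translate(vaddr_t va, bool *cached)
{
    if (va - CACHED_BASE < FLASH_SIZE) {
        *cached = flash_cacheable && dcache_on;
        return &flash_phys[va - CACHED_BASE];
    }
    if (va - UNCACHED_BASE < FLASH_SIZE) {
        *cached = false;
        return &flash_phys[va - UNCACHED_BASE];
    }
    return NULL;
}

/* Returns the address to use for the write; may be a substitute VA. */
static vaddr_t uncache(vaddr_t va, struct uncache_ctx *ctx)
{
    ctx->mode = flash_cacheable ? board_mode : UNCACHE_NOP;
    ctx->dcache_was_on = dcache_on;
    switch (ctx->mode) {
    case UNCACHE_GLOBAL_OFF:
        dcache_on = false;                        /* cheap on some CPUs */
        return va;
    case UNCACHE_ALIAS:
        return va - CACHED_BASE + UNCACHED_BASE;  /* static uncached copy */
    case UNCACHE_NOP:
    default:
        return va;                                /* flash was never cached */
    }
}

/* Undo uncache() and drop stale cached copies of the flash contents. */
static void restore_cache(const struct uncache_ctx *ctx)
{
    assert(ctx != NULL);
    if (ctx->mode != UNCACHE_NOP)
        flush_count++;            /* stands in for a flush/invalidate */
    dcache_on = ctx->dcache_was_on;
}

/* Driver write path: only ever touches the flash through an uncached VA. */
static int flash_write_byte(vaddr_t va, uint8_t value)
{
    struct uncache_ctx ctx;
    bool cached = true;
    int ret = -1;
    uint8_t *p = translate(uncache(va, &ctx), &cached);

    if (p != NULL && !cached) {
        *p = value;   /* a real driver issues the CFI program sequence here */
        ret = 0;
    }
    restore_cache(&ctx);
    return ret;
}
```

A caller would still pass the normal (cacheable) address, for example 
flash_write_byte(CACHED_BASE + 3, 0x5A); which of the three strategies 
is used stays hidden behind uncache(), so the driver itself does not 
need to know what the CPU requires.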
>
> Best regards,
>
> Wolfgang Denk
>
>   
Best Regards,
Bill Campbell