[U-Boot] Virtual addresses, u-boot, and the MMU

Tue Sep 1 21:21:23 CEST 2009

Dear "J. William Campbell",

In message <4A9D5EF2.4030307 at comcast.net> you wrote:
>       I have followed the recent discussions about problems in the CFI 
> driver caused by the need to change the attributes of the address at 
> which the flash is mapped. This discussion has raised some questions in 
> my mind regarding the assumptions u-boot makes regarding the behavior of 
> the addresses used in the code. I am writing this message for comments 
> that may correct any mis-understandings I have. First, the addresses 
> used by the u-boot program to reference memory and code are all 
> "virtual" addresses (VA) because they go through the MMU  to become a 
> physical address(PA). If there is no MMU, the virtual address and the 
> physical address are identical.

Even if there is an MMU, we keep it "switched off" on most systems, or
otherwise set it up in such a way that there is still a 1:1 mapping,
i.e. the virtual address and the physical address are identical, too.

There have been several discussions concerning this topic on IRC
(#u-boot at freenode); I'll try to summarize them here. I'm not sure
I have caught everything, so please feel free to add whatever is
missing.

[Times are MET/MEST]

Nov 20, 2008:

[19:40:03] galak: do we think of nand->IO_ADDR_R as a physical or virtual addr?
[19:40:10] scottwood: Wouldn't it need to change on Linux too?
[19:40:42] galak: don't know do we does drivers/mtd/nand/fsl_elbc_nand.c look the same in linux?
[19:40:48] scottwood: IO_ADDR_R is virtual
[19:41:01] scottwood: So it needs to change with virt != phys
[19:41:51] galak: ok, so this loop in board_nand_init():
[19:41:51] galak:         for (priv->bank = 0; priv->bank < MAX_BANKS; priv->bank++) {
[19:41:51] galak:                 br = in_be32(&elbc_ctrl->regs->bank[priv->bank].br);
[19:41:51] galak:                 or = in_be32(&elbc_ctrl->regs->bank[priv->bank].or);
[19:41:51] galak:                 
[19:41:52] galak:                 if ((br & BR_V) && (br & BR_MSEL) == BR_MS_FCM &&
[19:41:54] galak:                     (br & or & BR_BA) == nand->IO_ADDR_R)
[19:41:56] galak:                         break;
[19:41:58] galak:         }
[19:42:24] galak: use comparing a virt (IO_ADDR_R) with physical (br & or & BR_BA)
[19:43:25] scottwood: Right, it assumess identity mapping.  We'll need some way of finding the physical address now.
[19:45:06] scottwood: Does eLBC just ignore the upper 4 bits of the physical address, and rely on LAWs for that?
[19:45:16] galak: yep
[19:45:31] galak: which makes this slightly more annoying
[19:47:21] galak: we could have board code implement a function that deals with the mapping (given IO_ADDR_R, it hands bank the bank #)
[19:47:36] scottwood: I'd rather not make it board-specific.
[19:47:56] scottwood: Is there a general virt-to-phys function in u-boot?
[19:49:10] galak: nope
[19:49:27] scottwood: There probably should be.
[19:49:38] galak: agreed, but that a huge overhaul
[19:50:29] scottwood: How many other platforms have virt != phys?
[19:50:56] galak: not sure, probably none
[19:51:10] scottwood: So virt_to_phys could do tlbsx on book-e and return the argument on everything else.
[19:51:36] galak: yeah, we could do that
[19:52:23] galak: we should probably have it walk BATs on classic
[19:52:33] scottwood: yeah
[19:53:57] galak: what do you think the API for virt_to_phys() looks like?  (how to report error if no mapping exists?)
[19:54:54] scottwood: Could return ~0, or have the physical address returned via pointer.
[19:57:32] galak: ok..will work something up
[19:58:17] scottwood: thanks
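
To make the API question above concrete, here is a minimal sketch of
the second option scottwood mentions: returning the physical address
through a pointer and using the return value only for the error. The
table-driven lookup, the struct and function names, and the example
addresses are all assumptions made for illustration; a real
implementation would query the TLB (book-e) or the BATs (classic), and
would simply return its argument on identity-mapped systems.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t phys_addr_t;	/* wide enough for 36-bit physical addresses */

/* Hypothetical mapping entry; stands in for a TLB or BAT lookup. */
struct va_map {
	uintptr_t   va_start;
	size_t      size;
	phys_addr_t pa_start;
};

/* Illustrative values only: a 256 MiB window at VA 0xe0000000 that
 * maps to a physical address above 4 GiB. */
static const struct va_map example_map[] = {
	{ 0xe0000000u, 0x10000000u, 0xfe0000000ull },
};

/* Returns 0 and stores the physical address in *pa on success,
 * or -1 if no mapping covers va (*pa is left untouched). */
static int virt_to_phys_sketch(const struct va_map *map, size_t n,
			       uintptr_t va, phys_addr_t *pa)
{
	assert(map != NULL && pa != NULL);
	for (size_t i = 0; i < n; i++) {
		if (va >= map[i].va_start &&
		    va - map[i].va_start < map[i].size) {
			*pa = map[i].pa_start + (va - map[i].va_start);
			return 0;
		}
	}
	return -1;
}
```

Returning the error separately avoids having to reserve ~0 as a
sentinel when the physical address type is wider than a pointer.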

Nov 26, 2008:

[20:09:43] beckyb: it's the whole virtual vs physical address question
[20:10:04] beckyb: when I did the 36b stuff that's in the tree now, I went for the minimally invasive approach
[20:10:27] beckyb: so, at the moment, all the command-line stuff takes a *virtual* address as an argument
[20:10:41] beckyb: I'm wondering if that's really what we want
[20:17:18] beckyb: wdenk_: The whole issue has come up because with galak's map_physmem patches, we have to actually start distinguishing between the 2 kinds of addresses inside u-boot
[20:17:50] beckyb: right now, the 36-bit code is taking advantage of the fact that u-boot doesn't know the difference
[20:18:20] beckyb: so the *physical* address only really exists right now in the MMU mapping
[20:18:33] beckyb: and in a few other places that actually care about the PA
[20:18:39] beckyb: I'll stop babbling now 
[20:20:39] wdenk_: beckyb: I have to admit that I don't see the immediate problem yet.
[20:20:56] beckyb: wdenk_: OK, I'll babble some more 
[20:21:00] wdenk_: beckyb: So far, everybody seems to be happy with using virtual addresses.
[20:21:01] beckyb: Let me use a concrete example
[20:21:14] beckyb: wdenk_: and I'm *fine* if we stick with that
[20:21:26] beckyb: I just want to be sure that's what we really want
[20:21:54] beckyb: I've been working on the flash code, which stores the physical sector address in a structure, then calls map_physmem to get a va *most* of the time it uses it
[20:22:16] beckyb: I've corrected all the places where it treats a PA as a pointer, no problem
[20:22:33] beckyb: but the "fli" command right now displays the physical address
[20:22:48] beckyb: if we're sticking with VAs on the command line, I'd like to change "fli" to display the VAs
[20:23:00] beckyb: so that the cp commands and fli take the same addrs
[20:23:06] wdenk_: Hm... I can't tell for sure what we want either. 
[20:23:14] beckyb: it's a tough problem
[20:23:25] beckyb: using the VA in "fli" is a much less invasive solution
[20:23:44] wdenk_: But I agree that in such cases where it makes a difference, we should agree to use one consistent set of addresses.
[20:24:05] beckyb: if we go to using the "pa" on the command line, the changes get fairly invasive
[20:24:21] beckyb: as we have to go to phys_addr_t and strtoull
[20:24:44] beckyb: using the va will work *as long as u-boot doesn't start doing paging*
[20:25:03] beckyb: because using the VA assumes that mapping from VA to PA exists
[20:25:24] beckyb: otoh, if we use the PA, we have to change all the commands to call map_physmem to get a VA before doing anything
[20:25:25] wdenk_: Right. And I guess we can push this point at least out of this decade 
[20:25:36] beckyb: sure, which is why I went with the VA for now
[20:25:47] beckyb: but before I made any more code changes, I wanted to talk with you about it
[20:26:20] wdenk_: I think we should ask this again on the ML - I'm easily overlooking some aspects here.
[20:26:36] beckyb: ok, I'll craft a note and send it out
[20:26:41] beckyb: thanks
[20:26:44] wdenk_: My gut feeling is that VA's are just fine. Hey, that's still a boot loader.
[20:26:51] beckyb: exactly
[20:27:22] beckyb: I'll probably word the note as "we're going with VAs on the command line unless somebody gives me a very good reason not to "
[20:27:23] beckyb: 
[20:27:34] wdenk_: But I also see that we might need to extend some commands to operate on the PA's as well - in some cases you really want to be able to "md" or "mw" directly to some PCI addresses or similar.
[20:27:43] beckyb: right, that makes sense
[20:28:01] beckyb: at the least, we could start with a translation command
[20:28:19] wdenk_: good idea, that's even less intrusive.
[20:28:22] beckyb: that gives you the PA for a VA, so you can invoke the commands correctly.  It's a quick and easy fix
[20:28:31] wdenk_: But then, it doesn't solve the problem. 
[20:28:42] beckyb: nope, the user still has to be aware
[20:28:52] wdenk_: Assume you have a 32 bit system with 36 bit PA's.
[20:28:52] beckyb: The problem is, we don't really know what the user is thinking now
[20:29:07] beckyb: which I do 
[20:29:10] wdenk_: How do you enter a PA then?
[20:29:40] beckyb: the translate command would have to be able to parse that, but all the other commands could still take the VA
[20:30:14] wdenk_: We have this case with the 440SPe "katmai" board with 4 GB of RAM.
[20:30:52] beckyb: right, and there are some limitations there currently
[20:31:04] beckyb: like you can't "mtest" all of RAM, because you can't map it all in
[20:31:11] beckyb: I haven't fixed that yet
[20:31:44] beckyb: but the rearranging of the memory map I did will make it easier to deal with on 8641
[20:31:48] wdenk_: "mtest" is the smallest problem. You always can run it on a documented range only.
[20:31:59] beckyb: sure, it's just an example
[20:32:51] beckyb: the fundamental point is, you're limited to accessing what you can map in 32 bits, unless we start making wholesale changes to u-boot
[20:33:10] beckyb: because software sees things at VAs
[20:35:17] wdenk_: So we agree to use all VA's in U-Boot, plus initially we will add an "xlat" command, and eventually (where and when needed) we might extend commands to recognize and handle PA's (something like "md xxxx" for VA versus "md .xxxx" for PA or similar) ?
[20:35:39] beckyb: something like that, yeah, or maybe md -p xxxx
[20:36:23] beckyb: and we need to start watching new code coming in very carefully, to start weaning people off this whole VA=PA assumption
[20:36:38] beckyb: for common stuff, anyway
[20:36:49] wdenk_: beckyb: I think it might be better to ad a marker to the address argument in case we should have commands where we have to pass more than one address and want to use different ones - like "cp" from VA to PA or vice versa.
[20:36:58] beckyb: ah, right
[20:37:10] beckyb: do any other u-boot commands do this?
[20:37:41] wdenk_: take more than one address? well, let me think.... bootm does.
[20:37:48] wdenk_: cmp
[20:37:54] beckyb: no, I mean that have some sort of marker in an arg
[20:38:04] beckyb: sorry, I wasn't clear
[20:38:41] wdenk_: I'm not sure. At least we do this in the command names - like "cp" versus "cp.b"
[20:39:00] beckyb: true..... so what if we had the marker be p.
[20:39:17] beckyb: The "." by itself is going to screw me up 
[20:39:25] wdenk_: ".p" then?
[20:39:38] wdenk_: md feeddeadbeef.p
[20:40:01] beckyb: ah, it's at the end now. Sure, works for me.  And we should also accept .v for completeness
[20:40:43] wdenk_: We can also use a "p." / "v." prefix - I'm not sure if I have any preferences yet 
[20:41:12] beckyb: yeah, we can work that out later   I'm sure the list will have preferences!
[20:41:59] beckyb: I'll try to get something out to the list this afternoon

Becky then posted the summary of this discussion here:

http://thread.gmane.org/gmane.comp.boot-loaders.u-boot/50705

Note that there was general agreement among those who spoke up.
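
As an illustration of the ".p" / ".v" suffix idea from the log, a
parser for such an address argument could look roughly like the sketch
below. Nothing like this exists in the tree; the type and function
names are hypothetical, and standard C strtoull() is used here as a
stand-in for whatever helper U-Boot would actually use.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

enum addr_kind { ADDR_VIRT, ADDR_PHYS };

/* Hypothetical result type for a parsed command-line address. */
struct cmd_addr {
	enum addr_kind     kind;
	unsigned long long value;	/* 64 bits, so 36-bit PAs fit */
};

/* Parse a hex address with an optional ".p" (physical) or ".v"
 * (virtual) suffix, e.g. "feeddeadbeef.p". No suffix means virtual.
 * Returns 0 on success, -1 on malformed input. */
static int parse_cmd_addr(const char *arg, struct cmd_addr *out)
{
	char *end;

	assert(arg != NULL && out != NULL);
	errno = 0;
	out->value = strtoull(arg, &end, 16);
	if (end == arg || errno == ERANGE)
		return -1;	/* no digits at all, or value too large */

	if (*end == '\0' || strcmp(end, ".v") == 0)
		out->kind = ADDR_VIRT;
	else if (strcmp(end, ".p") == 0)
		out->kind = ADDR_PHYS;
	else
		return -1;	/* unknown trailing characters */
	return 0;
}
```

A trailing marker parses easily because no hex digit can be confused
with the "." separator; a "p." / "v." prefix would work just as well.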

>         The "normal", or legacy,policy for u-boot is to arrange a memory 
> map such that all physical addresses are mapped to some virtual address. 
> Originally, the mapping were such that the VA was actually == the PA, 
> but today on some CPUs, this is not possible. When the size of the 
> physical address space exceeds the size of the virtual address space, 
> the VA may not =- the PA numerically, but there is a one-to-one 
> correspondence. It MAY also acceptable to map the same PA so that it 
> appears more than once in the address space (VA), but if this is done, 

This may or may not be possible. It may even make sense or be needed
on some systems, and it may be impossible to do on others.

In any case, I think we should be careful not to mix things: what we
are discussing here are address mappings. What we are not discussing
is specific memory properties like being cached/uncached, guarded/
non-guarded, etc.

Such properties are important, too, but they need to get handled
through a separate interface.

>       Becky Bruce "re-wrote the driver to use VAs instead of PAs." I am 
> not exactly sure what this means, but I assume it meant allowing the VA 
> referencing the flash
> to be distinct from the PA where the flash "lives" (and may require 36 
> bits on a PPC system to represent the PA). Does the driver re-map 

I think the information provided above sheds more light on this.

> portions of the flash in order to access them? If the flash is really 
> large, I can certainly see a need to do so. However, I assume on "medium 
> size" flashes, it does not need to remap. In that case, don't all 
> references just go through the MMU and get translated? The VA != PA, but 
> from the point of view of u-boot, the VA is the only address that 
> matters. The AVR32 certainly does not map flash dynamically, so it would 
> not matter on that CPU.

OK.

>       The issue with the CFI driver on the AVR32 is that it needs to 
> disable cache on the address space of the flash memory when it is 
> writing to the flash. This apparently is not trivial to do, but there is 

Actually this is not specific to the AVR32, and so far most systems
simply do not enable caches at all on the flash memory regions. I
understand why the AVR32 solution is interesting, and I think that
when we look for a solution we should take the chance to find one
that also allows other systems to turn on caches on the flash memory;
things like loading the Linux kernel or ramdisk images will benefit
from that.

> a second mapping that does have the cache off. Wolfgang has recommended 
> the creation of a function to turn off the cache for the flash area, and 
> also (presumably) one to turn the flash back on when the write is 
> complete. Haavard has at present a function that returns an alternate 

Right. My rationale for this was the wish to provide such a solution
for all systems, including those that don't allow several mappings
(with different attributes) for the same physical memory region.

> address with the cache already off that addresses the same memory. This 
> wouldn't cause a problem if the mapping happened immediately before the 
> actual copy operation took place and was used for nothing else. However, 
> if it happens early on in the driver, the address will not match the 
> structure set up by the rest of the flash code using the non-translated 
> address.

And this method will not work on any system that doesn't support such
multiple mappings.
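
To illustrate the difference, a rough sketch of the "turn the cache
off, write, turn it back on" approach follows. The hook structure and
all names are hypothetical, not an existing U-Boot interface, and plain
RAM stands in for the flash so that the fragment is self-contained. The
point is only that the address used for the write never changes, so it
always matches what the rest of the flash code has stored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-platform hooks; either may be NULL when the flash
 * region is never cached, which is the common case today. */
struct flash_cache_ops {
	void (*cache_off)(void *base, size_t len);
	void (*cache_on)(void *base, size_t len);
};

/* Demo hooks that only count calls; real ones would change the MMU
 * or cache attributes for the region. */
static int demo_off_calls, demo_on_calls;

static void demo_cache_off(void *base, size_t len)
{
	(void)base; (void)len;
	demo_off_calls++;
}

static void demo_cache_on(void *base, size_t len)
{
	(void)base; (void)len;
	demo_on_calls++;
}

static const struct flash_cache_ops demo_ops = {
	demo_cache_off, demo_cache_on
};

/* Stand-in for a programming routine. The destination address is the
 * same before, during and after the write; only the attributes of
 * the region are changed around it. */
static int flash_write_sketch(const struct flash_cache_ops *ops,
			      uint8_t *dst, const uint8_t *src, size_t len)
{
	volatile uint8_t *p = dst;

	assert(dst != NULL && src != NULL);
	if (ops && ops->cache_off)
		ops->cache_off(dst, len);
	for (size_t i = 0; i < len; i++)
		p[i] = src[i];
	if (ops && ops->cache_on)
		ops->cache_on(dst, len);
	return 0;
}
```

On systems that leave the flash uncached, passing NULL hooks makes the
bracket a no-op.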

>         Therefore, I have the following questions: If the map_physmem() 
> macro is removed from the driver on the AVR32, does the driver work if 
> it is told that the flash PA=VA = the un-cached address? If not, why 

In the current state the situation PA=VA=un-cached is the default on
almost all systems, and is working fine there.

> not? Shouldn't this be just like any CFI on an un-cached PPC address? 
> The driver will be somewhat slower reading but otherwise it should work. 
> If/when it does work, couldn't a map_in_cache() macro be placed directly 
> in front of the read code that copies data from flash to other buffers. 

While there are specific routines to "write" to the flash (init,
erase, write), there is no specific code to "read" from flash.
Reading is allowed everywhere by just performing load instructions
from this memory area. Neither the CFI driver nor any other flash
driver is needed or involved to do that. That's the whole big
advantage of NOR flash (which makes it _memory_) over storage devices
like NAND etc. (which are _not_ memory).

You would have to add this macro everywhere - on anything that
accesses memory: all commands that take memory addresses either
directly or indirectly, each and every load instruction. That's not
practical. [OK, you could probably set up the MMU to trap on read
accesses to such a memory region, but that would not exactly be
simpler either.]

> The macro would return an address of the same data referenced through a 
> cached address if it exists. This address would go nowhere else and 
> never be stored anywhere. This would speed up the copy operation for 
> situations where it matters, and is applicable to all platforms that can 
> do such a thing. The most general solution would be a call to 
> map_in_cache/map_in_not_cache for both reads and writes in the CFI 

No. The CFI driver is not used for read operations, so this does not
work.

> driver. These routines would return a "substitute" address (or the same 
> input one), and may actually add another mapping dynamically, use a 
> pre-existing appropriate mapping, just turn on/turn off data cache 
> globally, or do nothing). At the end of the copy, map_restore() would 
> put the map back  By default, the assumption would be that the flash is 
> not cached and the macros do nothing. Sounds simple to me, what have I 
> overlooked?

The only thing you overlooked in my opinion is that read accesses to
NOR flash are plain memory accesses that are not handled by the CFI
or any other driver.

Best regards,

Wolfgang Denk

-- 
DENX Software Engineering GmbH,     MD: Wolfgang Denk & Detlev Zundel
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd at denx.de
All repairs tend to destroy the structure, to  increase  the  entropy
and  disorder  of the system. Less and less effort is spent on fixing
original design flaws; more and more is spent on fixing flaws  intro-
duced by earlier fixes.       - Fred Brooks, "The Mythical Man Month"