[U-Boot] uboot for MIPS: need help to skip relocate uboot and start uboot from RAM

Aaron Williams Aaron.Williams at caviumnetworks.com
Thu Mar 10 07:05:07 CET 2011


Hi Pandurang,

We solved this problem by using TLB mapping for U-Boot on our MIPS platforms.

This was also due to the fact that we need to load U-Boot at the top of 
physical memory which is often unreachable with 32-bit addressing. By doing 
this we always link U-Boot at address 0xC0000000 and it doesn't care where 
it's actually loaded in physical memory. The only thing we have to be careful 
about is that any drivers that expect pointers to contain physical addresses 
need to use a macro for mapping (and handle the case where the physical 
address is 64 bits wide).
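
As a rough illustration of what such a mapping macro has to do, here is a 
minimal C sketch. It is not the code from our tree: the names 
uboot_virt_to_phys and set_uboot_phys_base, the 4MB window size and the 
ADDR_UNMAPPED sentinel are assumptions made for the example, and it assumes 
the conventional KSEG0/KSEG1 windows for everything outside the TLB mapping.

```c
#include <assert.h>
#include <stdint.h>

/* Link address from the post; the 4MB window size is an assumption. */
#define UBOOT_VIRT_BASE  0xC0000000u
#define UBOOT_MAP_SIZE   0x00400000u
/* Hypothetical sentinel for addresses this sketch cannot translate. */
#define ADDR_UNMAPPED    UINT64_MAX

/* Physical load address, assumed to be recorded by early boot code. */
static uint64_t uboot_phys_base;

static void set_uboot_phys_base(uint64_t phys)
{
    /* The base must be aligned to the mapped window. */
    assert((phys & (UBOOT_MAP_SIZE - 1)) == 0);
    uboot_phys_base = phys;
}

/* Translate a 32-bit virtual pointer value to a 64-bit physical address. */
static uint64_t uboot_virt_to_phys(uint32_t vaddr)
{
    /* Inside the TLB-mapped U-Boot image: offset from the load address. */
    if (vaddr >= UBOOT_VIRT_BASE && vaddr - UBOOT_VIRT_BASE < UBOOT_MAP_SIZE)
        return uboot_phys_base + (vaddr - UBOOT_VIRT_BASE);

    /* KSEG0/KSEG1 (unmapped windows): strip the top three bits. */
    if (vaddr >= 0x80000000u && vaddr < 0xC0000000u)
        return vaddr & 0x1FFFFFFFu;

    return ADDR_UNMAPPED;
}
```

A driver would pass the result to the DMA engine instead of the raw pointer. 
Note that the return type is 64 bits even though the pointer is 32 bits.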

We have to load U-Boot at the top of physical memory since we have large 
applications and operating systems that need all available physical memory and 
some 32-bit operating systems need all the low physical memory they can get.

This also allows us to use the same U-Boot image when booting over PCI/PCIE, 
JTAG or from multiple copies we store in flash (standard and failover). Early 
in the U-Boot process we check one of our GPIO lines to see if it's set or 
not. If set we just scan flash for a second copy of U-Boot and branch there, 
otherwise we continue executing and run in "failsafe" mode. With this we don't 
even care about the size of U-Boot, as long as it remains under 4MB we're 
fine.
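
The selection logic can be modelled in C roughly as follows. This is only a 
sketch under assumptions: the 4-byte tag "UBTH", the scan step and the 
function names are hypothetical, since the real header format and GPIO 
access are not shown in this message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define IMG_MAGIC      "UBTH"        /* hypothetical 4-byte header tag */
#define IMG_NOT_FOUND  ((size_t)-1)

/* Scan flash at 'step' intervals for an image header other than our own. */
static size_t find_other_image(const uint8_t *flash, size_t len,
                               size_t step, size_t self_off)
{
    assert(step > 0);
    for (size_t off = 0; off + 4 <= len; off += step) {
        if (off == self_off)
            continue;                /* skip the copy we are running from */
        if (memcmp(flash + off, IMG_MAGIC, 4) == 0)
            return off;
    }
    return IMG_NOT_FOUND;
}

/* GPIO set: branch to the second copy if one exists.
 * GPIO clear, or no second copy: keep running the current (failsafe) copy. */
static size_t choose_boot_offset(int gpio_set, const uint8_t *flash,
                                 size_t len, size_t step, size_t self_off)
{
    if (gpio_set) {
        size_t other = find_other_image(flash, len, step, self_off);
        if (other != IMG_NOT_FOUND)
            return other;
    }
    return self_off;
}
```

Because every copy is linked at the same virtual address, the copy that is 
found can be mapped and entered without any fixups.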

Granted, the "proper" way is to use ELF relocation, but that does not work 
when physical memory requires 64-bit addressing, and it is currently 
unsupported for MIPS.

We first set up the TLB mapping in start.S after we discover where we are 
executing from in flash or DRAM then do it again in the code that copies U-
Boot from flash to DRAM. The initialization code can also detect if it's 
already running out of RAM and skip the DRAM initialization as in the case 
where we boot over PCI/PCIE/EJTAG where the host initializes the memory 
controller. U-Boot no longer cares at all where it is executing from in 
physical memory.

We just set CONFIG_SYS_MONITOR_BASE to 0xC0000000 and go from there.

On MIPS this is actually quite simple and just requires a single entry in the 
TLB table. We use the last entry.
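
For reference, finding the index of the last entry can be written in C as 
below. This is a sketch that mirrors the register decoding done in 
single_tlb_setup later in this message; the function name last_tlb_index is 
made up for the example, and the raw Config1/Config3/Config4 values are 
passed in as plain integers so the logic can be tried on a host.

```c
#include <stdint.h>

/* Return the highest TLB index, given raw CP0 Config register values.
 * Config1[30:25] holds (number of TLB entries - 1).
 * Config3[31] set means Config4 is present.
 * Config4[14] set is taken to mean Config4[7:0] extends the size field,
 * as in the assembly version. */
static unsigned last_tlb_index(uint32_t config1, uint32_t config3,
                               uint32_t config4)
{
    unsigned idx = (config1 >> 25) & 0x3F;

    if (!(config3 & (1u << 31)))
        return idx;                  /* no Config4 register */
    if (!(config4 & (1u << 14)))
        return idx;                  /* no size extension field */

    /* Place the extension bits above the 6 bits from Config1. */
    idx |= (config4 & 0xFFu) << 6;
    return idx & 0x3FFF;
}
```

With 64 entries and no extension this yields 63. With the extension field 
set to 1 on top of that it yields 127.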

We just have to be careful to clear the TLB entry before executing any 
application or operating system on core 0 (where U-Boot resides).

We basically do this:

	/* Set up TLB registers to clear desired entry.  The actual
	 * tlbwi instruction is done in ASM when running from unmapped DRAM
	 */
	write_64bit_c0_entrylo0 (0);
	write_c0_pagemask (0);
	write_64bit_c0_entrylo1 (0);
	write_64bit_c0_entryhi (0xFFFFFFFF91000000ull);
	write_c0_index (get_num_tlb_entries () - 1);

	asm volatile ("       .set push         \n"
		      "       .set mips64       \n"
		      "       .set noreorder    \n"
		      "       move  $4, %[arg0] \n"
		      "       move  $5, %[arg1] \n"
		      "       move  $6, %[arg2] \n"
		      "       move  $7, %[arg3] \n"
		      "       move  $8, %[arg4] \n"
		      "       j     %[addr]     \n"
		      "       nop               \n"
		      "       .set pop          \n"::[arg0] "r" (arg0),
		      [arg1] "r" (arg1),
		      [arg2] "r" (arg2),
		      [arg3] "r" (arg3),[arg4] "r" (arg4),[addr] "r" (addr)
		      :"$4", "$5", "$6", "$7", "$8");

Which calls:

/*
 * Launch 64-bit Linux kernel entry point from a 32-bit U-boot
 * a0-a3 normal args, set up by C code.  We never come back,
 * so we keep this simple.
 * a4 is entry point
 * Calling C code sets up TLB to be ready for a write that clears the TLB
 * entry that u-boot uses.  This code is executed from XKPHYS address space
 * to allow the TLB entry to be removed.
 */
        .globl asm_launch_linux_entry_point
        .ent   asm_launch_linux_entry_point
asm_launch_linux_entry_point:
        tlbwi
        j       a4
        cache   0, 0($0)              /* Flush icache in delay slot*/
        .end   asm_launch_linux_entry_point

Our relocate code basically looks like:

/*
 * void relocate_code (addr_sp, gd, addr_moni)
 *
 * This "function" does not return, instead it continues in RAM
 * after relocating the monitor code.
 *
 * a0 = addr_sp
 * a1 = gd address (on stack)
 * a2 = destination address (physical)
 */
        .globl  relocate_code
        .ent    relocate_code
relocate_code:
	la	t9, relocate_code_octeon
	j	t9
	move	a3, zero	/* No mapping */
	.end    relocate_code

/*
 * void relocate_code_octeon (addr_sp, gd, addr_moni)
 *
 * This "function" does not return, instead it continues in RAM
 * after relocating the monitor code.
 *
 * a0 = addr_sp
 * a1 = gd address (on stack)
 * a2 = destination address (physical)
 * a3 = TLB page size (when TLB mapping is used)
 */

	.globl	relocate_code_octeon
	.ent	relocate_code_octeon
relocate_code_octeon:
        move    v0, a1  /* Save gd address */
        move    sp, a0          /* Set new stack pointer                */


        li      a4, CONFIG_SYS_MONITOR_BASE /* Text base, 0xC0000000 */
        la      a7, in_ram
        lw      a6, -12(a7)     /* a6 <-- uboot_end_data        */
        move    a5, a2

        /*
         * a4 = source address
         * a5 = target address
         * a6 = source end address
         */

/* Use 64 bit copies to relocate code for speed.  We need to be careful to
 * not copy too much as BSS comes immediately after the initialized data,
 * and bss clearing is done _before_ the copy, so if too much is copied we get
 * garbage in some bss variable(s).
 * The Linker script is constructed to align the end of the initialized data
 * so that we can use 8 byte chunks.
 */
        beq     a4, a5, copyDone
1:
        ld      a7, 0(a4)
        sd      a7, 0(a5)
        daddu   a4, 8
        blt     a4, a6, 1b
        daddu   a5, 8                   /* delay slot                   */

        /* If caches were enabled, we would have to flush them here.
         */
copyDone:

        /* Jump to where we've relocated ourselves.
         */

        /* We now need to redo the TLB.  We can call it directly
         * since we are now running from the linked address.
         */
        /* Now replace the single TLB mapping that was set up in flash. */
        move    a1, a2

        la      a0, _start
        /* Mapping size in a3 from above */
        move    a2, a3
        jal     single_tlb_setup
        nop

        /* We aren't changing execution (virtual) addresses,
         * so we don't need any address fixups here.
         */
        la      a4, in_ram
        j       a4
        nop

        .globl single_tlb_setup
        .ent   single_tlb_setup
        .align 8
        /* Sets up a single TLB entry.  Virtual/physical addresses
         * must be properly aligned.
         * a0  Virtual address
         * a1  Physical address
         * a2  page (_not_ mapping) size
         */
single_tlb_setup:

        /* Determine the number of TLB entries available, and
         * use the top one.
	 */
        mfc0    a3, COP0_CONFIG1_REG
        srl     a3, a3, 25
        mfc0    a5, COP0_CONFIG3_REG /* Check if config4 reg present */
        bbit0   a5, 31, single_tlb_setup_cont
        and     a3, a3, 0x3F         /* a3 now has the max mmu entry index */
        mfc0    a5, COP0_CONFIG4_REG
        bbit0   a5, 14, single_tlb_setup_cont	/* check config4[MMUExtDef] */
        nop
        /* append config4[MMUSizeExt] to most significant bit of
	 * config1[MMUSize-1]
	 */
        ins     a3, a5, 6, 8
        and     a3, a3, 0x3fff	/* a3 now includes max entries for cn6xxx */

single_tlb_setup_cont:

        /* Format physical address for entry low */
        nop
        dsrl    a1, a1, 12
        dsll    a1, a1, 6
        ori     a1, a1, 0x7	/* set DVG bits */

        move    a4, a2
        dadd    a5, a4, a4	/* mapping size */
        dsll    a6, a4, 1
        daddi   a6, a6, -1	/* pagemask */
        dsrl    a4, a4, 6	/* adjust for adding with entrylo */

        /* Now set up mapping */
        mtc0    a6, COP0_PAGEMASK_REG
        mtc0    a3, COP0_INDEX_REG

        dmtc0   a1, COP0_ENTRYLO0_REG
        dadd    a1, a1, a4

        dmtc0   a1, COP0_ENTRYLO1_REG
        dadd    a1, a1, a4

        dmtc0   a0, COP0_ENTRYHI_REG
        dadd    a0, a0, a5

        ehb
        tlbwi
        jr  ra
        nop
        .end   single_tlb_setup

Note that this code would have to be modified for other MIPS platforms since 
it uses instructions that are not available on every MIPS core (e.g. bbit0 
(branch if bit clear), which is Octeon-specific, and ins (insert bits)). The 
code also uses the n32 ABI with 64-bit support enabled. It could easily be 
adapted to other MIPS platforms and could be used on 32-bit platforms as well.
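
For anyone porting this, the register values that single_tlb_setup computes 
can be checked on a host with a small C model. This is a sketch only: the 
struct and function names are invented for the example, and it reproduces 
the arithmetic of the assembly above (cache attribute bits left at zero, D, 
V and G bits set) rather than being a general TLB helper.

```c
#include <assert.h>
#include <stdint.h>

/* Values written to the CP0 registers for one even/odd page pair. */
struct tlb_entry {
    uint64_t entryhi;      /* virtual address of the pair */
    uint64_t entrylo0;     /* even page: PFN plus flags */
    uint64_t entrylo1;     /* odd page: PFN plus flags */
    uint64_t pagemask;     /* value written to PageMask */
    uint64_t mapping_size; /* total bytes mapped (two pages) */
};

static struct tlb_entry make_tlb_entry(uint64_t vaddr, uint64_t paddr,
                                       uint64_t page_size)
{
    struct tlb_entry e;

    /* Page size must be a power of two of at least 4KB. */
    assert(page_size >= 4096 && (page_size & (page_size - 1)) == 0);
    /* The pair covers 2 * page_size of virtual space. */
    assert((vaddr & (2 * page_size - 1)) == 0);
    assert((paddr & (page_size - 1)) == 0);

    /* PFN goes in bits 6 and up; 0x7 sets the D, V and G bits. */
    e.entrylo0 = ((paddr >> 12) << 6) | 0x7;
    /* The odd page starts page_size later; scaled the same way. */
    e.entrylo1 = e.entrylo0 + (page_size >> 6);
    e.pagemask = (page_size << 1) - 1;
    e.entryhi = vaddr;
    e.mapping_size = 2 * page_size;
    return e;
}
```

For example, a 4MB page size with U-Boot loaded at physical 0x10000000 gives 
EntryLo0 = 0x400007, EntryLo1 = 0x410007 and an 8MB mapping at 0xC0000000.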

I've met a lot of resistance to the idea of using virtual memory for the boot 
loader, but it really simplifies things.

The only thing we have to relocate is bd and gd where we copy them from cache 
to DRAM after we initialize DRAM. All the other relocations disappear so we 
just link at the fixed address.

The only modifications we've had to make to U-Boot because of this are macros 
added to the USB EHCI driver and the E1000 driver (the latter only as an 
exercise) to map virtual addresses to physical addresses. Our platform drivers 
already take this into account. Any other drivers that perform DMA would also 
need to use macros to convert pointers to physical addresses, which really 
should be present anyway since KSEG0 pointer addresses are not always physical 
addresses.

We can always load U-Boot into the top of memory, whether there's 256MB or 
8+GB of physical memory installed without having to make major changes to U-
Boot to make it fully support 64-bit addressing.

With this there's no ELF relocation, fixups or anything else to worry about 
and it greatly simplifies things.

-Aaron

On Thursday, March 03, 2011 10:23:11 pm Wolfgang Denk wrote:
> Dear Pandurang Kale,
> 
> In message <AANLkTinTqxJPU9Gwye_8pT2PcUxR8E36=zm78ypc1740 at mail.gmail.com>
> you wrote:
> > For MIPS I do not find the TEXT_BASE symbol, there is
> > SYS_CFG_MONITOR_BASE
> 
> Please check again. MIPS uses CONFIG_SYS_TEXT_BASE like all other
> architectures.
> 
> > which it uses to relocate the code from the define symbol to high RAM
> > address. how can I avoid this?  As I see we have a switch defined for
> > ARM,
> 
> You should not try to avoid this.  It is a useful feature, even if you
> load U-Boot to RAM separately.
> 
> > CONFIG_SKIP_RELOCATE_UBOOT,  to skip the code relocation I cant find a
> > similar instance in MIPS code. Can you please throw some light on getting
> > the TEXT_BASE setting correctly for MIPS code? how can I do that?
> 
> You did not understand what I wrote:
> > > > I can see there is a switch for ARM processor,
> > > 
> > > CONFIG_SKIP_RELOCATE_UBOOT,
> > > 
> > > Are you looking at recent code and working boards?
> > 
> > I have recent uboot code for MIPS and I cant find any similar switch for
> > MIPS codebase. arch/mips/lib/board.c and arch/mips/cpu/start.S
> 
> I meant: do you see CONFIG_SKIP_RELOCATE_UBOOT in recent ARM code, on
> working (compilable) ARM boards?
> 
> > > You do not want to skip relocation.  U-Boot may need to auto-adjust
> > > its start address dynamically, depending on configuration, system
> > > requirements and/or environment settings.
> > 
> > The uboot is already loaded in the RAM (by the primary boot loader) so I
> > dont want uboot to again relocate itself from one location of RAM to its
> > predefined high-memory region in RAM which I have explained in my first
> > mail.
> 
> Please re-read what I wrote.  In general, U-Boot's load address cannot
> be determined at compile time, at least not without crippling it from
> some interesting features.  You should really not try doing things
> differently to everybody else.  We had similar discussions not so long
> ago for ARM, so please just re-read this in the archives.
> 
> Best regards,
> 
> Wolfgang Denk

