[U-Boot] uboot for MIPS: need help to skip relocate uboot and start uboot from RAM

Thu Mar 10 18:37:20 CET 2011

Hi Aaron,

Thank you for such a detailed explanation. It was a great help to me.

Thanks,
Pandu

On Thu, Mar 10, 2011 at 6:05 AM, Aaron Williams <
Aaron.Williams at caviumnetworks.com> wrote:

> Hi Pandurang,
>
> We solved this problem by using TLB mapping for U-Boot on our MIPS
> platforms.
>
> This was also due to the fact that we need to load U-Boot at the top of
> physical memory, which is often unreachable with 32-bit addressing. By doing
> this we always link U-Boot at address 0xC0000000 and it doesn't care where
> it's actually loaded in physical memory. The only thing we have to be careful
> about is that any drivers that expect pointers to contain physical addresses
> need to use a macro for mapping (and handle the case where the physical
> address is 64 bits).
>
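A minimal sketch of the kind of mapping macro described above, written as a small C helper. The names UBOOT_VIRT_BASE, UBOOT_MAP_SIZE, uboot_phys_base and uboot_virt_to_phys are hypothetical and do not come from the original code; the 8 MB window size is an assumption, and only the 0xC0000000 link address is taken from the message.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values: U-Boot is linked at 0xC0000000 and covered by one TLB
 * entry spanning UBOOT_MAP_SIZE bytes.  All names here are hypothetical. */
#define UBOOT_VIRT_BASE 0xC0000000u
#define UBOOT_MAP_SIZE  0x00800000u   /* assumed 8 MB mapping */

/* Physical address the image was actually loaded at; a real port would
 * record this early in boot.  It may lie above 4 GB. */
static uint64_t uboot_phys_base;

/* Translate a virtual address inside the mapped image to a physical
 * address.  The result is 64 bits wide so that load addresses beyond the
 * reach of 32-bit addressing are not truncated. */
static uint64_t uboot_virt_to_phys(uint32_t vaddr)
{
    assert(vaddr >= UBOOT_VIRT_BASE);
    assert(vaddr - UBOOT_VIRT_BASE < UBOOT_MAP_SIZE);
    return uboot_phys_base + (vaddr - UBOOT_VIRT_BASE);
}
```

A driver that hands a buffer address to a DMA engine would pass the pointer through a helper like this instead of using the pointer value directly.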
> We have to load U-Boot at the top of physical memory since we have large
> applications and operating systems that need all available physical memory,
> and some 32-bit operating systems need all the low physical memory they can
> get.
>
> This also allows us to use the same U-Boot image when booting over PCI/PCIE,
> JTAG or from multiple copies we store in flash (standard and failover). Early
> in the U-Boot process we check one of our GPIO lines to see if it's set or
> not. If set, we just scan flash for a second copy of U-Boot and branch there;
> otherwise we continue executing and run in "failsafe" mode. With this we don't
> even care about the size of U-Boot; as long as it remains under 4MB we're
> fine.
>
> Granted, the "proper" way is to use ELF relocation, but that does not work
> when 64-bit addressing is required for physical memory, and it is
> currently unsupported for MIPS.
>
> We first set up the TLB mapping in start.S after we discover where we are
> executing from in flash or DRAM, then do it again in the code that copies
> U-Boot from flash to DRAM. The initialization code can also detect if it's
> already running out of RAM and skip the DRAM initialization, as in the case
> where we boot over PCI/PCIE/EJTAG, where the host initializes the memory
> controller. U-Boot no longer cares at all where it is executing from in
> physical memory.
>
> We just set CONFIG_SYS_MONITOR_BASE to 0xC0000000 and go from there.
>
> On MIPS this is actually quite simple and just requires a single entry in the
> TLB table. We use the last entry.
>
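As a rough illustration of how "the last entry" can be found, the sketch below redoes in C the register decoding that single_tlb_setup performs in assembly further down in the quoted message. The function name last_tlb_index is hypothetical, and the register values are passed in as plain integers (an assumption made so the logic can be read and tested off-target); real code would read them from coprocessor 0.

```c
#include <stdint.h>

/* Return the highest TLB index, following the same steps as the quoted
 * assembly:
 *   - Config1 bits 30:25 hold (number of TLB entries - 1)
 *   - Config3 bit 31 says whether Config4 exists
 *   - if Config4 bit 14 is set, the low 8 bits of Config4 extend the
 *     index above bit 5 (the "ins a3, a5, 6, 8" step)
 */
static uint32_t last_tlb_index(uint32_t config1, uint32_t config3,
                               uint32_t config4)
{
    uint32_t idx = (config1 >> 25) & 0x3F;

    if (!(config3 & (1u << 31)))      /* no Config4 register */
        return idx;
    if (!(config4 & (1u << 14)))      /* no size extension field */
        return idx;

    idx |= (config4 & 0xFFu) << 6;    /* append the extension bits */
    return idx & 0x3FFF;
}
```

With a Config1 field of 63 and no Config4, this yields index 63, i.e. a 64-entry TLB whose top entry holds the U-Boot mapping.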
> We just have to be careful to clear the TLB entry before executing any
> application or operating system on core 0 (where U-Boot resides).
>
> We basically do this:
>
>        /* Set up TLB registers to clear desired entry.  The actual
>         * tlbwi instruction is done in ASM when running from unmapped DRAM
>         */
>        write_64bit_c0_entrylo0 (0);
>        write_c0_pagemask (0);
>        write_64bit_c0_entrylo1 (0);
>        write_64bit_c0_entryhi (0xFFFFFFFF91000000ull);
>        write_c0_index (get_num_tlb_entries () - 1);
>
>        asm volatile ("       .set push         \n"
>                      "       .set mips64       \n"
>                      "       .set noreorder    \n"
>                      "       move  $4, %[arg0] \n"
>                      "       move  $5, %[arg1] \n"
>                      "       move  $6, %[arg2] \n"
>                      "       move  $7, %[arg3] \n"
>                      "       move  $8, %[arg4] \n"
>                      "       j     %[addr]     \n"
>                      "       nop               \n"
>                      "       .set pop          \n"::[arg0] "r" (arg0),
>                      [arg1] "r" (arg1),
>                      [arg2] "r" (arg2),
>                      [arg3] "r" (arg3),[arg4] "r" (arg4),[addr] "r" (addr)
>                      :"$4", "$5", "$6", "$7", "$8");
>
> Which calls:
>
> /*
>  * Launch 64-bit Linux kernel entry point from a 32-bit U-boot
>  * a0-a3 normal args, set up by C code.  We never come back,
>  * so we keep this simple.
>  * a4 is entry point
>  * Calling C code sets up TLB to be ready for a write that clears the TLB
>  * entry that u-boot uses.  This code is executed from XKPHYS address space
>  * to allow the TLB entry to be removed.
>  */
>        .globl asm_launch_linux_entry_point
>        .ent   asm_launch_linux_entry_point
> asm_launch_linux_entry_point:
>        tlbwi
>        j       a4
>        cache   0, 0($0)              /* Flush icache in delay slot*/
>        .end   asm_launch_linux_entry_point
>
> Our relocate code basically looks like:
>
> /*
>  * void relocate_code (addr_sp, gd, addr_moni)
>  *
>  * This "function" does not return, instead it continues in RAM
>  * after relocating the monitor code.
>  *
>  * a0 = addr_sp
>  * a1 = gd address (on stack)
>  * a2 = destination address (physical)
>  */
>        .globl  relocate_code
>        .ent    relocate_code
> relocate_code:
>        la      t9, relocate_code_octeon
>        j       t9
>        move    a3, zero        /* No mapping */
>        .end    relocate_code
>
> /*
>  * void relocate_code_octeon (addr_sp, gd, addr_moni)
>  *
>  * This "function" does not return, instead it continues in RAM
>  * after relocating the monitor code.
>  *
>  * a0 = addr_sp
>  * a1 = gd address (on stack)
>  * a2 = destination address (physical)
>  * a3 = TLB page size (when TLB mapping used)
>  */
>
>        .globl  relocate_code_octeon
>        .ent    relocate_code_octeon
> relocate_code_octeon:
>        move    v0, a1  /* Save gd address */
>        move    sp, a0          /* Set new stack pointer                */
>
>
>        li      a4, CONFIG_SYS_MONITOR_BASE /* Text base, 0xC0000000 */
>        la      a7, in_ram
>        lw      a6, -12(a7)     /* a6 <-- uboot_end_data        */
>        move    a5, a2
>
>        /*
>         * a4 = source address
>         * a5 = target address
>         * a6 = source end address
>         */
>
> /* Use 64 bit copies to relocate code for speed.  We need to be careful to
>  * not copy too much as BSS comes immediately after the initialized data,
>  * and bss clearing is done _before_ the copy, so if too much is copied we
>  * get garbage in some bss variable(s).
>  * The linker script is constructed to align the end of the initialized
>  * data so that we can use 8 byte chunks.
>  */
>        beq     a4, a5, copyDone
> 1:
>        ld      a7, 0(a4)
>        sd      a7, 0(a5)
>        daddu   a4, 8
>        blt     a4, a6, 1b
>        daddu   a5, 8                   /* delay slot                   */
>
>        /* If caches were enabled, we would have to flush them here.
>         */
> copyDone:
>
>        /* Jump to where we've relocated ourselves.
>         */
>
>        /* We now need to redo the TLB.  We can call it directly
>         * since we are now running from the linked address.
>         */
>        /* Now replace the single TLB mapping that was set up in flash. */
>        move    a1, a2
>
>        la      a0, _start
>        /* Mapping size in a3 from above */
>        move    a2, a3
>        jal     single_tlb_setup
>        nop
>
>        /* We aren't changing execution (virtual) addresses,
>         * so we don't need any address fixups here.
>         */
>        la      a4, in_ram
>        j       a4
>        nop
>
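For readers less used to MIPS assembly, the copy loop in relocate_code_octeon above can be sketched in C roughly as follows. The function name copy_image_64 is hypothetical; the sketch assumes, as the comment in the assembly says, that the end of the initialized data is aligned so that whole 8-byte chunks can be used.

```c
#include <stdint.h>

/* Copy [src, src_end) to dst in 8-byte chunks.
 * Like the assembly, it does nothing when source and destination are the
 * same address, and it stops at src_end so that nothing past the
 * initialized data (i.e. the already-cleared BSS) is overwritten. */
static void copy_image_64(uint64_t *dst, const uint64_t *src,
                          const uint64_t *src_end)
{
    if (dst == src)
        return;                 /* already in place, nothing to copy */

    do {
        *dst++ = *src++;        /* one 64-bit load and store */
    } while (src < src_end);
}
```

Note that, as in the original loop, at least one chunk is copied before the end address is checked.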
>        .globl single_tlb_setup
>        .ent   single_tlb_setup
>        .align 8
>        /* Sets up a single TLB entry.  Virtual/physical addresses
>         * must be properly aligned.
>         * a0  Virtual address
>         * a1  Physical address
>         * a2  page (_not_ mapping) size
>         */
> single_tlb_setup:
>
>        /* Determine the number of TLB entries available, and
>         * use the top one.
>         */
>        mfc0    a3, COP0_CONFIG1_REG
>        srl     a3, a3, 25
>        mfc0    a5, COP0_CONFIG3_REG /* Check if config4 reg present */
>        bbit0   a5, 31, single_tlb_setup_cont
>        and     a3, a3, 0x3F         /* a3 now has the max mmu entry index */
>        mfc0    a5, COP0_CONFIG4_REG
>        bbit0   a5, 14, single_tlb_setup_cont   /* check config4[MMUExtDef] */
>        nop
>        /* append config4[MMUSizeExt] to most significant bit of
>         * config1[MMUSize-1]
>         */
>        ins     a3, a5, 6, 8
>        and     a3, a3, 0x3fff  /* a3 now includes max entries for cn6xxx */
>
> single_tlb_setup_cont:
>
>        /* Format physical address for entry low */
>        nop
>        dsrl    a1, a1, 12
>        dsll    a1, a1, 6
>        ori     a1, a1, 0x7     /* set DVG bits */
>
>        move    a4, a2
>        dadd    a5, a4, a4      /* mapping size */
>        dsll    a6, a4, 1
>        daddi   a6, a6, -1      /* pagemask */
>        dsrl    a4, a4, 6       /* adjust for adding with entrylo */
>
>        /* Now set up mapping */
>        mtc0    a6, COP0_PAGEMASK_REG
>        mtc0    a3, COP0_INDEX_REG
>
>        dmtc0   a1, COP0_ENTRYLO0_REG
>        dadd    a1, a1, a4
>
>        dmtc0   a1, COP0_ENTRYLO1_REG
>        dadd    a1, a1, a4
>
>        dmtc0   a0, COP0_ENTRYHI_REG
>        dadd    a0, a0, a5
>
>        ehb
>        tlbwi
>        jr  ra
>        nop
>        .end   single_tlb_setup
>
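The arithmetic in single_tlb_setup above can be checked off-target with a small C sketch. The struct and function names (tlb_values, compute_tlb_values) are hypothetical; the formulas are transcribed from the assembly, and the alignment checks are an added assumption based on the comment that addresses "must be properly aligned".

```c
#include <assert.h>
#include <stdint.h>

struct tlb_values {
    uint64_t entrylo0;      /* even page: PFN plus D, V and G bits */
    uint64_t entrylo1;      /* odd page: the next page_size bytes  */
    uint64_t pagemask;      /* value written to PageMask           */
    uint64_t mapping_size;  /* two pages per TLB entry             */
};

/* Compute the values the assembly writes for one TLB entry that maps
 * two consecutive pages starting at physical address phys. */
static struct tlb_values compute_tlb_values(uint64_t phys, uint64_t page_size)
{
    struct tlb_values v;

    /* page_size must be a power of two of at least 4 KB, and phys must
     * be aligned to it (assumed preconditions). */
    assert(page_size >= 4096 && (page_size & (page_size - 1)) == 0);
    assert((phys & (page_size - 1)) == 0);

    v.entrylo0 = ((phys >> 12) << 6) | 0x7;      /* dsrl, dsll, ori */
    v.entrylo1 = v.entrylo0 + (page_size >> 6);  /* PFN + one page  */
    v.pagemask = (page_size << 1) - 1;
    v.mapping_size = 2 * page_size;
    return v;
}
```

For example, a 4 MB page size gives an 8 MB mapping, which would be enough for an image that stays under the 4MB limit mentioned earlier.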
> Note that this code would have to be modified for other MIPS platforms since
> it makes use of some instructions not normally found (e.g. bbit0 (branch if
> bit clear) and ins (insert bits)). The code also uses the n32 ABI with 64-bit
> support enabled. It could easily be adapted to other MIPS platforms and could
> be used on 32-bit platforms as well.
>
> I've met a lot of resistance to the idea of using virtual memory for the boot
> loader, but it really simplifies things.
>
> The only things we have to relocate are bd and gd, which we copy from cache
> to DRAM after we initialize DRAM. All the other relocations disappear, so we
> just link at the fixed address.
>
> The only modifications we've had to make to U-Boot due to this are adding
> macros to the USB EHCI driver and the E1000 driver (the last was only as an
> exercise) to map virtual addresses to physical addresses. Our platform drivers
> already take this into account. Any other drivers that perform DMA would also
> need to use macros to convert pointers to physical addresses, which really
> should be present anyway since KSEG0 pointer addresses are not always physical
> addresses.
>
> We can always load U-Boot into the top of memory, whether there's 256MB or
> 8+GB of physical memory installed, without having to make major changes to
> U-Boot to make it fully support 64-bit addressing.
>
> With this there's no ELF relocation, fixups or anything else to worry about
> and it greatly simplifies things.
>
> -Aaron
>
> On Thursday, March 03, 2011 10:23:11 pm Wolfgang Denk wrote:
> > Dear Pandurang Kale,
> >
> > In message <AANLkTinTqxJPU9Gwye_8pT2PcUxR8E36=zm78ypc1740 at mail.gmail.com>
> > you wrote:
> > > For MIPS I do not find the TEXT_BASE symbol, there is
> > > SYS_CFG_MONITOR_BASE
> >
> > Please check again. MIPS uses CONFIG_SYS_TEXT_BASE like all other
> > architectures.
> >
> > > which it uses to relocate the code from the define symbol to high RAM
> > > address. how can I avoid this?  As I see we have a switch defined for
> > > ARM,
> >
> > You should not try to avoid this.  It is a useful feature, even if you
> > load U-Boot to RAM separately.
> >
> > > CONFIG_SKIP_RELOCATE_UBOOT, to skip the code relocation. I can't find a
> > > similar instance in MIPS code. Can you please throw some light on getting
> > > the TEXT_BASE setting correctly for MIPS code? How can I do that?
> >
> > You did not understand what I wrote:
> > > > > I can see there is a switch for ARM processor,
> > > >
> > > > CONFIG_SKIP_RELOCATE_UBOOT,
> > > >
> > > > Are you looking at recent code and working boards?
> > >
> > > I have recent uboot code for MIPS and I can't find any similar switch for
> > > MIPS codebase. arch/mips/lib/board.c and arch/mips/cpu/start.S
> >
> > I meant: do you see CONFIG_SKIP_RELOCATE_UBOOT in recent ARM code, on
> > working (compilable) ARM boards?
> >
> > > > You do not want to skip relocation.  U-Boot may need to auto-adjust
> > > > its start address dynamically, depending on configuration, system
> > > > requirements and/or environment settings.
> > >
> > > The uboot is already loaded in the RAM (by the primary boot loader) so I
> > > don't want uboot to again relocate itself from one location of RAM to its
> > > predefined high-memory region in RAM which I have explained in my first
> > > mail.
> >
> > Please re-read what I wrote.  In general, U-Boot's load address cannot
> > be determined at compile time, at least not without crippling it from
> > some interesting features.  You should really not try doing things
> > differently to everybody else.  We had similar discussions not so long
> > ago for ARM, so please just re-read this in the archives.
> >
> > Best regards,
> >
> > Wolfgang Denk
>