[U-Boot] SoCFPGA PL330 DMA driver and ECC scrubbing

Jason Rush jarush at gmail.com
Wed Jul 11 03:11:32 UTC 2018


On 7/10/2018 11:11 AM, Marek Vasut wrote:
> On 07/10/2018 02:10 PM, Jason Rush wrote:
>> On 7/9/2018 3:08 AM, Marek Vasut wrote:
>>> On 07/07/2018 12:56 AM, Jason Rush wrote:
>>>> On 7/5/2018 6:10 PM, Marek Vasut wrote:
>>>>> On 07/06/2018 01:11 AM, Jason Rush wrote:
>>>>>> On 7/4/2018 2:23 AM, Marek Vasut wrote:
>>>>>>> On 07/04/2018 01:45 AM, Jason Rush wrote:
>>>>>>>> On 7/3/2018 9:08 AM, Marek Vasut wrote:
>>>>>>>>> On 07/03/2018 03:58 PM, Jason Rush wrote:
>>>>>>>>>> On 6/29/2018 10:17 AM, Marek Vasut wrote:
>>>>>>>>>>> On 06/29/2018 05:06 PM, Jason Rush wrote:
>>>>>>>>>>>> On 6/29/2018 9:52 AM, Marek Vasut wrote:
>>>>>>>>>>>>> On 06/29/2018 04:44 PM, Jason Rush wrote:
>>>>>>>>>>>>>> On 6/29/2018 9:34 AM, Marek Vasut wrote:
>>>>>>>>>>>>>>> On 06/29/2018 04:31 PM, Jason Rush wrote:
>>>>>>>>>>>>>>>> Dinh,
>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> A while ago, you posted the following patchset for SoCFPGA to add the PL330
>>>>>>>>>>>>>>>> DMA driver, and updated the SoCFPGA SDRAM init to write zeros to SDRAM to
>>>>>>>>>>>>>>>> initialize the ECC bits if ECC was enabled:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> https://lists.denx.de/pipermail/u-boot/2016-October/269643.html
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I know it's been a long time, so I'll summarize some of the conversation...
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> At the time, you had a problem with the patchset causing the SPL to fail to
>>>>>>>>>>>>>>>> find the MMC.  You had tracked it down to an issue with the following commit
>>>>>>>>>>>>>>>> "a78cd8613204 ARM: Rework and correct barrier definitions".  You and Marek
>>>>>>>>>>>>>>>> discussed it a bit, but I don't think there was a real conclusion.  You
>>>>>>>>>>>>>>>> submitted a second version of the patchset asking for advice on debugging
>>>>>>>>>>>>>>>> the issue:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> https://lists.denx.de/pipermail/u-boot/2016-December/275822.html
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> No real conversation came from the second patchset, and that was the end of
>>>>>>>>>>>>>>>> the patch.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I was hoping we could revisit adding your patchset again. I am working on a
>>>>>>>>>>>>>>>> custom SoCFPGA board with a Cyclone V and ECC SDRAM. I rebased your patchset
>>>>>>>>>>>>>>>> against v2018.05 and it is working on my custom board (although I don't have
>>>>>>>>>>>>>>>> an MMC). I also tested it on a SoCKit booting from an MMC (I forced it to
>>>>>>>>>>>>>>>> scrub the SDRAM on the SoCKit, because it doesn't have ECC RAM), and the
>>>>>>>>>>>>>>>> SoCKit finds the MMC and boots.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I don't have any suggestions on why it is working now on my board and not
>>>>>>>>>>>>>>>> back when you first submitted the patchset.  Maybe something else was fixed
>>>>>>>>>>>>>>>> in the MMC? I was hoping you and Marek could test this patch again on some
>>>>>>>>>>>>>>>> different SoCFPGA boards to see if you get the same results.
>>>>>>>>>>>>>>> Look at this patch
>>>>>>>>>>>>>>> http://git.denx.de/?p=u-boot/u-boot-socfpga.git;a=commit;h=9bb8a249b292d26f152c20e3641600b3d7b3924b
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> You likely want a similar approach, it's faster than the DMA and much simpler.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks Marek.  I'll give it a try.  Would you be interested in a similar patch for the Gen 5?
>>>>>>>>>>>>> I don't have any Gen5 board which uses ECC, do you ?
>>>>>>>>>>>>> If so, yes, prepare a patch, it should be very similar.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Make sure to measure how long it takes to scrub the memory and how much
>>>>>>>>>>>>> memory you have, I'd be interested in the numbers.
>>>>>>>>>>>>>
>>>>>>>>>>>> Looking at the master branch, it doesn't look like that code is ever being called?
>>>>>>>>>>>> The sdram_init_ecc_bits() function is called from the ddr_calibration_sequence() function,
>>>>>>>>>>>> but I can't find where ddr_calibration_sequence() is called.
>>>>>>>>>>> git grep for it, it's called from somewhere in the arch/arm/mach-socfpga/
>>>>>>>>>>>
>>>>>>>>>>>> Either way, I can test it. I have a custom Cyclone V board with ECC, and the Intel Arria V SoC
>>>>>>>>>>>> Dev Kit I can test it on too which I think has ECC.
>>>>>>>>>>> Please do.
>>>>>>>>>>>
>>>>>>>>>> I implemented a similar memset approach for the gen 5 socfpga.  It's basically the same
>>>>>>>>>> code as in that patch; however, when I performed a single memset the processor would
>>>>>>>>>> reset for some reason.  I changed it to loop, calling memset with a size of 32MB over
>>>>>>>>>> the entire address range, and that worked as opposed to doing a single memset on
>>>>>>>>>> the RAM.
>>>>>>>>> Can you do grep MEMSET .config in your U-Boot build dir ? The arch
>>>>>>>>> memset is implemented in assembler and doesn't trigger WDT , so if it
>>>>>>>>> takes too long, it could be that the WDT resets the platform.
>>>>>>>> Both CONFIG_USE_ARCH_MEMSET and CONFIG_SPL_USE_ARCH_MEMSET
>>>>>>>> are set in my .config, so it must be the WDT triggering as you suspect.
>>>>>>>>
>>>>>>>>>> I started on a SoCKit because it was handy, I know it doesn't have ECC
>>>>>>>>> It doesn't by default.
>>>>>>>>>
>>>>>>>>>> , but I forced it to
>>>>>>>>>> initialize the RAM as a quick test.  It seems much slower than the DMA approach.  It
>>>>>>>>>> should be noted, I didn't implement any code to time the scrubbing, but rather just
>>>>>>>>>> roughly monitored the time to get a rough idea of how long it took.
>>>>>>>>>>
>>>>>>>>>> On the SoCKit, which has 1GB of RAM, the memset takes around 8 seconds to complete,
>>>>>>>>>> and the DMA takes under 2 seconds.
>>>>>>>>> Did you enable i/d cache in the SPL ? It's mandatory, otherwise it's
>>>>>>>>> slow.
>>>>>>>> I have calls to icache_enable() and dcache_enable() just as you do in
>>>>>>>> the Arria 10 sdram_init_ecc_bits() function.
>>>>>>>>
>>>>>>>> I did double check that both these enable functions call the versions
>>>>>>>> of the functions in the ./arch/arm/lib/cache-cp15.c file that are
>>>>>>>> implemented in the SPL.  So I believe that both icache and dcache are
>>>>>>>> enabled.
>>>>>>> Are you sure it's not just the stubs that are called ? Or that the code
>>>>>>> doesn't skip the dcache enabling due to some funny stuff, like MMU being
>>>>>>> already enabled ?
>>>>>> I added prints to ensure it is calling the real icache_enable()/dcache_enable()
>>>>>> functions, and not the stubs.
>>>>>>
>>>>>>>> I probably should have added a print of icache_status() and
>>>>>>>> dcache_status() to verify the caches are enabled.  I'll add that
>>>>>>>> tomorrow.
>>>>>>> Yes, you really should verify that the dcache was enabled.
>>>>>>>
>>>>>>>>> Just be careful about the MMU tables placement, they are big and
>>>>>>>>> if you place them in RAM, make sure you don't overwrite them with the
>>>>>>>>> memset. The trick might be to memset the first 1 MiB of RAM, then put
>>>>>>>>> MMU tables at some offset therein (since 0x0 can be used for ARM
>>>>>>>>> vectors) and then turn on i/d cache and memset the rest.
>>>>>>>> That is essentially what I am doing I believe, with the exception that I
>>>>>>>> am only clearing the first 32KiB before initializing the MMU table (which
>>>>>>>> is what you did in the Arria 10 version).
>>>>>>>>
>>>>>>>> I modeled my code almost identically to yours with the exception that
>>>>>>>> I loop over the memset calls 32MiB at a time. Here's the order of
>>>>>>>> operations I perform:
>>>>>>>>
>>>>>>>> 1. icache_enable()
>>>>>>>> 2. memset the first 0x8000 bytes to zero
>>>>>>>> 3. set up gd->arch.tlb_addr and gd->arch.tlb_size
>>>>>>>> 4. dcache_enable()
>>>>>>>> 5. loop over remaining memory, memsetting 32MiB at a time to zero
>>>>>>>> 6. flush_dcache_all()
>>>>>>>> 7. dcache_disable()
>>>>>>>>
>>>>>>>> It looks like the call to dcache_enable is what sets up the MMU tables.
>>>>>>>> I suspect that's why you did a memset of the first 32KiB before enabling
>>>>>>>> the dcache on the Arria 10.  I think the MMU is initialized okay since the
>>>>>>>> SPL keeps executing, u-boot loads, and Linux boots after running the
>>>>>>>> above (maybe that's not a fair assumption).
>>>>>>> I had to write zeroes to the first 32kiB to init the ECC counters before
>>>>>>> putting MMU tables there.
>>>>>>>
>>>>>>> You really should double check if the MMU and dcache are enabled, 8
>>>>>>> seconds to scrub the memory is too long I think.
>>>>>> I added checks to verify that the MMU, icache, and dcache are all set up and
>>>>>> enabled.
>>>>>>
>>>>>> Calling icache_enable() set the CR_I bit (Icache enable) in the CR (control
>>>>>> register).  Then calling dcache_enable() called the mmu_setup() function,
>>>>>> which set up the MMU and set the CR_M bit (MMU enable) in the CR, and
>>>>>> finally dcache_enable() set the CR_C bit (Dcache enable) in the CR.
>>>>>>
>>>>>> I also printed out the control register before the memset calls, and it
>>>>>> indicated that the mmu, icache, and dcache were enabled.
>>>>> Is the DRAM area set as cacheable in the MMU tables ?
>>>>>
>>>> Good news bad news...  The MMU tables weren't being set up because the
>>>> bd->bi_dram[bank].start and bd->bi_dram[bank].size weren't set up.  As a quick
>>>> test, I hardcoded start to 0 and size to 1GiB.  After that, the memset was
>>>> really quick, U-Boot loads, Linux loads, and everything seems to work great.
>>> Good.
>>>
>>>> However, if I press the HPS_RST push button on the SoCKit (which is connected
>>>> to power on reset), occasionally U-Boot will lock up while booting.  It always
>>>> boots and operates correctly from the initial power on, but it almost always
>>>> fails to boot after pressing the HPS_RST button.
>>>>
>>>> Usually after pressing the HPS_RST button, U-Boot makes it past the SPL, and
>>>> hangs somewhere after the call to setup_reloc() in ./common/board_f.c.  Once
>>>> it hangs there, pressing the HPS_RST button again usually causes the SPL to
>>>> hang while setting up the MMU (before my call to memset).  Eventually the
>>>> WDT kicks in, and it just keeps hanging up in the same place.  Once it gets in
>>>> this mode, the only way to recover it is by toggling power on the board.
>>>>
>>>> I spent a bunch of time today trying to track down where it was hanging, but
>>>> I couldn't pinpoint anything.  The MMU tables looked correct.  The MMU
>>>> registers looked good.  I'm not sure of the best way to debug what's going on.
>>> Try triggering warm reset and cold reset via the reset register:
>>>
>>> mw 0xffd05004 1
>>> mw 0xffd05004 2
>>>
>>> Does it hang in one case and not in the other ?
>>>
>> It hangs in both cases.
>>
>> I did find that if I do not memset the last 1MiB of DRAM with the cache on,
>> both warm and cold resets work.
>>
>> I changed the ECC scrubbing to zero out the first 0x8000 bytes and the last
>> 0x10000 bytes before the MMU is set up and I enable dcache.  Then with
>> the dcache enabled, I zero out the rest of memory.  The resets work in this
>> case as well.  So there seems to be some side effect of clearing out the
>> relocate address space with the cache on.
> Can you investigate ?
>
I'd be happy to investigate more, but I'm not really sure what
my next step should be.

Something appears to happen differently when U-Boot
relocates with the dcache on, but I don't know how to track
it down.
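
To make sure we are talking about the same ordering, here is a
minimal host-side sketch of the scrub as it stands now: head and
tail zeroed with the dcache off, then the middle zeroed in chunks
with the dcache on.  scrub_plan() and scrub_chunked() are
hypothetical names for illustration, not the functions in my
patch, and the kick hook is an assumption on my part; the thread
so far only establishes that chunking avoided the reset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One step of the scrub: a byte range and whether dcache is on for it. */
struct scrub_region {
	size_t off;
	size_t len;
	int cached;
};

/*
 * Hypothetical helper: split RAM into head and tail (zeroed with
 * dcache off) and the middle (zeroed with dcache on), in the order
 * they are scrubbed.  Returns 0 on success, -1 if head + tail do not
 * fit in ram_size.
 */
int scrub_plan(size_t ram_size, size_t head, size_t tail,
	       struct scrub_region out[3])
{
	if (head > ram_size || tail > ram_size - head)
		return -1;

	out[0] = (struct scrub_region){ 0, head, 0 };
	out[1] = (struct scrub_region){ ram_size - tail, tail, 0 };
	out[2] = (struct scrub_region){ head, ram_size - head - tail, 1 };
	return 0;
}

/*
 * Zero len bytes at base, at most chunk bytes per memset.  The kick
 * hook (may be NULL) stands in for a watchdog service call; that call
 * is an assumption, not something my current loop is known to need.
 * Returns the number of memset calls made.
 */
size_t scrub_chunked(uint8_t *base, size_t len, size_t chunk,
		     void (*kick)(void))
{
	size_t calls = 0;

	assert(chunk > 0);
	while (len > 0) {
		size_t n = len < chunk ? len : chunk;

		memset(base, 0, n);
		base += n;
		len -= n;
		calls++;
		if (kick)
			kick();
	}
	return calls;
}
```

With the numbers from my last mail this would be
scrub_plan(1GiB, 0x8000, 0x10000, ...), with the cached region
passed to scrub_chunked() using a 32MiB chunk.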

I was thinking I might dump the DRAM where U-Boot relocates
to both with the dcache on and off, and see if there are any
differences.  I'm not really sure what that tells me though if I
find a difference.
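
If I do go the dump route, the offsets and lengths of the
mismatches are probably more useful than the raw bytes.  Below is
a rough host-side sketch of the comparison, assuming the two dumps
are already loaded into buffers; diff_ranges() and struct
diff_range are hypothetical names, and how the dumps get captured
is left open.

```c
#include <stddef.h>
#include <stdint.h>

/* A contiguous run of bytes that differ between the two dumps. */
struct diff_range {
	size_t off;
	size_t len;
};

/*
 * Walk two equally sized dumps and record each contiguous run of
 * differing bytes.  At most max runs are stored in out; the return
 * value is the total number of runs found, which may exceed max.
 */
size_t diff_ranges(const uint8_t *a, const uint8_t *b, size_t len,
		   struct diff_range *out, size_t max)
{
	size_t count = 0;
	size_t i = 0;

	while (i < len) {
		size_t start;

		if (a[i] == b[i]) {
			i++;
			continue;
		}

		/* Start of a differing run: extend it as far as it goes. */
		start = i;
		while (i < len && a[i] != b[i])
			i++;

		if (count < max) {
			out[count].off = start;
			out[count].len = i - start;
		}
		count++;
	}
	return count;
}
```

Runs that start and end on 32-byte boundaries (the Cortex-A9
cache line size) would hint at cache lines that never made it out
to DRAM, whereas scattered word-sized differences might point at
something else.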

Any suggestions?

Regards,
Jason


