[U-Boot] [PATCH] Fix unreliable detection of DRAM size on Orange Pi 3

Ondřej Jirman megous at megous.com
Fri Aug 30 10:56:28 UTC 2019


Hello,

On Fri, Aug 30, 2019 at 01:44:34AM +0100, André Przywara wrote:
> On 25/08/2019 15:41, Siarhei Siamashka wrote:
> > On Sat, 24 Aug 2019 22:07:43 +0200
> > Ondřej Jirman <megous at megous.com> wrote:
> 
> Hi,
> 
> >> Hi Jagan,
> >>
> >> can you please apply this patch to the sunxi tree, so that it
> >> doesn't get lost.
> >>
> >> thank you,
> >> 	Ondrej
> >>
> >> On Mon, Jul 29, 2019 at 01:39:42AM +0200, megous hlavni wrote:
> >>> From: Ondrej Jirman <megous at megous.com>
> >>>
> >>> Orange Pi 3 has 2 GiB of DRAM, that sometime get misdetected
> >>> as 4 GiB, due to false negative result from mctl_mem_matches()
> >>> when detecting number of column address bits. This leads to
> >>> u-boot detecting more address bits than there are and the
> >>> boot process hangs shortly after.
> >>>
> >>> In mctl_mem_matches() we need to wait for each write to finish,
> >>> separately. Without this, the check is not reliable for some
> >>> unknown reason, probably having to do with unpredictable memory
> >>> access ordering.
> >>>
> >>> Patch was made with help from André Przywara, who noticed that
> >>> my original idea about detection failing due to read-back from
> >>> cache without involving DRAM was false, because data cache is
> >>> still of at the time of the DRAM size autodetection.
> >>>
> >>> Signed-off-by: Ondrej Jirman <megous at megous.com>
> >>> Cc: André Przywara <andre.przywara at arm.com>
> >>> ---
> >>>  arch/arm/mach-sunxi/dram_helpers.c | 1 +
> >>>  1 file changed, 1 insertion(+)
> >>>
> >>> diff --git a/arch/arm/mach-sunxi/dram_helpers.c b/arch/arm/mach-sunxi/dram_helpers.c
> >>> index 239ab421a8..6dba448638 100644
> >>> --- a/arch/arm/mach-sunxi/dram_helpers.c
> >>> +++ b/arch/arm/mach-sunxi/dram_helpers.c
> >>> @@ -30,6 +30,7 @@ bool mctl_mem_matches(u32 offset)
> >>>  {
> >>>  	/* Try to write different values to RAM at two addresses */
> >>>  	writel(0, CONFIG_SYS_SDRAM_BASE);
> >>> +	dsb();
> >>>  	writel(0xaa55aa55, (ulong)CONFIG_SYS_SDRAM_BASE + offset);
> >>>  	dsb();
> >>>  	/* Check if the same value is actually observed when reading back */
> >>> -- 
> > 
> > Hi!
> > 
> > This looks like resurfacing of the old problem, which did not get a
> > complete fix back in 2016:
> >    https://lists.denx.de/pipermail/u-boot/2016-April/251803.html
> > 
> > Now the main question is whether the DSB instruction's barrier
> > magic is really required here or maybe it's just a timing issue.
> 
> I think both:
> From the compiler's and the CPU's point of view these two stores look
> independent: they don't have any data dependency and go to different
> addresses. So both the compiler and the CPU are allowed to reorder these
> stores. However, we are after some non-obvious aliasing effect, so in
> fact there *is* a dependency between the two stores. So architecturally
> this barrier is definitely required, to notify everyone about this
> dependency (although I think a dmb(); might be sufficient here).

OTOH, let's visualize it:

CPU address                             DRAM address
---------------------------------------------------------------------------
[w 00000000] SDRAM_BASE                 0
[w aa55aa55] SDRAM_BASE + offset        0

Both writes should end up at the same physical location in dram).

The following test checks that values that are read back are the same. They
should be the same in this specific case of Opi3 SBC. It doesn't check for any
specific value, so even if writes are re-ordered, the value that's read back
should still be the same.

Order of operations is:

  writel(0, CONFIG_SYS_SDRAM_BASE);
  writel(0xaa55aa55, (ulong)CONFIG_SYS_SDRAM_BASE + offset);
  dsb();
  readl(CONFIG_SYS_SDRAM_BASE)
  readl((ulong)CONFIG_SYS_SDRAM_BASE + offset);

The CPU can only re-order either the writes or reads. But neither should matter.

Memory controller is a black box to me and that's where the issue probably
lies.

The only time this should fail and return different values when reading is if
the hardware does some shortcut/assumption, and reads will not go fully from
DRAM after the writes.

Anyway, I'm failing to reproduce this even without any patches, ATM.

regards,
	o.

> So for the sake of this patch, we should take it. It is needed and
> apparently solves one issue, and I can't see any harm in it.
> 
> Regardless of this I was wondering if in this situation (single in-order
> core running with MMU and caches off) this is the full story, as the CPU
> does not have a good reason to reorder.
> 
> > Could you please experiment with this problem a little bit more?
> > 
> >  1. What if you just move this second DSB instruction after the
> >     second write and have two of them there? Does this also make
> >     the problem disappear?
> 
> I don't think two dsb()s next to each other will do anything useful. I
> would think the second dsb() would just be ignored, as the order is
> already preserved and all memory accesses have been completed:
> "A DSB is a memory barrier that ensures that memory accesses that occur
> before the DSB have completed before the completion of the DSB
> instruction." (ARMv8 ARM, section B2.3.5: DSB)
> I can't imagine this would introduce any kind of relevant delay, as a
> DSB is just a barrier.
> 
> >  2. What if you just replace DSB with udelay(1) / udelay(10) /
> >     udelay(100)? Does this make the problem disappear?>
> > 
> > I suspect that we may be dealing with some sort of buffering in
> > the DRAM controller itself
> 
> Indeed it seems perfectly possible that the DDR3 controller buffers and
> potentially reorders those two writes.
> The canonical solution would to force the write by telling the memory
> controller directly, however this is probably not only controller
> specific, but also undocumented.
> 
> I was wondering whether waiting for the next refresh would help (or for
> the period of one refresh cycle), but I guess the DRAM controller could
> buffer requests beyond that point.
> 
> > and the only reason why DSB helps is
> > that it introduces some delay because it's a rather slow
> > instruction.
> 
> How much delay a DSB instruction introduces, is dependent on many
> things, but mostly on the number of outstanding memory requests and the
> time it takes them to complete them. So I am not sure we can call it a
> "slow instruction" per se.
> 
> So to summarise: I think Siarhei has a point in that this is actually
> more of a timing issue, but the DSB in there is needed and we should
> apply this patch:
> 
> Reviewed-by: Andre Przywara <andre.przywara at arm.com>
> 
> Cheers,
> Andre.
> 


More information about the U-Boot mailing list