[U-Boot] i.MX51: FEC: Cache coherency problem?
David Jander
david.jander at protonic.nl
Thu Jul 21 08:48:03 CEST 2011
On Wed, 20 Jul 2011 08:36:12 -0700
"J. William Campbell" <jwilliamcampbell at comcast.net> wrote:
> On 7/20/2011 7:35 AM, Albert ARIBAUD wrote:
> > Le 20/07/2011 16:01, J. William Campbell a écrit :
> >> On 7/20/2011 6:02 AM, Albert ARIBAUD wrote:
> >>> Le 19/07/2011 22:11, J. William Campbell a écrit :
> >>>
> >>>> If this is true, then it means that the cache is of type write-back
> >>>> (as opposed to write-through). From a (very brief) look at the ARMv7
> >>>> manuals, it appears that both types of cache may be present in the
> >>>> CPU. Do you know how this operates?
> >>> Usually, copy-back (rather than write-back) and write-through are
> >>> modes of operation, not cache types.
> >> Hi Albert,
> >> On some CPUs, both cache modes are available. On many other CPUs (I
> >> would guess most), only one fixed mode is available, not both. I
> >> have always seen the two modes described as write-back and
> >> write-through, but I am sure we are talking about the same things.
> >
> > We are. Copy-back is another name for write-back, not used by ARM but
> > by some others.
> >
> >> The
> >> examples that have both modes that I am familiar with have the mode as a
> >> "global" setting. It is not controlled by bits in the TLB or anything
> >> like that. How does it work on ARM? Is it fixed, globally, globally
> >> controlled, or controlled by memory management?
> >
> > Well, it's a bit complicated, because it depends on the architecture
> > version *and* implementation -- ARM themselves do not mandate things,
> > and it is up to the SoC designer to specify what cache they want and
> > what mode it supports, both at L1 and L2, in their specific instance
> > of ARM cores. And yes, you can have memory areas that are write-back
> > and others that are write-through in the same system.
> >
> >> If it is controlled by memory management, it looks to me like lots of
> >> problems could be avoided by operating with input type buffers set as
> >> write-through. One probably isn't going to be writing to input buffers
> >> much under program control anyway, so the performance loss should be
> >> minimal. This gets rid of the alignment restrictions on these buffers
> >> but not the invalidate/flush requirements.
> >
> > There's not much you can do about alignment issues except align to
> > cache line boundaries.
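Aligning to cache line boundaries can be done statically at declaration time. A minimal sketch, assuming a 64-byte line size and a simplified, hypothetical descriptor layout (not the real FEC driver's structures):

```c
#include <stdint.h>

/* Sketch: declare descriptor rings and packet buffers so each item
 * starts on a cache-line boundary and occupies whole lines. The
 * 64-byte line size, the descriptor fields and all names here are
 * assumptions for illustration only. */
#define CACHE_LINE   64
#define PKT_BUF_SIZE 1536   /* Ethernet frame, rounded to a multiple of 64 */
#define NUM_BUFS     8

struct fec_bd {             /* simplified buffer descriptor */
	uint16_t status;
	uint16_t length;
	uint32_t buf_addr;
} __attribute__((aligned(CACHE_LINE)));  /* pads each descriptor to a full line */

static struct fec_bd rx_ring[NUM_BUFS] __attribute__((aligned(CACHE_LINE)));
static uint8_t rx_bufs[NUM_BUFS][PKT_BUF_SIZE] __attribute__((aligned(CACHE_LINE)));
```

Because the aligned attribute on the struct also rounds its size up to a whole line, no two descriptors ever share a cache line, so flushing one cannot corrupt its neighbour.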
> >
> >> However, if memory management
> >> is required to set the cache mode, it might be best to operate with the
> >> buffers and descriptors un-cached. That gets rid of the flush/invalidate
> >> requirement at the expense of slowing down copying from read buffers.
> >
> > That makes 'best' a subjective choice, doesn't it? :)
> Hi All,
> Yes, it probably depends on the usage.
> >
> >> Probably a reasonable price to pay for the associated simplicity.
> >
> > Others would say that spending some time setting up alignments and
> > flushes and invalidates is a reasonable price to pay for increased
> > performance... That's an open debate where no solution is The Right
> > One(tm).
> >
> > For instance, consider the TFTP image reading. People would like the
> > image to end up in cached memory because we'll do some checksumming on
> > it before we give it control, and having it cached makes this step
> > quite faster; but we'll lose that if we put it in non-cached memory
> > because it comes through the Ethernet controller's DMA; and it would
> > be worse to receive packets in non-cached memory only to move their
> > contents into cached memory later on.
> >
> > I think properly aligning descriptors and buffers is enough to avoid
> > the mixed flush/invalidate line issue, and wisely putting instruction
> > barriers should be enough to get the added performance of cache
> > without too much of the hassle of memory management.
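One way to keep a flush or invalidate from splitting a cache line is to extend every maintenance operation to whole-line boundaries. A sketch, with the 64-byte line size assumed and the helper name hypothetical:

```c
#include <stdint.h>
#include <stddef.h>

#define CACHE_LINE 64   /* assumed line size; check the SoC manual */

/* Expand [buf, buf + len) to the line-aligned range that covers it.
 * When buffers are themselves line-aligned and line-sized, the
 * expanded range touches nothing outside the buffer -- which is
 * exactly why descriptors and DMA buffers should be aligned in the
 * first place. */
static void dma_align_range(uintptr_t buf, size_t len,
			    uintptr_t *start, uintptr_t *end)
{
	*start = buf & ~(uintptr_t)(CACHE_LINE - 1);
	*end   = (buf + len + CACHE_LINE - 1) & ~(uintptr_t)(CACHE_LINE - 1);
}
```

The resulting [start, end) range would then be handed to the flush primitive before starting TX DMA, or to the invalidate primitive before reading an RX buffer.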
> I am pretty sure that all the drivers read the input data into
> intermediate buffers in all cases. There is no practical way to be sure
> the next packet received is the "right one" for the tftp. Plus there are
> headers involved, and furthermore there is no way to ensure that a tftp
> destination is located on a sector boundary. In short, you are going to
> copy from an input buffer to a destination.
> However, it is still correct that copying from a non-cached area is
> slower than from cached areas, because of burst reads vs. individual
> reads. Still, I doubt that the u-boot user can tell the difference, as
> the network latency will far exceed the difference in copy time. The
> question is which is easier to do, and that is probably a matter of
> opinion. It is safe to say, though, that so far a cached solution has
> eluded us. That may be changing, but it would still be nice to know how
> to allocate a section of un-cached RAM on the ARM processor, in so far
> as the question has a single answer! That would allow easy portability
> of drivers that do not know about caches, of which there seem to be many.
I agree. Unfortunately, my time is up for now, and I can't go on trying
to fix this driver. Maybe I'll pick this up again after my vacation.
For now I have settled for the ugly solution of keeping the dcache disabled
while ethernet is being used :-(
IMHO, doing cache maintenance all over the driver is not an easy or nice
solution. Implementing a non-cached memory pool in the MMU and a corresponding
dma_malloc() sounds much more universally applicable to any driver.
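Such a dma_malloc() could be as simple as a bump allocator over a region the MMU maps uncached. A sketch, in which a static array stands in for the real MMU-remapped window, and the pool size, line size and names are all assumptions:

```c
#include <stdint.h>
#include <stddef.h>

/* Sketch of a dma_malloc(): a trivial bump allocator handing out
 * cache-line-aligned chunks. A static array stands in for the RAM
 * window that the MMU would really map as uncached/device memory;
 * pool size and line size are arbitrary assumptions. */
#define DMA_POOL_SIZE (64 * 1024)
#define CACHE_LINE    64

static uint8_t dma_pool[DMA_POOL_SIZE] __attribute__((aligned(CACHE_LINE)));
static size_t  dma_pool_off;

static void *dma_malloc(size_t size)
{
	void *p;

	/* round up so every allocation starts on its own cache line */
	size = (size + CACHE_LINE - 1) & ~(size_t)(CACHE_LINE - 1);
	if (dma_pool_off + size > DMA_POOL_SIZE)
		return NULL;	/* pool exhausted; no free() in this sketch */
	p = &dma_pool[dma_pool_off];
	dma_pool_off += size;
	return p;
}
```

A driver that knows nothing about caches could then allocate its descriptors and buffers from this pool and never need a flush or invalidate at all, at the cost of slower CPU access to those buffers.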
Best regards,
--
David Jander
Protonic Holland.