[U-Boot-Users] MPC83xx data cache lock?

Liu Dave-r63238 DaveLiu at freescale.com
Wed May 24 11:48:33 CEST 2006


> -----Original Message-----
> Just measure the time it takes to initialize ECC memory either
> using the cache or DMA methods; here is a short summary (don't
> complain - you asked for it!):
> 
> ----- quote begin -----
> 
> 1. Read vs. write performance
> 
> Writing to DDR memory is *much* slower than reading it.
> 
> ECC off
> read  duration: 509 ms
> write duration: 1546 ms
> 
> ECC on
> read  duration: 509 ms
> write duration: 5703 ms
> 
I ran a test; my read vs. write performance is:

ECC off
read duration: 4124 ms
write duration: 1516 ms

ECC on
read duration: 4634 ms
write duration: 5703 ms

Because all ways of the data cache are locked, the data cache
behaves as if it were cache-inhibited. We access memory with
two instructions: stw for 32-bit writes and lwz for 32-bit reads.

The write performance matches yours, but our read performance
is very different.

How did you perform the read accesses to memory?

If you only read from memory into a variable and never reference
that variable afterwards, the compiler will optimize the load
instruction away, unless you declare the variable volatile.

I suggest you check the assembler code to make sure the load
instruction is in the loop and that there are no other memory access
instructions in the loop.
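
A minimal sketch of such a test loop, assuming a plain C test routine
(the names read_words and write_words are hypothetical, not taken from
either test). The volatile qualifier is what keeps the compiler from
dropping the loads; the disassembly should still be checked to confirm
that the loop body contains a single lwz or stw.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Read every 32-bit word in the region exactly once and discard it.
 * Because the pointer is volatile-qualified, the compiler must emit
 * one load per iteration even though the value is never used.
 * Returns the number of words read.
 */
static size_t read_words(const volatile uint32_t *base, size_t words)
{
    size_t i;

    for (i = 0; i < words; i++)
        (void)base[i];  /* volatile access: cannot be optimized away */

    return i;
}

/*
 * Store the same pattern to every 32-bit word in the region.
 * Returns the number of words written.
 */
static size_t write_words(volatile uint32_t *base, size_t words,
                          uint32_t pattern)
{
    size_t i;

    for (i = 0; i < words; i++)
        base[i] = pattern;

    return i;
}
```

Timing would be taken around each call; without volatile an
optimizing compiler may reduce read_words to an empty loop, which
would explain a very short read duration.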

When ECC is enabled, writes take about 4x as long as when ECC is
off. I think the sub-double-word writes cause read-modify-write
bus operations, which make each write access take more time.

Why is the read time about triple the write time in my test?
I will look into this.

> There's no clear indication in both DDR (8349) docs and 
> Micron specification of our module on if and how read vs. 
> write operations differ in timing. There is one pointer for 
> the ECC case, which suggests writes can take three stages 
> (full read-modify-write cycle) instead of just one:
> 
> "9.5.4 SDRAM Interface Timing - If ECC is disabled, writes smaller
> than double words are performed by appropriately activating the data
> mask. If ECC is enabled, the controller performs a read-modify write."
> 
> The problem is we see 3x difference when the ECC is off, and 10x when 
> on. We also did a series of tests with various chunk sizes of data 
> written, so as to be sure we do not do the indicated sub-double word 
> writes, but the results were the same.
> 
Are you sure you are not doing sub-double-word writes?

I also ran a 64-bit read/write access test over the full memory space.

Memory is accessed with double-precision floating-point load/store
instructions: lfd for 64-bit reads and stfd for 64-bit writes.
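
As a rough illustration of the idea in C (this is not the attached
code; read_dwords and write_dwords are hypothetical names), accessing
memory through a volatile double pointer is one way to ask for 64-bit
accesses. It assumes double is 8 bytes and that the compiler targets
hardware floating point; whether lfd and stfd are actually emitted
has to be confirmed in the disassembly.

```c
#include <stddef.h>

/*
 * Read every 64-bit double word in the region once and discard it.
 * Assumption: on a hard-float PowerPC target each volatile double
 * load becomes a single 64-bit load rather than two 32-bit loads.
 * Returns the number of double words read.
 */
static size_t read_dwords(const volatile double *base, size_t dwords)
{
    size_t i;

    for (i = 0; i < dwords; i++)
        (void)base[i];  /* volatile access: one load per iteration */

    return i;
}

/*
 * Store the same 64-bit pattern to every double word in the region.
 * Returns the number of double words written.
 */
static size_t write_dwords(volatile double *base, size_t dwords,
                           double pattern)
{
    size_t i;

    for (i = 0; i < dwords; i++)
        base[i] = pattern;

    return i;
}
```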

The code is in the attachment. The results are:

ECC off
read duration: 2317 ms
write duration: 774 ms

ECC on
read duration: 2317 ms
write duration: 774 ms

With ECC on, we perform double-word writes, so RMW cycles
do not happen.
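
To make the reasoning concrete, here is a toy model that only counts
DRAM operations for filling a region, assuming (per the manual text
quoted above) that a store narrower than a double word costs one extra
read when ECC is on. The function count_fill_ops and struct dram_ops
are hypothetical; the model says nothing about actual cycle counts.

```c
#include <stddef.h>

#define DWORD_BYTES 8u  /* ECC covers one 64-bit double word */

struct dram_ops {
    unsigned long reads;   /* reads issued by the controller for RMW */
    unsigned long writes;  /* one write per store */
};

/*
 * Count controller operations needed to fill total_bytes using stores
 * of store_bytes each. With ECC on, a sub-double-word store needs a
 * read-modify-write; a full double-word store does not.
 */
static struct dram_ops count_fill_ops(size_t total_bytes,
                                      size_t store_bytes, int ecc_on)
{
    struct dram_ops ops = {0, 0};
    size_t stores = total_bytes / store_bytes;

    ops.writes = stores;
    if (ecc_on && store_bytes < DWORD_BYTES)
        ops.reads = stores;  /* one extra read per partial store */

    return ops;
}
```

For a 64-byte region with ECC on, this gives 16 reads plus 16 writes
with 4-byte stores, but only 8 writes and no reads with 8-byte stores,
which is consistent with the 64-bit write time not changing when ECC
is enabled.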


> This is really strange, although at least read operations are not
> affected by enabling ECC (which is according to the book - there
> should be minimal overhead put on read operations while ECC on, see
> 3. below).
> 
> 2. DMA (low) performance
> 
> Using DMA for transfers proves very inefficient. As mentioned
> earlier, the DMA module in 8349 is different than seen in other
> families, and it struck us as a bit "alien" when compared with the
> rest of the chip (DMA documentation part is rather limited, and
> different in style etc.), as if taken from elsewhere. It is also
> peculiar in technical aspects: endianness used is different, so we
> need to convert the order explicitly in s/w.
> 
> We tried increasing the local bus clocking but to no avail.
> 
The local bus clock does not affect CSB or DDR performance.

> Given that low performance it doesn't make much difference whether
> ECC is enabled or not:
> 
> DMA, ECC on
> ddr init duration: 6947 ms
> 
> DMA, ECC off
> ddr init duration: 6721 ms
>
 
My test data is:

DMA, ECC on
ddr init duration: 6945 ms

DMA, ECC off
ddr init duration: 6558 ms

This differs only slightly from your results.

> There seems something broken with the DMA operations in general as
> they are way slower than just plain read/write to memory, which is
> somehow confirmed by your recent communication from the customer.
>
When all of memory is initialized with the DMA method, as the U-Boot
code does, the DMA controller reads from memory and then writes to
memory, in a loop.

This generates a lot of read accesses to memory, which takes more time.

> 
> 3. ECC penalty
> 
> As can be seen in results given in 1. enabling ECC puts a huge
> burden on write access, which is contrary to 8349 UM:
> 
> p. 9-27 (above figure 9-24) "When ECC is enabled, one clock cycle is
> added to the read path to check ECC and correct single-bit errors.
> ECC generation does not add a cycle to the write path."
> 
> ----- quote end -----
> 
> 
> Can you explain why writing to ECC memory is 10 times slower than
> reading?
> 
Could you tell me how you measured the read time? Thanks.


Regards,
Dave 




