[U-Boot] [PATCH 1/2] mmc: dw_mmc: Increase timeout to 20 seconds

Lukasz Majewski l.majewski at samsung.com
Tue Sep 1 13:19:09 CEST 2015


Hi Marek,

> On Saturday, August 29, 2015 at 06:38:48 PM, Lukasz Majewski wrote:
> > Hi Marek,
> 
> Hi Lukasz,
> 
> > > On Saturday, August 29, 2015 at 01:55:36 PM, Lukasz Majewski
> > > wrote:
> > > > On Fri, 28 Aug 2015 23:55:17 +0200
> > > 
> > > Hi!
> > > 
> > > > Marek Vasut <marex at denx.de> wrote:
> > > > > On Friday, August 28, 2015 at 03:50:20 PM, Lukasz Majewski
> > > > > wrote:
> > > > > > The commit: d9dbb97be0e4a550457aec5f11afefb446169c90
> > > > > > "mmc: dw_mmc: Zap endless timeout" removed endless loop
> > > > > > waiting for end of dw mmc transfer.
> > > > > > 
> > > > > > For some workloads - dfu test @ Odroid XU3 (sending 8MiB
> > > > > > file) - and SD cards (e.g. MicroSD Kingston 4GiB, Adata
> > > > > > 4GiB) the default timeout is to short.
> > > > > > 
> > > > > > The new value - 20 seconds - takes into account the
> > > > > > situation when SD card triggers internal clean up. Such
> > > > > > process may take more than 10 seconds on some cards.
> > > > > 
> > > > > What happens if you pull the SD card out of the slot during
> > > > > such a process?
> > > > 
> > > > Then we would wait 20 seconds :-) to proceed.
> > > 
> > > Oops, I think I was not clear here. I was wondering what happens
> > > to the card if you yank it out of the slot whole it's performing
> > > it's internal cleanup or whatever. Is it possible that the card
> > > suffers data corruption, effectively trashing user data ?
> > 
> > I think that only the card manufacturer may reveal what can happen
> > with the card (what policy have been implemented in their FW).
> > 
> > I guess that you may lost data in such case.
> 
> Uuuurgh, that's seriously shitty.
> 
> > > Is this behavior
> > > specific to Samsung SD cards ?
> > 
> > I've experienced the problem with Kingston (brand new one) and Adata
> > MicroSD HC (4GiB) cards.
> 
> I had bad previous experience with Kingston too.
> 
> > > > To be clear - the mentioned patch introduced regression.
> > > 
> > > That's a bug, not a regression, but anyway,
> > > that's not the point. I do
> > > agree with you that we do have a problem and I'm inclined to Ack
> > > this patch, I'd like to understand what the real implications of
> > > such a behavior of these cards are.
> > > 
> > > > It works for
> > > > small files on a commonly available SD cards (like 4 GiB
> > > > Kingston/Adata).
> > > > 
> > > > When I ran DFU tests I've discovered that there is a problem
> > > > with storing 8MiB file (dat_8M.img).
> > > > 
> > > > Even worse - when one wants to store Image.itb file (which
> > > > might be 4-6 MiB) it sometimes works and sometimes not.
> > > > Nightmare for debugging.
> > > 
> > > Ew, that's one crappy card you have there. I'm reading the SD card
> > > "Physical Layer Simplified Specification Version
> > > 4.10" (part1_410.pdf) section 4.6.2.2 and it states that for SDHC
> > > cards, the write operation should take at most 250mS, for SDXC
> > > it's 500mS. Could it be that your card is violating the spec ?

The "timeout" error is for situation when you issue write command
(either single or multiple block) and you don't receive any response
from the card.

In our case we use multiblock transfer (CMD25) with either set number
of block to transfer (CMD23) or explicit end of transmission (CMD12).

Let's consider the second case.

We setup data and issue CMD25. Then we check the CMD25 status (if we
don't receive reply in 250 ms we would get timeout error).
Afterwards data from buffer is transmitted to the card and flashed.
This operation may take long time. During this process you can issue
CMD13 (SEND_STATUS) to receive information about card state ([*] 4.10. -
page 85).

Two notable fields of [*] to check are READY_FOR_DATA and CURRENT_STATE.
Those two state what is the SD Card controller situation.

Then, you end the transfer with CMD12, which also provides some status
information from the SDCard (like "prg"|"rcv", etc).

If you want to issue next command you must check READY_FOR_DATA and
CURRENT_STATE. In the case of internal SD card controller operation you
would not get READY_FOR_DATA until it ends.


> > 
> > I'll look into the spec and then comment :-).
> > 
> > For my boards the time was 1.2 seconds for storing 8 MiB file.
> 
> OK, but that's weird. The transfer should always be up to 512B long
> and not any longer, unless it's a multiblock transfer.

It is a multi block transfer.

> 
> > > > Please correct me if I'm wrong - but is seems like we are now
> > > > using 1 second timeout for detection if SD card has been
> > > > removed?
> > > > 
> > > > Shouldn't we use polling on the card detect IO pin instead? We
> > > > could add such polling in several places in the MMC subsystem
> > > > (like we do it with watchdog).
> > > > 
> > > > Marek, Pantelis, what do you think about this?
> > > 
> > > If you implement board_mmc_getcd(), you can check if the card is
> > > present this way instead of waiting for command to time out. The
> > > infrastructure for that is already in place. Right ?
> > 
> > So you suggest adding board_mmc_getcd() in several places in the mmc
> > subsystem driver to detect removal of the SD card?
> 
> Hmmmm, I'm not sure about this one. Panto ?
> 
> > > It'd be cool if the MMC subsystem could pull the wp-gpios and
> > > cd-gpios from DT though :)
> > 
> > +1
> > 
> > > > > Also, where did you find out there is such "cleanup" mechanism
> > > > > please ?
> > > > 
> > > > Internally we did some tests with several SD cards. We were
> > > > stunned when it turned out that for some workloads it took up
> > > > to 15 seconds to end write operation for small data.
> > > > 
> > > > The culprit is the SD Card embedded controller responsible for
> > > > FTL - flash translation layer.
> > > > It allows NAND memory on the card to be visible as the block
> > > > device. More importantly it also takes care of wear leveling
> > > > and bad block management.
> > > > 
> > > > Hence, we don't know when it would start housekeeping
> > > > operations. We can only poll/wait until this controller
> > > > finishes it work. The code as it was (with the indefinite loop)
> > > > was taking this situation into account.
> > > > 
> > > > The 1 second timeout is apparently too short and makes using SD
> > > > card non-deterministic and error prone in u-boot.
> > > > 
> > > > Even worse, many devices use SD card as the only storage device.
> > > 
> > > Yes, horrible.
> > 
> > Good that we have agreed.
> 
> Heh :)
> 
> Best regards,
> Marek Vasut



-- 
Best regards,

Lukasz Majewski

Samsung R&D Institute Poland (SRPOL) | Linux Platform Group


More information about the U-Boot mailing list