[U-Boot] [PATCH 1/2] mmc: dw_mmc: Increase timeout to 20 seconds

Marek Vasut marex at denx.de
Mon Sep 14 15:36:17 CEST 2015


On Monday, September 14, 2015 at 01:22:20 PM, Lukasz Majewski wrote:
> Hi Alexey,
> 
> > Hi Marek, Lukasz,
> > 
> > On Sun, 2015-09-13 at 16:00 +0200, Marek Vasut wrote:
> > > On Sunday, September 13, 2015 at 12:03:18 PM, Lukasz Majewski wrote:
> > > > Hi Marek,
> > > 
> > > Hi,
> > > 
> > > [...]
> > > 
> > > > > > > Still we need to fix regression first with virtually
> > > > > > > infinite timeout :) I would even thing that simple revert
> > > > > > > of Marek's patch may make sense for now.
> > > > > > 
> > > > > > +1 - unfortunately there were some other patches applied to
> > > > > > this particular patch. Simple revert might be a bit tricky
> > > > > > here.
> > > > > 
> > > > > -1 - In case the card gets removed during the DMA transfer and
> > > > > the board doesn't have a watchdog, it will get stuck
> > > > > indefinitelly.
> > > > 
> > > > I'm just wondering here - why the indefinite loop was working
> > > > previously? Was anybody complaining (on the ML) about the problem
> > > > of removing the SD card when some operation is ongoing?
> > > 
> > > It worked for me for all the workloads I used. Noone was
> > > complaining.
> > 
> > The same story here - previous code with infinite loop was working for
> > my boards. And now I do see a problem with pretty simple scenario that
> > we do use in our products.
> > 
> > > > The problem with potential removal of SD card (after booting the
> > > > board) is with us for quite long time. Even with indefinite loop
> > > > (without your patch) we also could "hang" the board if the SD
> > > > card was removed during a transfer.
> > > 
> > > Which is why we should weed out the unbounded loops.
> > > 
> > > > > We
> > > > > absolutelly don't want this sort of behavior in U-Boot. I
> > > > > understand that this is the easiest way for everyone to achieve
> > > > > some sort of "working" solution, but it is definitelly not the
> > > > > correct one. While I do agree to increasing the timeout, I do
> > > > > not agree to unbounded loops, sorry.
> > > > 
> > > > We have agreed to not agree :-)
> > > 
> > > Yes :-)
> > 
> > The first thing I care is working U-Boot v2015.10 out of the box on my
> > boards. And so I may agree on any temporary solution. I see it as
> > timeout value either being infinite or obviously very high like 60
> > seconds.
> > 
> > 60 seconds might sound stupid but my thought behind this is to make
> > sure even long transfers succeed. Imagine 100 Mb rootfs or update
> > file downloaded from slow SD-card.
> 
> Transfer of rootfs to SD-card (downloaded to memory via tftp) is
> definitely valid scenario.
> 
> > > > > > > From both points of view for keeping history
> > > > > > > clean (compared to stacked fixes/workarounds) and from
> > > > > > > removal of regression root cause.
> > > > > > 
> > > > > > As I said before - +1 from me.
> > > > > 
> > > > > As I said before, -1 from me. Btw. did anything regress in
> > > > > here? To me, this seems like a newly discovered bug ...
> > > > 
> > > > Yes, this is a bug. We had similar problem with Samsung's SDHCI,
> > > > before we switched to dw_mmc. This issue is new at dw_mmc.
> > > > 
> > > > > > > It's not that I like to have infinite loops but given
> > > > > > > previous implementation worked fine for people in the
> > > > > > > previous U-Boot release.
> > > > > > 
> > > > > > Good justification
> > > > > 
> > > > > It is never a justified to return to a potentially problematic
> > > > > version
> > > > 
> > > > IMHO revering the change (before the release) is from the software
> > > > development point of view better solution than adding some
> > > > heuristic delta to timeout.
> > > > 
> > > > > for the sake of getting some sort of crappy hardware
> > > > > operational.
> > > > 
> > > > Unfortunately this "crappy hardware" is pervasive and we cannot do
> > > > anything about it.
> > > > 
> > > > To sum up (my point of view):
> > > > 1. The best would be to revert the patch - but if simple "git
> > > > revert" is not working then,
> > 
> > Well even if clean revert won't work we may do manual tweaks so that
> > functionally it is "revert". If of any interest I may come up with
> > that sort of patch.
> > 
> > > > 2. We should increase the timeout (with my patch) for v2015.10
> > > > release
> > 
> > If everybody is OK with that let's go do it. Because release is
> > around the corner and I don't want to explain each and every user how
> > to fix their new problem.
> 
> v2015.10 correctness is crucial here.
> 
> > > Let's do this for the sake of crappy cards.
> > > 
> > > > 3. After release we can devise some kind of solution
> > > > 4. It is a good topic for U-boot's minisummit discussion at Dublin
> > > > 
> > > > Marek, Alexey, Tom, Pantelis what do you think?
> > > 
> > > I think yes.
> > 
> > What's important we need to make sure Tom is aware of this situation
> > and he won't cut a release until our fix is in place and all involved
> > parties reported their happiness.
> 
> I also think that Tom should speak up on this issue.

btw. I think I know why SoCFPGA never triggered this issue, it has
CONFIG_SYS_MMC_MAX_BLK_COUNT set to 256 so a large transfer was
always broken to smaller chunks.


More information about the U-Boot mailing list