[U-Boot] [PATCH 1/2] mmc: dw_mmc: Increase timeout to 20 seconds

Lukasz Majewski l.majewski at samsung.com
Fri Sep 18 09:32:47 CEST 2015


Hi Tom,

> On Thu, Sep 17, 2015 at 04:43:33PM +0200, Lukasz Majewski wrote:
> 
> > Hi Tom,
> > 
> > > On Monday, September 14, 2015 at 01:22:20 PM, Lukasz Majewski
> > > wrote:
> > > > Hi Alexey,
> > > > 
> > > > > Hi Marek, Lukasz,
> > > > > 
> > > > > On Sun, 2015-09-13 at 16:00 +0200, Marek Vasut wrote:
> > > > > > On Sunday, September 13, 2015 at 12:03:18 PM, Lukasz
> > > > > > Majewski wrote:
> > > > > > > Hi Marek,
> > > > > > 
> > > > > > Hi,
> > > > > > 
> > > > > > [...]
> > > > > > 
> > > > > > > > > > Still we need to fix regression first with virtually
> > > > > > > > > > infinite timeout :) I would even thing that simple
> > > > > > > > > > revert of Marek's patch may make sense for now.
> > > > > > > > > 
> > > > > > > > > +1 - unfortunately there were some other patches
> > > > > > > > > applied to this particular patch. Simple revert might
> > > > > > > > > be a bit tricky here.
> > > > > > > > 
> > > > > > > > -1 - In case the card gets removed during the DMA
> > > > > > > > transfer and the board doesn't have a watchdog, it will
> > > > > > > > get stuck indefinitelly.
> > > > > > > 
> > > > > > > I'm just wondering here - why the indefinite loop was
> > > > > > > working previously? Was anybody complaining (on the ML)
> > > > > > > about the problem of removing the SD card when some
> > > > > > > operation is ongoing?
> > > > > > 
> > > > > > It worked for me for all the workloads I used. Noone was
> > > > > > complaining.
> > > > > 
> > > > > The same story here - previous code with infinite loop was
> > > > > working for my boards. And now I do see a problem with pretty
> > > > > simple scenario that we do use in our products.
> > > > > 
> > > > > > > The problem with potential removal of SD card (after
> > > > > > > booting the board) is with us for quite long time. Even
> > > > > > > with indefinite loop (without your patch) we also could
> > > > > > > "hang" the board if the SD card was removed during a
> > > > > > > transfer.
> > > > > > 
> > > > > > Which is why we should weed out the unbounded loops.
> > > > > > 
> > > > > > > > We
> > > > > > > > absolutelly don't want this sort of behavior in U-Boot.
> > > > > > > > I understand that this is the easiest way for everyone
> > > > > > > > to achieve some sort of "working" solution, but it is
> > > > > > > > definitelly not the correct one. While I do agree to
> > > > > > > > increasing the timeout, I do not agree to unbounded
> > > > > > > > loops, sorry.
> > > > > > > 
> > > > > > > We have agreed to not agree :-)
> > > > > > 
> > > > > > Yes :-)
> > > > > 
> > > > > The first thing I care is working U-Boot v2015.10 out of the
> > > > > box on my boards. And so I may agree on any temporary
> > > > > solution. I see it as timeout value either being infinite or
> > > > > obviously very high like 60 seconds.
> > > > > 
> > > > > 60 seconds might sound stupid but my thought behind this is to
> > > > > make sure even long transfers succeed. Imagine 100 Mb rootfs
> > > > > or update file downloaded from slow SD-card.
> > > > 
> > > > Transfer of rootfs to SD-card (downloaded to memory via tftp) is
> > > > definitely valid scenario.
> > > > 
> > > > > > > > > > From both points of view for keeping history
> > > > > > > > > > clean (compared to stacked fixes/workarounds) and
> > > > > > > > > > from removal of regression root cause.
> > > > > > > > > 
> > > > > > > > > As I said before - +1 from me.
> > > > > > > > 
> > > > > > > > As I said before, -1 from me. Btw. did anything regress
> > > > > > > > in here? To me, this seems like a newly discovered
> > > > > > > > bug ...
> > > > > > > 
> > > > > > > Yes, this is a bug. We had similar problem with Samsung's
> > > > > > > SDHCI, before we switched to dw_mmc. This issue is new at
> > > > > > > dw_mmc.
> > > > > > > 
> > > > > > > > > > It's not that I like to have infinite loops but
> > > > > > > > > > given previous implementation worked fine for
> > > > > > > > > > people in the previous U-Boot release.
> > > > > > > > > 
> > > > > > > > > Good justification
> > > > > > > > 
> > > > > > > > It is never a justified to return to a potentially
> > > > > > > > problematic version
> > > > > > > 
> > > > > > > IMHO revering the change (before the release) is from the
> > > > > > > software development point of view better solution than
> > > > > > > adding some heuristic delta to timeout.
> > > > > > > 
> > > > > > > > for the sake of getting some sort of crappy hardware
> > > > > > > > operational.
> > > > > > > 
> > > > > > > Unfortunately this "crappy hardware" is pervasive and we
> > > > > > > cannot do anything about it.
> > > > > > > 
> > > > > > > To sum up (my point of view):
> > > > > > > 1. The best would be to revert the patch - but if simple
> > > > > > > "git revert" is not working then,
> > > > > 
> > > > > Well even if clean revert won't work we may do manual tweaks
> > > > > so that functionally it is "revert". If of any interest I may
> > > > > come up with that sort of patch.
> > > > > 
> > > > > > > 2. We should increase the timeout (with my patch) for
> > > > > > > v2015.10 release
> > > > > 
> > > > > If everybody is OK with that let's go do it. Because release
> > > > > is around the corner and I don't want to explain each and
> > > > > every user how to fix their new problem.
> > > > 
> > > > v2015.10 correctness is crucial here.
> 
> Yes.
> 
> > > > > > Let's do this for the sake of crappy cards.
> > > > > > 
> > > > > > > 3. After release we can devise some kind of solution
> > > > > > > 4. It is a good topic for U-boot's minisummit discussion
> > > > > > > at Dublin
> > > > > > > 
> > > > > > > Marek, Alexey, Tom, Pantelis what do you think?
> > > > > > 
> > > > > > I think yes.
> > > > > 
> > > > > What's important we need to make sure Tom is aware of this
> > > > > situation and he won't cut a release until our fix is in place
> > > > > and all involved parties reported their happiness.
> > > > 
> > 
> > 
> > > > I also think that Tom should speak up on this issue.
> > 
> > Tom, could you give your opinion on this issue?
> 
> Well, is there a concensus patch now?
> 

There isn't a consensus patch.

There are two options:
1. Try to revert changes (which remove endless loop)

2. Increase the delay to e.g. 4 minutes (as it is done in Linux) before
v2015.10 and provide correct solution (based on internal SD card
information) after the release.


-- 
Best regards,

Lukasz Majewski

Samsung R&D Institute Poland (SRPOL) | Linux Platform Group


More information about the U-Boot mailing list