[U-Boot] [PATCH 1/2] mmc: dw_mmc: Increase timeout to 20 seconds

Lukasz Majewski l.majewski at samsung.com
Thu Sep 17 16:43:33 CEST 2015


Hi Tom,

> On Monday, September 14, 2015 at 01:22:20 PM, Lukasz Majewski wrote:
> > Hi Alexey,
> > 
> > > Hi Marek, Lukasz,
> > > 
> > > On Sun, 2015-09-13 at 16:00 +0200, Marek Vasut wrote:
> > > > On Sunday, September 13, 2015 at 12:03:18 PM, Lukasz Majewski
> > > > wrote:
> > > > > Hi Marek,
> > > > 
> > > > Hi,
> > > > 
> > > > [...]
> > > > 
> > > > > > > > Still we need to fix regression first with virtually
> > > > > > > > infinite timeout :) I would even thing that simple
> > > > > > > > revert of Marek's patch may make sense for now.
> > > > > > > 
> > > > > > > +1 - unfortunately there were some other patches applied
> > > > > > > to this particular patch. Simple revert might be a bit
> > > > > > > tricky here.
> > > > > > 
> > > > > > -1 - In case the card gets removed during the DMA transfer
> > > > > > and the board doesn't have a watchdog, it will get stuck
> > > > > > indefinitelly.
> > > > > 
> > > > > I'm just wondering here - why the indefinite loop was working
> > > > > previously? Was anybody complaining (on the ML) about the
> > > > > problem of removing the SD card when some operation is
> > > > > ongoing?
> > > > 
> > > > It worked for me for all the workloads I used. Noone was
> > > > complaining.
> > > 
> > > The same story here - previous code with infinite loop was
> > > working for my boards. And now I do see a problem with pretty
> > > simple scenario that we do use in our products.
> > > 
> > > > > The problem with potential removal of SD card (after booting
> > > > > the board) is with us for quite long time. Even with
> > > > > indefinite loop (without your patch) we also could "hang" the
> > > > > board if the SD card was removed during a transfer.
> > > > 
> > > > Which is why we should weed out the unbounded loops.
> > > > 
> > > > > > We
> > > > > > absolutelly don't want this sort of behavior in U-Boot. I
> > > > > > understand that this is the easiest way for everyone to
> > > > > > achieve some sort of "working" solution, but it is
> > > > > > definitelly not the correct one. While I do agree to
> > > > > > increasing the timeout, I do not agree to unbounded loops,
> > > > > > sorry.
> > > > > 
> > > > > We have agreed to not agree :-)
> > > > 
> > > > Yes :-)
> > > 
> > > The first thing I care is working U-Boot v2015.10 out of the box
> > > on my boards. And so I may agree on any temporary solution. I see
> > > it as timeout value either being infinite or obviously very high
> > > like 60 seconds.
> > > 
> > > 60 seconds might sound stupid but my thought behind this is to
> > > make sure even long transfers succeed. Imagine 100 Mb rootfs or
> > > update file downloaded from slow SD-card.
> > 
> > Transfer of rootfs to SD-card (downloaded to memory via tftp) is
> > definitely valid scenario.
> > 
> > > > > > > > From both points of view for keeping history
> > > > > > > > clean (compared to stacked fixes/workarounds) and from
> > > > > > > > removal of regression root cause.
> > > > > > > 
> > > > > > > As I said before - +1 from me.
> > > > > > 
> > > > > > As I said before, -1 from me. Btw. did anything regress in
> > > > > > here? To me, this seems like a newly discovered bug ...
> > > > > 
> > > > > Yes, this is a bug. We had similar problem with Samsung's
> > > > > SDHCI, before we switched to dw_mmc. This issue is new at
> > > > > dw_mmc.
> > > > > 
> > > > > > > > It's not that I like to have infinite loops but given
> > > > > > > > previous implementation worked fine for people in the
> > > > > > > > previous U-Boot release.
> > > > > > > 
> > > > > > > Good justification
> > > > > > 
> > > > > > It is never a justified to return to a potentially
> > > > > > problematic version
> > > > > 
> > > > > IMHO revering the change (before the release) is from the
> > > > > software development point of view better solution than
> > > > > adding some heuristic delta to timeout.
> > > > > 
> > > > > > for the sake of getting some sort of crappy hardware
> > > > > > operational.
> > > > > 
> > > > > Unfortunately this "crappy hardware" is pervasive and we
> > > > > cannot do anything about it.
> > > > > 
> > > > > To sum up (my point of view):
> > > > > 1. The best would be to revert the patch - but if simple "git
> > > > > revert" is not working then,
> > > 
> > > Well even if clean revert won't work we may do manual tweaks so
> > > that functionally it is "revert". If of any interest I may come
> > > up with that sort of patch.
> > > 
> > > > > 2. We should increase the timeout (with my patch) for v2015.10
> > > > > release
> > > 
> > > If everybody is OK with that let's go do it. Because release is
> > > around the corner and I don't want to explain each and every user
> > > how to fix their new problem.
> > 
> > v2015.10 correctness is crucial here.
> > 
> > > > Let's do this for the sake of crappy cards.
> > > > 
> > > > > 3. After release we can devise some kind of solution
> > > > > 4. It is a good topic for U-boot's minisummit discussion at
> > > > > Dublin
> > > > > 
> > > > > Marek, Alexey, Tom, Pantelis what do you think?
> > > > 
> > > > I think yes.
> > > 
> > > What's important we need to make sure Tom is aware of this
> > > situation and he won't cut a release until our fix is in place
> > > and all involved parties reported their happiness.
> > 


> > I also think that Tom should speak up on this issue.

Tom, could you give your opinion on this issue?


> 
> btw. I think I know why SoCFPGA never triggered this issue, it has
> CONFIG_SYS_MMC_MAX_BLK_COUNT set to 256 so a large transfer was
> always broken to smaller chunks.



-- 
Best regards,

Lukasz Majewski

Samsung R&D Institute Poland (SRPOL) | Linux Platform Group


More information about the U-Boot mailing list