[U-Boot] [PATCH 1/1] core/uclass: uclass_get_device_tail: always set devp

Simon Glass sjg at chromium.org
Wed Apr 19 03:37:44 UTC 2017


Hi,

On 18 April 2017 at 21:03, Andreas Färber <afaerber at suse.de> wrote:
> Hi Simon,
>
> Am 19.04.2017 um 02:12 schrieb Simon Glass:
>> On 18 April 2017 at 12:44, Heinrich Schuchardt <xypron.glpk at gmx.de> wrote:
>>> Set devp even if probing fails.
>>>
>>> Without the patch the following problem occurs:
>>> If the first block device is not probed successfully no block
>>> device is passed by bootefi to the EFI executable.
>>>
>>> The problem was reported by Andreas Färber in
>>> https://lists.denx.de/pipermail/u-boot/2017-April/287432.html
>>>
>>> For testing I used an odroid-c2 with a dts including
>>> &sd_emmc_a {
>>>         status = "okay";
>>> }
>>> This device does not exist on the board and cannot be initialized.
>>
>> Thanks for finding this. Which function is calling this? Can you
>> please explain the call sequence to hit this problem? I am worried
>> that something else is wrong.
>
> It is lib/efi_loader/efi_disk.c:efi_disk_register() that is calling
> uclass_first_device() and uclass_next_device(). Based on the output we
> had concluded that uclass_first_device() fails already - therefore Alex
> CC'ed you.
>
> Unfortunately I find the function interactions inside uclass.c rather
> complex and unintuitive, so I am truely amazed at Heinrich's findings.
> (Passing the ret code from one function to the next just to return?!)
>
> Based on his patch we can assume that uclass_find_first_device() set dev
> to a non-NULL value, otherwise we wouldn't end up in
> uclass_get_device_tail().
>
> So if you're saying this patch is wrong, then that would leave
> device_probe(). The actual meson_mmc_probe() function you had given a
> late Reviewed-by, and it always returns 0.
>
> Jaehoon had asked about cfg->voltages in meson_mmc_probe(), but I don't
> see how that would lead to probe failure here? Whether the values are
> correct I have no idea.
>
> The error message "mmc_init: -95, time 1807", which appears to be a
> result of this iteration/probe process, comes from
> drivers/mmc/mmc.c:mmc_init(), which is called from
> mmc-uclass.c:mmc_blk_probe(). mmc_init() calls mmc_start_init(), which
> returns this -EOPNOTSUPP if sd_send_op_cond return -ETIMEDOUT and
> mmc_send_op_cond() failed. However, our output does not show "Card did
> not respond to voltage select!", which it should in that case. So the
> error must be coming from elsewhere. sd_send_op_cond() may return
> -EOPNOTSUPP through mmc_start_init(), and so does mmc_complete_op_cond()
> through mmc_complete_init(). I would expect either to fail, as in my
> case it's an SDIO, and in Heinrich's case it's not connected to
> anything. So I don't think MMC is at fault here.
>
> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/arch/arm64/boot/dts/amlogic/meson-gxbb-vega-s95.dtsi
>
> See sd_mmc_a, which uses an mmc-pwrseq that depends on a pwm-clock,
> neither of which have any support in U-Boot, nor is it actually needed
> for booting.
>
> In any case, failure to probe one device should not break the iteration
> of other devices. I.e., if a device fails to probe then instead of
> returning from uclass_{first,next}_device() we should look at the next
> device until we find one that's okay or have reached the end of the
> list. How to best implement that in uclass.c I'm less sure of.
>
> If I am right you could test this on any board with multiple, e.g., blk
> devices where a first or non-last device returns an error code from its
> probe, i.e. try changing return 0 in some probe function based on a
> static variable not yet being initialized or something. Probably you
> could even write some tiny sandbox based test, without actual hardware.
>
> IIUC this patch changes behavior from iterating over no devices to
> iterating over all devices including ones not probed successfully? And
> what you intended was to instead iterate over only the probed devices?

I think this is a genuine bug, but exactly where is less obvious.

The uclass function don't return a device if they get an error. So at
present you need to iterate through the uclass if the intention is to
continue after error. That would be uclass_foreach_dev(). But before
using the device, device_probe() needs to be called since the
iteration does not probe it automatically. That is why people normally
use uclass_first/next_device(), since it probes as it goes.

However in this case I think it is reasonable to change
uclass_first/next_device() to always return the device, even on error.
That can best be done by changing those functions rather than
uclass_get_device_tail(), which is shared by many functions. We should
not change uclass_get_device_tail() since what is does is correct for
all other uses (I think).

In that case, the function comments for the first/next functions
should be updated to indicate the new behaviour, and a test created,
perhaps in test/dm/core.c.

You were asking about uclass_get_device_tail(). This is common code
and is put into a function to ensure consistent behaviour across all
functions that return an error code and a device. Consistency is very
important with driver model since things can get version confusing
when something odd happens in the bowels of the system, as this bug
has shown.

Regards,
Simon


More information about the U-Boot mailing list