[U-Boot] [PATCH] w1: fix occasional enumeration failure

Martin Fuzzey martin.fuzzey at flowbird.group
Tue Nov 27 10:58:22 UTC 2018


Hi Eugen,

On 27/11/2018 10:47, Eugen.Hristev at microchip.com wrote:
>
> On 23.11.2018 11:53, Martin Fuzzey wrote:
>> Sometimes enumeration fails (about 1 in 50 times on my custom board).
>>
>> The underlying reason is probably electrical but Linux does not have
>> the problem.
>>
>> Comparing the Linux / u-boot implementations shows that Linux
>> retries the error case whereas u-boot aborts early.
>>
>> Removing the early abort in u-boot fixes the problem.
>>
>> Signed-off-by: Martin Fuzzey <martin.fuzzey at flowbird.group>
>> ---
>>    drivers/w1/w1-uclass.c | 4 ----
>>    1 file changed, 4 deletions(-)
>>
>> diff --git a/drivers/w1/w1-uclass.c b/drivers/w1/w1-uclass.c
>> index 5544b19..ad86bf6 100644
>> --- a/drivers/w1/w1-uclass.c
>> +++ b/drivers/w1/w1-uclass.c
>> @@ -84,10 +84,6 @@ static int w1_enumerate(struct udevice *bus)
>>    			rn |= (tmp64 << i);
>>    		}
>>    
>> -		/* last device or error, aborting here */
>> -		if ((triplet_ret & 0x03) == 0x03)
>> -			last_device = true;
>> -
> Hello Martin,
>
> Your fix will basically make U-boot try again the same ID ? What happens
> if we keep getting errors at this triplet, we will be stuck in a loop ?
> When are we going to abort searching this ID if we keep getting errors ?


Yes it will retry again the same 64 bit sequence after a bus reset.

So theoretically yes we could get stuck in a loop if the same error 
keeps occuring at the same point.

However I'm not really convinced that it will occur in real life except 
for completely broken W1 devices or contrived scenarios (though you 
could argue that this would allow a malicious W1 device to do a DOS 
attack on u-boot...)

The bus reset (which actually cuts the power to the bus and hence hard 
resets all the devices on it) should ensure that any transient weirdness 
due to bus noise or software bugs on the W1 devices themselves won't 
cause us to loop and if the bus itself breaks we will exit at the 
beginning of the loop on the error return from ops->bus_reset() due to 
non detection of the presence pulse.


That said It would be easy to add a max retry counter to ensure that the 
pathological case cannot occur. But Linux doesn't do that and I'm not 
aware of any infinite loop cases there.

If you think we need the retry limit though I'll send a V2 with that.


Regards,

Martin




More information about the U-Boot mailing list