[PATCH v5 27/33] riscv: Fix race conditions when initializing IPI

Lukas Auer lukas at auer.io
Wed Mar 4 16:25:19 CET 2020


On Tue, 2020-03-03 at 16:57 -0500, Sean Anderson wrote:
> On 3/3/20 4:53 PM, Lukas Auer wrote:
> > On Mon, 2020-03-02 at 18:43 -0500, Sean Anderson wrote:
> > > On 3/2/20 6:17 PM, Lukas Auer wrote:
> > > > Don't move this. It is intended to be run before the IPI is cleared.
> > > 
> > > Hm, ok. I think I moved it to after because of the 'if (!smp_function)'
> > > check, but those two don't really need to be done together.
> > > 
> > 
> > Thanks! We had problems with code corruption in some situations,
> > because some secondary harts entered OpenSBI after the main hart while
> > OpenSBI expected all harts to be running OpenSBI by that time. Moving
> > this code block was part of the fix for this situation, see [1].
> > 
> > [1]: 
> > https://gitlab.denx.de/u-boot/u-boot/commit/90ae28143700bae4edd23930a7772899ad259058
> 
> Ah, this makes a lot more sense why it was located where it was.
> 
> > > > >  	/*
> > > > >  	 * Clear the IPI to acknowledge the request before jumping to the
> > > > >  	 * requested function.
> > > > >  	 */
> > > > >  	ret = riscv_clear_ipi(hart);
> > > > >  	if (ret) {
> > > > > -		pr_err("Cannot clear IPI of hart %ld\n", hart);
> > > > > +		pr_err("Cannot clear IPI of hart %ld (error %d)\n", hart, ret);
> > > > >  		return;
> > > > >  	}
> > > > >  
> > > > > +	__smp_mb();
> > > > > +
> > > > > +	smp_function = (void (*)(ulong, ulong, ulong))gd->arch.ipi[hart].addr;
> > > > > +	/*
> > > > > +	 * There may be an IPI raised before u-boot begins execution, so check
> > > > > +	 * to ensure we actually have a function to call.
> > > > > +	 */
> > > > > +	if (!smp_function)
> > > > > +		return;
> > > > > +	log_debug("hart = %lu func = %p\n", hart, smp_function);
> > > > 
> > > > The log messages might be corrupted if multiple harts are calling the
> > > > log function here. I have not looked into the details so this might not
> > > > be an issue. In that case it is fine to keep, otherwise please remove
> > > > it.
> > > 
> > > I ran into this problem a lot when debugging. I ended up implementing a
> > > spinlock around puts/putc. I agree it's probably better to remove this,
> > > but I worry that concurrency bugs will become much harder to track down
> > > without some kind of feedback. (This same criticism applies to the log
> > > message above as well).
> > > 
> > 
> > Especially with your changes, I hope we already have or will soon reach
> > a code robustness level where we won't have too many concurrency bugs
> > in the future. :)
> > Let's remove it for now until the logging backend can handle this
> > cleanly.
> 
> Ok. Should the error message above ("Cannot clear IPI of hart...") also
> be removed? I found it tended to corrupt the log output if it was ever
> triggered.
> 

Even though it's not ideal, we should keep it for now. Otherwise we
don't have a way to get notified about the error.

Thanks,
Lukas


More information about the U-Boot mailing list