[PATCH v2 6/7] riscv: Ensure gp is NULL or points to valid data

Sean Anderson seanga2 at gmail.com
Wed Sep 16 12:56:30 CEST 2020


On 9/15/20 10:23 PM, Rick Chen wrote:
> Hi Sean
> 
>> From: U-Boot [mailto:u-boot-bounces at lists.denx.de] On Behalf Of Sean Anderson
>> Sent: Monday, September 14, 2020 10:23 PM
>> To: u-boot at lists.denx.de
>> Cc: Alan Quey-Liang Kao(高魁良); Leo Yu-Chi Liang(梁育齊); Heinrich Schuchardt; Bin Meng; Rick Chen; Sean Anderson
>> Subject: [PATCH v2 6/7] riscv: Ensure gp is NULL or points to valid data
>>
>> This allows code to use a construct like `if (gd & gd->...) { ... }` when
>> accessing the global data pointer. Without this change, it was possible for
>> a very early trap to cause _exit_trap to read arbitrary memory. This could
>> cause a second trap, preventing show_regs from being printed.
>>
>> XIP cannot use locks because flash is not writable. This leaves it
>> vulnerable to the same class of bugs regarding already-pending IPIs as
>> before this series. Fixing that would require finding another method of
>> synchronization, which is outside the scope of this series.
>>
>> Fixes: 7c6ca03eae ("riscv: additional crash information")
>> Signed-off-by: Sean Anderson <seanga2 at gmail.com>
>> ---
>>
>> Changes in v2:
>> - Set gp early with XIP
>>
>>  arch/riscv/cpu/start.S      | 26 +++++++++++++++++++++++---
>>  arch/riscv/lib/interrupts.c |  3 ++-
>>  2 files changed, 25 insertions(+), 4 deletions(-)
>>
>> diff --git a/arch/riscv/cpu/start.S b/arch/riscv/cpu/start.S
>> index 66ca1c7020..a16af79fbe 100644
>> --- a/arch/riscv/cpu/start.S
>> +++ b/arch/riscv/cpu/start.S
>> @@ -47,6 +47,13 @@ _start:
>>         mv      tp, a0
>>         mv      s1, a1
>>
>> +       /*
>> +        * Set the global data pointer to a known value in case we get a very
>> +        * early trap. The global data pointer will be set its actual value only
>> +        * after it has been initialized.
>> +        */
>> +       mv      gp, zero
>> +
>>         la      t0, trap_entry
>>         csrw    MODE_PREFIX(tvec), t0
>>
>> @@ -85,10 +92,10 @@ call_board_init_f_0:
>>         jal     board_init_f_alloc_reserve
>>
>>         /*
>> -        * Set global data pointer here for all harts, uninitialized at this
>> -        * point.
>> +        * Save global data pointer for later. We don't set it here because it
>> +        * is not initialized yet.
>>          */
>> -       mv      gp, a0
>> +       mv      s0, a0
>>
>>         /* setup stack */
>>  #if CONFIG_IS_ENABLED(SMP)
>> @@ -109,6 +116,12 @@ call_board_init_f_0:
>>         amoswap.w s2, t1, 0(t0)
>>         bnez    s2, wait_for_gd_init
>>  #else
>> +       /*
>> +        * FIXME: gp is set before it is initialized. If an XIP U-Boot ever
>> +        * encounters a pending IPI on boot it is liable to jump to whatever
>> +        * memory happens to be in ipi_data.addr on boot.
>> +        */
>> +       mv      gp, s0
>>         bnez    tp, secondary_hart_loop
>>  #endif
>>
>> @@ -133,6 +146,13 @@ wait_for_gd_init:
>>  1:     amoswap.w.aq t1, t1, 0(t0)
>>         bnez    t1, 1b
>>
>> +       /*
>> +        * Set the global data pointer only when gd_t has been initialized.
>> +        * This was already set by arch_setup_gd on the boot hart, but all other
>> +        * harts' global data pointers gets set here.
>> +        */
>> +       mv      gp, s0
>> +
>
> May I ask some questions ?
> This patch is trying to prevent printing garbage messages form early
> trap or early trap itself ?

It is trying to prevent garbase messages and secondary early traps, but
not the initial early trap. Printf (and specifically puts) uses gd to
determine what function to print with. These functions in turn use gd to
find the serial device, etc. However, before accessing gd, puts first
checks to see if it is non-NULL. This indicates an existing (perhaps
undocumented) assumption that either gd is NULL or it is completely
valid.

> When early trap occurs, how is the status of mtvec ?

mtvec will always point to handle_trap after the first few instructions
executed by U-Boot. Before that, the prior stage's mtvec will be used.
However, because all instructions before setting mtvec are
regicter-register moves, I don't think it is likely for there to be a
trap (unless there is a bug such as enabling RISCV_ISA_C on a processor
which doesn't have that extension).

A more interesting question is "what is the status of gd during an early
trap"? I will compare the situation before and after this patch.

Before this patch, gd either points to unexpected data (because it
retains the value it did from the prior-stage) or points to
uninitialized data (because it has not yet been initialized by
board_init_f_init_reserve) until the hart has acquired
available_harts_lock. This can cause two problems, depending on the
value of gd->flags. If GD_FLG_SERIAL_READY is unset, then some garbage
data will be printed to stdout, but there will not be a second trap.
However, if GD_FLG_SERIAL_READY is set, then puts will try to print with
serial_puts, which will likely cause a second trap.

After this patch, gd is zero up until either a hart has set it in
wait_for_gd_init, or until it is set by arch_init_gd. This prevents its
usage before its data is initialized because both handle_trap and puts
ensure that gd is nonzero before using it. After gd has been set, it
is OK to access it because its data has been cleared (and so flags is
valid).

> How the early trap be handled in handle_trap ?

The same as a late trap, except _debug_uart_init may not have been
called. Fortunately for us, if the prior stage correctly initialized the
sifive uart, then it will work without initialization.

> This method seem a little speculative and complicated to read.
> Otherwise if the board_init_f_init_reserve() be modified in the future
> and the gp will not synchronize any more.
> board_init_f_init_reserve is belong generic flow, it can be touch by
> anyone and anytime.

And then it is their problem for breaking boot on other boards :)

This commit relies on two assumptions from the rest of U-boot. First,
that a check is made that gd is non-NULL before accessing it in code
which can be called very early. Second, that arch_init_gd is called
after gd is cleared. I think the first is likely to be enforced, given
its existance in several functions (in addition to puts, watchdog_reset
also does a check). The second is also likely to be enforced, because gd
needs to be cleared anyway, and the ordering of 3 lines is unlikely to
warrant significant change.

--Sean



More information about the U-Boot mailing list