[PATCH 08/10] env: Use strncpy() instead of ad-hoc code to copy variable value

Tue Oct 12 14:45:01 CEST 2021

On 12/10/2021 14.00, Marek Behún wrote:
> On Tue, 12 Oct 2021 13:41:43 +0200
> Rasmus Villemoes <rasmus.villemoes at prevas.dk> wrote:
> 
>> On 12/10/2021 13.04, Marek Behún wrote:

>> I understand why you want to avoid an open-coded copying, but strncpy
>> is almost certainly the wrong tool for the job. Apart from not
>> revealing anything about the actual length of the source string, it
>> also has the well-known defect of zeroing the tail of the buffer in
>> the (usual) case where the source is shorter than the buffer.
> 
> U-Boot's strncpy does not do that:
>  * Note that unlike userspace strncpy, this does not %NUL-pad the
> buffer.
>  * However, the result is not %NUL-terminated if the source exceeds
>  * @count bytes.

Yeah, I just saw that. That's extremely dangerous, and can cause silent
wrong code generation. The compiler knows the semantics of various
standard functions, including strncpy, and can optimize accordingly. For
example, look at

https://godbolt.org/z/855s7hsEE

In g1, we see that gcc has recognized strncpy() and open-codes it by two
immediate stores (6384738 == 0x616c62 == ascii "bla"), including the
trailing zero padding. Then in g2, gcc further realizes that the
conditional is always false, hence folds the whole thing to "return -1"
(though it does not manage to elide the then-dead stores to buf).

So, I wouldn't bet that in cases where the compiler _does_ end up
emitting a call to strncpy() that it doesn't somehow also rely on the
implementation actually honouring the semantics required by the C standard.

For another example, see

https://godbolt.org/z/W6W3od

The only way it is ok for gcc to compile copy_and_do to the exact same
machine code as copy_and_do2 is because it knows memcpy returns its dst
argument, hence (in the x86-64 case) it can prepare the first argument
to do_stuff via a "mov rdi, rax", instead of spilling and reloading p.

Well, U-Boot seems to pessimize the build with -fno-builtin making much
of the above moot. But:

On pa-risc, until very recently, "prctl(PR_SET_NAME, "x"); ....;
prctl(PR_GET_NAME)" was a simple way to read 14 bytes of kernel stack
from userspace because pa-risc's strncpy didn't do that zeroing. Of
course U-Boot doesn't really have problems with that kind of info-leak,
but given the amount of code U-Boot imports from other projects, not
least linux, there's certainly a fair chance that some of that code
relies on strncpy's semantics for correctness.

Rasmus