[PATCH] riscv: set the width of the physical address/size data type based on arch

Wed May 7 17:48:39 CEST 2025

On Wed, May 07, 2025 at 03:11:38PM +0530, Sughosh Ganu wrote:
> On Wed, 7 May 2025 at 13:19, Sughosh Ganu <sughosh.ganu at linaro.org> wrote:
> >
> > On Tue, 6 May 2025 at 16:35, Heinrich Schuchardt
> > <heinrich.schuchardt at canonical.com> wrote:
> > >
> > >
> > >
> > > Sughosh Ganu <sughosh.ganu at linaro.org> schrieb am Di., 6. Mai 2025, 12:50:
> > >>
> > >> On Tue, 6 May 2025 at 15:19, Heinrich Schuchardt
> > >> <heinrich.schuchardt at canonical.com> wrote:
> > >> >
> > >> > On 5/6/25 11:24, Sughosh Ganu wrote:
> > >> > > U-Boot has support for both the 32-bit and 64-bit RISC-V platforms.
> > >> > > Set the width of the phys_{addr,size}_t data types based on the
> > >> > > register size of the architecture.
> > >> > >
> > >> > > Currently, even the 32-bit RISC-V platforms have 64-bit
> > >> > > phys_{addr,size}_t data types. This causes issues on the 32-bit
> > >> > > platforms, where the upper 32 bits of variables of these types
> > >> > > can have junk data, and that can cause all kinds of side effects.
> > >> >
> > >> > How could it be that the upper 32 bits have junk data?
> > >> >
> > >> > When we convert from a shorter variable the compiler should fill the
> > >> > upper bits with zero.
> > >>
> > >> That does not seem to be happening. The efi_fit test fails on the
> > >> qemu-riscv32 platform, when attempting to boot the OS from the FIT
> > >> image.
> > >>
> > >> These are the values of the base address that I see in the
> > >> _lmb_alloc_addr() function.
> > >>
> > >> _lmb_alloc_addr: 755, rgn => -1, base => 0x1a1c0e00802000bc, size => 0x50b1
> > >
> > >
> > > As you are running on QEMU you should be able to track down where the value is actually assigned with gdb. This could for instance be a buffer overrun.
> >
> > I was able to hook up gdb and re-create the issue. What I observe is
> > that when the lmb_allocate_mem() function is called, the base address
> > parameter, which is 64 bits wide, shows a value with the upper 32 bits
> > not zeroed out. So, this looks like a compiler issue. FWIW, this shows
> > up with the compiler being used in the CI environment, as well as the
> > one that I am using.
> 
> Thinking a bit on this, I don't think this is a compiler issue. The
> problem is that we are using the ulong type in some places (especially
> in the boot* commands) for storing the address values, while we use
> phys_addr_t in other places. And because this is a pointer being
> passed across functions, when the data type that the pointer is
> pointing to changes from a 32-bit to a 64-bit value, the upper 32 bits
> get considered as part of the value. So the issue is that we use ulong
> in some places, and phys_addr_t in others, for storing the addresses.
> 
> But I think that the solution for this (at least for now) is to set
> phys_addr_t based on the underlying architecture. In the long run,
> there needs to be an audit of the usage of ulong for storing
> addresses, and that needs to be changed to phys_addr_t.

Thanks for digging into this more. I agree with what you're saying here
for both the short and long term.
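
For anyone following along, the ulong vs. phys_addr_t mismatch described
above can be sketched in plain C. This is only an illustration, not code
from U-Boot: the type names rv32_ulong and wide_phys_addr, both helper
functions, and the slots[] layout are hypothetical, and the second array
element is simply chosen to mirror the upper half of the logged base
value. What actually sits next to the address in memory on qemu-riscv32
is not known from this thread.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins: ulong is 32 bits wide on RV32, while
 * phys_addr_t was 64 bits wide before the patch. */
typedef uint32_t rv32_ulong;
typedef uint64_t wide_phys_addr;

/* Passing the address by value: the conversion zero-extends, so the
 * upper 32 bits of the result are always zero. */
static wide_phys_addr widen_by_value(rv32_ulong addr)
{
	return addr;
}

/* Passing a pointer: the callee reads sizeof(wide_phys_addr) bytes from
 * a location where only a 32-bit value was stored. memcpy() stands in
 * for dereferencing a mistyped pointer so that this sketch itself stays
 * well defined. */
static wide_phys_addr read_through_wide_pointer(const void *p)
{
	wide_phys_addr v;

	memcpy(&v, p, sizeof(v));
	return v;
}

/* Hypothetical memory layout: a 32-bit address followed by unrelated
 * data that happens to live right after it. */
static const rv32_ulong slots[2] = { 0x802000bc, 0x1a1c0e00 };
```

On a little-endian machine, widen_by_value(slots[0]) gives 0x802000bc,
while read_through_wide_pointer(slots) gives 0x1a1c0e00802000bc, because
the neighbouring word ends up in the upper 32 bits. Making phys_addr_t
match the register width removes the size mismatch for now; auditing the
ulong users covers the long term.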

-- 
Tom