[PATCH] ufs: core: Fix heap corruption due to out of bounds write

Mon Mar 30 10:03:27 CEST 2026

On 3/30/26 01:11, Marek Vasut wrote:
> The ufshcd_read_string_desc() can perform out of bounds write and
> corrupt heap in case the input utf-16 string contains code points
> which convert to anything more than plain 7-bit ASCII string.
> 
> This occurs because utf16_to_utf8(dst, src, size) in U-Boot behaves
> differently than Linux utf16s_to_utf8s(..., maxlen), but the porting
> process did not take that into consideration. The U-Boot variant of
> the function converts up to $size utf-16 fixed-length 16-bit input
> characters into as many 1..4 Byte long variable-length utf-8 output
> characters. That means for 16 Byte input, the output can be up to 64
> Bytes long. The Linux variant converts up utf-16 input into up to
> $maxlen Bytes worth of utf-8 output, but stops at the $maxlen limit.
> That means for 16 Byte input with maxlen=32, the processing will stop
> after writing 32 output Bytes.
> 
> In case of U-Boot, use of utf16_to_utf8() leads to potential corruption
> of data past the $size Bytes and therefore corruption of surrounding
> content on the heap.
> 
> The fix is as simple, allocate buffer that is sufficient to fit the
> utf-8 string. The rest of the code in ufshcd_read_string_desc() does
> correctly limit the buffer to fit into the DMA descriptor afterward.
> 
> Signed-off-by: Marek Vasut <marek.vasut+renesas at mailbox.org>
> ---
> NOTE: This is for 2026.04, but please do test it on your hardware too.
> ---
> Cc: Bhupesh Sharma <bhupesh.linux at gmail.com>
> Cc: Julien Stephan <jstephan at baylibre.com>
> Cc: Michal Simek <michal.simek at amd.com>
> Cc: Neha Malcom Francis <n-francis at ti.com>
> Cc: Neil Armstrong <neil.armstrong at linaro.org>
> Cc: Padmarao Begari <padmarao.begari at amd.com>
> Cc: Tom Rini <trini at konsulko.com>
> Cc: u-boot at lists.denx.de
> ---
>   drivers/ufs/ufs-uclass.c | 10 +++++++++-
>   1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/ufs/ufs-uclass.c b/drivers/ufs/ufs-uclass.c
> index 81fd431f951..6a51f337e47 100644
> --- a/drivers/ufs/ufs-uclass.c
> +++ b/drivers/ufs/ufs-uclass.c
> @@ -1751,7 +1751,15 @@ static int ufshcd_read_string_desc(struct ufs_hba *hba, int desc_index,
>   			goto out;
>   		}
>   
> -		buff_ascii = kmalloc(ascii_len, GFP_KERNEL);

I think the whole function is a mess, I think we would rewrite with something like this:

...
int max_len;
int ascii_len;

max_len = (desc_len - QUERY_DESC_HDR_SIZE) * 2 + 1;
buff_ascii = kmalloc(max_len, GFP_KERNEL);

ascii_len = utf16_to_utf8(buff_ascii,
		(uint16_t *)&buf[QUERY_DESC_HDR_SIZE], max_len);
...

So we stop having random len, and use the _real_ len returned by utf16_to_utf8.

Neil

> +		/*
> +		 * utf-8 is encoded using up to 4-Bytes per character,
> +		 * however, we only allocate such a buffer because the
> +		 * utf16_to_utf8() converts the entire $ascii_len worth
> +		 * of input characters into up to 4-Byte long utf-8
> +		 * characters. The rest of the function uses only up to
> +		 * $ascii_len bytes of that utf-8 string.
> +		 */
> +		buff_ascii = kmalloc(ascii_len * 4, GFP_KERNEL);
>   		if (!buff_ascii) {
>   			err = -ENOMEM;
>   			goto out;