[U-Boot] [U-Boot, v0, 07/20] vsprintf.c: add wide string (%ls) support

Tue Aug 8 22:03:50 UTC 2017

On 08/04/2017 09:31 PM, Rob Clark wrote:
> This is convenient for efi_loader which deals a lot with utf16.
> 
> Signed-off-by: Rob Clark <robdclark at gmail.com>

Please put this patch together with
[PATCH] vsprintf.c: add GUID printing
https://patchwork.ozlabs.org/patch/798362/
and
[PATCH v0 06/20] common: add some utf16 handling helpers
https://patchwork.ozlabs.org/patch/797968/
into a separate patch series.

These three patches can be reviewed independently of the efi_loader
patches and probably will not be integrated via the efi-next tree.

> ---
>  lib/vsprintf.c | 30 ++++++++++++++++++++++++++++--
>  1 file changed, 28 insertions(+), 2 deletions(-)
> 
> diff --git a/lib/vsprintf.c b/lib/vsprintf.c
> index 874a2951f7..0c40f852ce 100644
> --- a/lib/vsprintf.c
> +++ b/lib/vsprintf.c
> @@ -17,6 +17,7 @@
>  #include <linux/ctype.h>
>  
>  #include <common.h>
> +#include <charset.h>
>  
>  #include <div64.h>
>  #define noinline __attribute__((noinline))
> @@ -270,6 +271,26 @@ static char *string(char *buf, char *end, char *s, int field_width,
>  	return buf;
>  }
>  
> +static char *string16(char *buf, char *end, u16 *s, int field_width,
> +		int precision, int flags)
> +{
> +	u16 *str = s ? s : L"<NULL>";

Please do not use the L-notation here, as it requires -fshort-wchar to
produce 16-bit characters. Since we currently cannot switch the complete
project to C11, you cannot use the u-notation either.
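
One possible way to avoid both notations is to spell the placeholder out as
an explicit array of 16-bit code units. This is only a minimal sketch: the
u16 typedef stands in for U-Boot's own type, and the names null_str16 and
str16_or_null are hypothetical, not part of the patch.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint16_t u16;	/* assumption: stands in for U-Boot's u16 */

/* "<NULL>" as explicit 16-bit code units; needs neither L"" nor u"" */
static const u16 null_str16[] = { '<', 'N', 'U', 'L', 'L', '>', 0 };

/* Hypothetical helper: return s, or the placeholder if s is NULL */
static const u16 *str16_or_null(const u16 *s)
{
	return s ? s : null_str16;
}
```

Since the array is built from plain character constants, it compiles the
same way regardless of the width of wchar_t.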

> +	int len = utf16_strnlen(str, precision);
> +	u8 utf8[len * MAX_UTF8_PER_UTF16];

Didn't you forget one byte for the terminating \0 here?

This is what strnlen does:

The strnlen() function returns the number of characters in the string
pointed to by s, **excluding** the terminating null byte ('\0'), but at
most maxlen.

I would expect a utf16_strnlen function to exclude the terminating null
word in the same way.
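
To illustrate the sizing concern, here is a minimal sketch of a length
function with strnlen()-like semantics and the buffer size that follows
from it. The value 3 for MAX_UTF8_PER_UTF16 is an assumption made for this
example; the real definition and the real utf16_strnlen are in the other
patch and may differ. The names utf16_strnlen_sketch and utf8_buf_size are
hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint16_t u16;	/* assumption: stands in for U-Boot's u16 */

/* Assumed value for illustration; the real one is in the other patch */
#define MAX_UTF8_PER_UTF16 3

/*
 * Hypothetical strnlen() analogue: count the 16-bit units before the
 * terminating null word, but never more than maxlen.
 */
static size_t utf16_strnlen_sketch(const u16 *s, size_t maxlen)
{
	size_t i;

	for (i = 0; i < maxlen && s[i]; i++)
		;
	return i;
}

/* Bytes needed for the UTF-8 conversion, including the trailing '\0' */
static size_t utf8_buf_size(size_t len16)
{
	return len16 * MAX_UTF8_PER_UTF16 + 1;
}
```

With these semantics an empty string yields a length of 0, so a buffer of
len * MAX_UTF8_PER_UTF16 bytes would leave no room for the terminator.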

> +	int i;
> +
> +	*utf16_to_utf8(utf8, str, len) = '\0';
> +
> +	if (!(flags & LEFT))
> +		while (len < field_width--)
> +			ADDCH(buf, ' ');
> +	for (i = 0; i < len; ++i)
> +		ADDCH(buf, utf8[i]);
> +	while (len < field_width--)
> +		ADDCH(buf, ' ');
> +	return buf;
> +}
> +
>  #ifdef CONFIG_CMD_NET
>  static const char hex_asc[] = "0123456789abcdef";
>  #define hex_asc_lo(x)	hex_asc[((x) & 0x0f)]
> @@ -528,8 +549,13 @@ repeat:
>  			continue;
>  
>  		case 's':
> -			str = string(str, end, va_arg(args, char *),
> -				     field_width, precision, flags);
> +			if (qualifier == 'l') {

In the C standard, %ls refers to a wchar_t string, whose character width is
implementation dependent. There is no qualifier for 16-bit characters.
Couldn't we use %us here, in reference to the u-notation ( u"MyString" )?
This would leave the path open for a standard-compliant %ls.
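
The width issue can be checked with a few lines of C. This is a minimal
sketch with a hypothetical helper name; it only reports whether wchar_t
happens to be as wide as a 16-bit unit in the current build. With GCC on
Linux, wchar_t is typically 32 bits wide unless -fshort-wchar is given.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical check: returns 1 if wchar_t has the same size as a
 * 16-bit code unit in this build, 0 otherwise.
 */
static int wchar_is_16bit(void)
{
	return sizeof(wchar_t) == sizeof(uint16_t);
}
```

Only when this returns 1 could a u16 string and a wchar_t string be
treated alike by the same conversion specifier.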

Best regards

Heinrich

> +				str = string16(str, end, va_arg(args, u16 *),
> +					       field_width, precision, flags);
> +			} else {
> +				str = string(str, end, va_arg(args, char *),
> +					     field_width, precision, flags);
> +			}
>  			continue;
>  
>  		case 'p':
>