[U-Boot] [U-Boot, v0, 07/20] vsprintf.c: add wide string (%ls) support

Tue Aug 8 23:20:38 UTC 2017


> Am 09.08.2017 um 00:08 schrieb Heinrich Schuchardt <xypron.glpk at gmx.de>:
> 
>> On 08/09/2017 12:44 AM, Rob Clark wrote:
>>> On Tue, Aug 8, 2017 at 6:03 PM, Heinrich Schuchardt <xypron.glpk at gmx.de> wrote:
>>>> On 08/04/2017 09:31 PM, Rob Clark wrote:
>>>> This is convenient for efi_loader which deals a lot with utf16.
>>>> 
>>>> Signed-off-by: Rob Clark <robdclark at gmail.com>
>>> 
>>> Please, put this patch together with
>>> [PATCH] vsprintf.c: add GUID printing
>>> https://patchwork.ozlabs.org/patch/798362/
>>> and
>>> [PATCH v0 06/20] common: add some utf16 handling helpers
>>> https://patchwork.ozlabs.org/patch/797968/
>>> into a separate patch series.
>>> 
>>> These three patches can be reviewed independently of the efi_loader
>>> patches and probably will not be integrated via the efi-next tree.
>> 
>> I'll resend these as a separate patchset, and just not in next
>> revision of efi_loader patchset that it is a dependency
>> 
>>>> ---
>>>> lib/vsprintf.c | 30 ++++++++++++++++++++++++++++--
>>>> 1 file changed, 28 insertions(+), 2 deletions(-)
>>>> 
>>>> diff --git a/lib/vsprintf.c b/lib/vsprintf.c
>>>> index 874a2951f7..0c40f852ce 100644
>>>> --- a/lib/vsprintf.c
>>>> +++ b/lib/vsprintf.c
>>>> @@ -17,6 +17,7 @@
>>>> #include <linux/ctype.h>
>>>> 
>>>> #include <common.h>
>>>> +#include <charset.h>
>>>> 
>>>> #include <div64.h>
>>>> #define noinline __attribute__((noinline))
>>>> @@ -270,6 +271,26 @@ static char *string(char *buf, char *end, char *s, int field_width,
>>>>      return buf;
>>>> }
>>>> 
>>>> +static char *string16(char *buf, char *end, u16 *s, int field_width,
>>>> +             int precision, int flags)
>>>> +{
>>>> +     u16 *str = s ? s : L"<NULL>";
>>> Please, do not use the L-notation here as it requires -fshort-wchar.
>>> As we currently cannot switch the complete project to C11 you cannot use
>>> the u-notation either.
>> 
>> current plan was to either switch whole project to -fshort-wchar or
>> c11 and rework these patches (as well as a few patches in the
>> efi_loader patchset).  (In the c11 case, I'm not sure what we'll use
>> as the fmt string, since afaict that isn't specified.  We could use %S
>> although that seems to be a deprecated way to do %ls, or something
>> different like %A, I guess)..
>> 
>> how far are we from c11?  If there is stuff I can do to help let me
>> know.  If feasible, I'd rather do that first rather than have a bunch
>> of stuff in vsprintf and elsewhere that needs to be cleaned up later
>> after the switch.
> 
> buildman downloads very old compilers (gcc < 4.8) from kernel.org which
> do not support C11.
> Travis CI uses Ubuntu 14.04 with gcc 4.8.4 which incorrectly throws an
> error for disk/part.c in C11 mode.
> 
> To get things right we would have to
> * build our own cross tool chains based on a current gcc version
> * use our own tool chain in Travis for x86-64 or use a docker
>  container with a current gcc version.
> 
> In the long run heading for C11 would be the right thing to do.
> Until then use an initializer { '<', 'N', 'U', 'L', 'L', '>' }.

Don't forget the , 0 :)

Alex

> It looks ugly but does not consume more bytes once compiled.
> 
> Regards
> 
> Heinrich
> 
>> 
>>> 
>>>> +     int len = utf16_strnlen(str, precision);
>>>> +     u8 utf8[len * MAX_UTF8_PER_UTF16];
>>> 
>>> Didn't you forget 1 byte for \0 here?
>>> 
>>> This is what strlnlen does:
>>> 
>>> The strnlen() function returns the number of characters in the string
>>> pointed to by s, **excluding** the terminating null byte ('\0'), but at
>>> most maxlen.
>>> 
>>> I would expect the exclusion of the terminating null word by an
>>> utf16_strnlen function.
>> 
>> you are right, but fixing the wrong problem.. the code is definitely
>> wrong since length of a utf16 string != length of a utf8 string, and
>> we don't need to append a null terminator.. so my logic below using
>> 'len' is wrong.  I'll fix that in the next version.
>> 
>>>> +     int i;
>>>> +
>>>> +     *utf16_to_utf8(utf8, str, len) = '\0';
>>>> +
>>>> +     if (!(flags & LEFT))
>>>> +             while (len < field_width--)
>>>> +                     ADDCH(buf, ' ');
>>>> +     for (i = 0; i < len; ++i)
>>>> +             ADDCH(buf, utf8[i]);
>>>> +     while (len < field_width--)
>>>> +             ADDCH(buf, ' ');
>>>> +     return buf;
>>>> +}
>>>> +
>>>> #ifdef CONFIG_CMD_NET
>>>> static const char hex_asc[] = "0123456789abcdef";
>>>> #define hex_asc_lo(x)        hex_asc[((x) & 0x0f)]
>>>> @@ -528,8 +549,13 @@ repeat:
>>>>                      continue;
>>>> 
>>>>              case 's':
>>>> -                     str = string(str, end, va_arg(args, char *),
>>>> -                                  field_width, precision, flags);
>>>> +                     if (qualifier == 'l') {
>>> 
>>> %ls refers to wchar with implementation dependent width in the C standard.
>>> There is no qualifier for 16-bit wchar. Couldn't we use %us here in
>>> reference to the u-notation ( u'MyString' ). This would leave the path
>>> open for a standard compliant '%ls'.
>> 
>> hmm, yeah, that is a clever idea, and I like it better than %A or %S..
>> so if we go the c11 route I'll do that.  The c11 committee should have
>> thought of that ;-)
>> 
>> BR,
>> -R
>> 
>>> Best regards
>>> 
>>> Heinrich
>>> 
>>>> +                             str = string16(str, end, va_arg(args, u16 *),
>>>> +                                            field_width, precision, flags);
>>>> +                     } else {
>>>> +                             str = string(str, end, va_arg(args, char *),
>>>> +                                          field_width, precision, flags);
>>>> +                     }
>>>>                      continue;
>>>> 
>>>>              case 'p':
>>>> 
>> 
>