[gitdm PATCH 2/2] logparser.py: Try and be more robust with unicode handling
Tom Rini
trini at konsulko.com
Tue Jul 12 13:05:25 CEST 2022
On Tue, Jul 12, 2022 at 04:58:46AM -0600, Simon Glass wrote:
> On Thu, 7 Jul 2022 at 13:22, Tom Rini <trini at konsulko.com> wrote:
> >
> > Given the sometimes oddly formatted data that can come through when
> > removing code, we need to be as flexible as possible when handling it.
> > Set our encoding to unicode_escape and if we still run into a problem,
> > it's likely going to be OK to ignore it.
> >
> > Signed-off-by: Tom Rini <trini at konsulko.com>
> > ---
> > I've emailed this to Jonathan Corbet as well as he's the upstream for
> > the project, and this does work for me. But I'm not a python guru by
> > any means. But trying to run the stats for v2022.04..v2022.07-rc6 blows
> > up in places otherwise.
> >
> > logparser.py | 1 +
> > 1 file changed, 1 insertion(+)
>
> Reviewed-by: Simon Glass <sjg at chromium.org>
>
> BTW I have found that using binary is helpful in many places, then
> converting to UTF-8 when displaying things.
>
>
> >
> > diff --git a/logparser.py b/logparser.py
> > index efbc72f868eb..d5906e97689d 100644
> > --- a/logparser.py
> > +++ b/logparser.py
> > @@ -37,6 +37,7 @@ class LogPatchSplitter:
> >          self.fd = fd
> >          self.buffer = None
> >          self.patch = []
> > +        sys.stdin.reconfigure(encoding='unicode_escape', errors='ignore')
> >
> >      def __iter__(self):
> >          return self
I followed up with Jonathan on this, but hadn't yet done so on the list.
unicode_escape avoids the crash, but the decoded results don't read right.
It turned out utf-8 was the right encoding; the first time I tried testing
it I had some other, unrelated problem locally.
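
For anyone following along, here is a minimal sketch of why the two
encodings behave differently. The sample input, the read_lines_binary
helper and the choice of errors='replace' are assumptions made for
illustration; none of this is taken from gitdm itself.

```python
import io

# Hypothetical sample: a log line with a non-ASCII name and a literal
# backslash-n, the kind of thing that shows up when code is removed.
raw = "Author: Pál Example <pal@example.com> \\n in a string\n".encode("utf-8")

# unicode_escape maps each byte to Latin-1 and interprets backslash escapes,
# so UTF-8 multi-byte characters turn into mojibake and the literal
# backslash-n is turned into a real newline.
escaped = raw.decode("unicode_escape", errors="ignore")

# utf-8 round-trips the text unchanged.
decoded = raw.decode("utf-8", errors="replace")

# The same idea applied to a text stream, mirroring the reconfigure() call in
# the patch. A BytesIO stands in for stdin here (assumption). The encoding
# must be changed before any data has been read from the stream.
stream = io.TextIOWrapper(io.BytesIO(raw + b"bad byte: \xff\n"), encoding="ascii")
stream.reconfigure(encoding="utf-8", errors="replace")
text = stream.read()


def read_lines_binary(fd):
    """Hypothetical helper: keep the data as bytes, decode only when needed."""
    for line in fd:
        yield line.decode("utf-8", errors="replace")


lines = list(read_lines_binary(io.BytesIO(raw + b"bad byte: \xff\n")))
```

The last helper follows the binary-first approach Simon mentions: invalid
bytes become U+FFFD instead of raising, and valid text is left alone.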
--
Tom