[U-Boot] [PATCH] patman: encode CC list to UTF-8

Dr. Philipp Tomsich philipp.tomsich at theobroma-systems.com
Tue Apr 25 22:27:46 UTC 2017


Hi Simon,

> On 25 Apr 2017, at 22:31, Simon Glass <sjg at chromium.org> wrote:
> 
> Hi Tom,
> 
> On 25 April 2017 at 11:12, Tom Rini <trini at konsulko.com> wrote:
>> 
>> On Sat, Apr 22, 2017 at 05:53:36PM -0600, Simon Glass wrote:
>>> +Tom
>>> 
>>> On 19 April 2017 at 07:24, Philipp Tomsich
>>> <philipp.tomsich at theobroma-systems.com> wrote:
>>>> 
>>>> This change encodes the CC list to UTF-8 to avoid failures on
>>>> maintainer-addresses that include non-ASCII characters (observed on
>>>> Debian 7.11 with Python 2.7.3).
>>>> 
>>>> Without this, I get the following failure:
>>>>  Traceback (most recent call last):
>>>>    File "tools/patman/patman", line 159, in <module>
>>>>      options.add_maintainers)
>>>>    File "[snip]/u-boot/tools/patman/series.py", line 234, in MakeCcFile
>>>>      print(commit.patch, ', '.join(set(list)), file=fd)
>>>>  UnicodeEncodeError: 'ascii' codec can't encode character u'\xfc' in position 81: ordinal not in range(128)
>>>> from Heiko's email address:
>>>>  [..., u'"Heiko St\xfcbner" <heiko at sntech.de>', ...]
>>>> 
>>>> While with this change added this encodes to:
>>>>  "=?UTF-8?q?Heiko=20St=C3=BCbner?= <heiko at sntech.de>"
>>>> 
>>>> Signed-off-by: Philipp Tomsich <philipp.tomsich at theobroma-systems.com>
>>>> ---
>>>> 
>>>> tools/patman/series.py | 4 ++--
>>>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>> 
>>> Reviewed-by: Simon Glass <sjg at chromium.org>
>> 
>> Please put this in a PR for me, along with any other critical fixes to
>> the various python tools we have, thanks!
>> 
>> And also, do we need to perhaps whack something at a higher level, and
>> more consistently, about unicode?  This is, I gather, doing UTF-8 right.
>> In buildman we have a few patches to just translate to latin-1 instead.
>> We should do the same thing I think, and perhaps there's a higher level
>> up in the code where we need to do it too?  I don't know..
> 
> Actually I don't think we are quite there yet. This really needs a
> test with all the different places strings can come from, to make sure
> patman does the right thing.

On the topic of ‘different places strings can come from’, here’s another
change from my WIP tree that fixes some other UTF-8 issues in patman
and may point you towards another trouble spot:

@@ -229,14 +229,16 @@ class Series(dict):
                                            raise_on_error=raise_on_error)
             if add_maintainers:
                 list += get_maintainer.GetMaintainer(commit.patch)
+            list = [s.encode('utf-8') for s in list]
             all_ccs += list
-            print(commit.patch, ', '.join(set(list)).encode('utf-8'), file=fd)
+            print(commit.patch, ', '.join(set(list)), file=fd)
             self._generated_cc[commit.patch] = list
 
         if cover_fname:
             cover_cc = gitutil.BuildEmailList(self.get('cover_cc', ''))
-            cc_list = ', '.join([x.decode('utf-8') for x in set(cover_cc + all_ccs)])
-            print(cover_fname, cc_list.encode('utf-8'), file=fd)
+            cover_cc = [s.encode('utf-8') for s in cover_cc]
+            cc_list = ', '.join([x for x in set(cover_cc + all_ccs)])
+            print(cover_fname, cc_list, file=fd)
 
         fd.close()
         return fname
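
To make the intent of the hunk above a bit more concrete, here is a rough standalone sketch (not the patman code itself): every entry is normalised to UTF-8 bytes first, and only then joined, so text and byte strings never get mixed in a single join. The helper names to_utf8() and join_cc() are made up for this illustration, byte strings are assumed to be UTF-8 already, and the sorted() call is only there to make the output order stable.

```python
from __future__ import print_function


def to_utf8(value):
    """Return value as UTF-8 bytes, whether it arrives as text or bytes.

    Hypothetical helper for illustration; not part of patman.
    """
    if isinstance(value, bytes):
        # Assumption: byte strings handed to us are already UTF-8.
        return value
    return value.encode('utf-8')


def join_cc(addresses):
    """Join addresses into one comma-separated UTF-8 byte string."""
    # Encode each entry first, then de-duplicate and join, so that the
    # join never has to convert between text and bytes implicitly.
    encoded = sorted(set(to_utf8(a) for a in addresses))
    return b', '.join(encoded)


# One text entry with a non-ASCII character, one plain byte string.
cc = [u'"Heiko St\xfcbner" <heiko at sntech.de>',
      b'Simon Glass <sjg at chromium.org>']

# The ASCII codec cannot represent u'\xfc', which is the failure mode
# reported in the original traceback.
try:
    cc[0].encode('ascii')
except UnicodeEncodeError as err:
    print('ascii codec fails:', err.reason)

print(join_cc(cc))
```

The point is only that the encode step happens once, per entry, at a known place, rather than on the already-joined string.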


Regards,
Philipp.

