[PATCH 1/1] doc: remove illegal characters

Pali Rohár pali at kernel.org
Tue Jan 26 00:41:41 CET 2021


On Tuesday 26 January 2021 00:33:29 Heinrich Schuchardt wrote:
> On 1/25/21 11:13 PM, Pali Rohár wrote:
> > Hello Heinrich!
> > 
> > Does this change mean that UTF-8 is now disallowed in U-Boot?
> > 
> > On Monday 25 January 2021 22:48:56 Heinrich Schuchardt wrote:
> > > Avoid errors when generating the HTML documentation like:
> > > 
> > >      'ascii' codec can't decode byte 0xc2
> > 
> > This sounds like an incorrect configuration of the tool which is
> > generating the HTML documentation.
> 
> Sphinx uses as default:
> 
> source_encoding = 'utf-8-sig'

IIRC, utf-8-sig is the UTF-8 encoding with a leading BOM.
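
A minimal sketch of how Python 3's built-in utf-8-sig codec behaves may
help here. The sample string is hypothetical (it just contains U+00A0,
which encodes to the bytes 0xC2 0xA0), and this only shows the codec
itself; how Sphinx 2 on Gitlab actually opens the files is not verified.

```python
import codecs

# Hypothetical sample text; U+00A0 (no-break space) is 0xC2 0xA0 in UTF-8.
sample = "U-Boot\u00a0docs"

plain = sample.encode("utf-8")         # no BOM
with_bom = sample.encode("utf-8-sig")  # BOM (EF BB BF) prepended

# When decoding, utf-8-sig strips a leading BOM if there is one and
# otherwise behaves like plain utf-8, so both inputs decode the same.
decoded_plain = plain.decode("utf-8-sig")
decoded_bom = with_bom.decode("utf-8-sig")

# Decoding the same bytes as ASCII fails on the first non-ASCII byte.
try:
    plain.decode("ascii")
    ascii_error = None
except UnicodeDecodeError as exc:
    ascii_error = str(exc)

print(with_bom[:3] == codecs.BOM_UTF8)
print(decoded_plain == decoded_bom == sample)
print(ascii_error)
```

If this holds for the Sphinx setup too, the 'ascii' codec message would
point at something other than a missing BOM, but that is an assumption.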

> With Sphinx 3 on my machine I have no problem. But with Sphinx 2 on
> Gitlab an error is raised for Unicode letters.
> 
> I do not know if older Sphinx versions require a BOM to mark a file as
> UTF-8 when using utf-8-sig.

According to the Unicode standard, a BOM is not required in UTF-8 and,
moreover, its use is not recommended.

So if some tool requires a BOM in UTF-8, then it is not compliant with
the Unicode standard, and I do not think it is a good idea to stop using
Unicode just because some tool cannot process UTF-8... But this is just
my opinion, and I do not know how Sphinx 2 works or how it is configured.

Maybe... can you check whether it is possible to reconfigure Sphinx 2 to
use UTF-8 without a BOM? Isn't there simply the option
source_encoding = 'utf-8'?
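
For illustration, such a setting would go into the Sphinx conf.py of the
documentation. This is an untested sketch: source_encoding is a
documented Sphinx configuration value, but whether changing it makes the
error on Sphinx 2 go away is only an assumption.

```python
# Sphinx conf.py fragment (sketch, not tested with Sphinx 2).
# Read the documentation sources as UTF-8 without expecting a BOM,
# instead of relying on the default 'utf-8-sig'.
source_encoding = 'utf-8'
```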

