[PATCH 1/1] doc: remove illegal characters

Heinrich Schuchardt xypron.glpk at gmx.de
Tue Jan 26 14:05:22 CET 2021


On 26.01.21 00:41, Pali Rohár wrote:
> On Tuesday 26 January 2021 00:33:29 Heinrich Schuchardt wrote:
>> On 1/25/21 11:13 PM, Pali Rohár wrote:
>>> Hello Heinrich!
>>>
>>> Does this change mean that UTF-8 is now disallowed in U-Boot?
>>>
>>> On Monday 25 January 2021 22:48:56 Heinrich Schuchardt wrote:
>>>> Avoid errors when generating the HTML documentation like:
>>>>
>>>>      'ascii' codec can't decode byte 0xc2
>>>
>>> This sounds like an incorrect configuration of the tool which is
>>> generating the HTML documentation.
>>
>> Sphinx uses as default:
>>
>> source_encoding = 'utf-8-sig'
>
> utf-8-sig is IIRC UTF-8 encoding with leading BOM.
>
>> With Sphinx 3 on my machine I have no problem. But with Sphinx 2 on
>> GitLab an error is raised for Unicode letters.
>>
>> I do not know if older Sphinx versions require a BOM to mark a file as
>> UTF-8 when using utf-8-sig.
>
> According to UNICODE standards, it is not required and moreover it is
> not recommended to use BOM in UTF-8.
>
> So if some tool requires a BOM in UTF-8 then it is not compliant with the
> UNICODE standard and I do not think it is a good idea to stop using UNICODE
> just because some tool cannot process UTF-8... But this is just my opinion
> and I do not know how Sphinx 2 works or how it is configured.
>
> Maybe... can you check if it is possible to reconfigure Sphinx 2 to use
> UTF-8 without a BOM? Is there not simply an option source_encoding = 'utf-8'?
>

The problem is caused by doc/sphinx/automarkup.py inherited from Linux.
The initial commit for the module says:

"Rather than fill our text files with :c:func:`function()` syntax, just
do the markup via a hook into the sphinx build process."

I am not able to reproduce the problem locally.

We can deactivate this module. In any case, the module would have to be
adjusted to match U-Boot's file structure.
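
For reference, a minimal Python sketch of the two codec behaviours discussed
above. It is only an illustration and is not taken from automarkup.py or from
Sphinx. The sample text is an assumption: it contains a non-breaking space,
which is one character whose UTF-8 encoding starts with byte 0xc2.

```python
# Illustrative only: the sample text is hypothetical, not from the U-Boot docs.
BOM = b"\xef\xbb\xbf"
sample = "U-Boot\u00a0doc".encode("utf-8")  # encoded form contains byte 0xc2


def decode_ascii_error(data):
    """Return the message raised when decoding data as ASCII, or None."""
    try:
        data.decode("ascii")
    except UnicodeDecodeError as exc:
        return str(exc)
    return None


# On decoding, utf-8-sig accepts input with or without a leading BOM.
without_bom = sample.decode("utf-8-sig")
with_bom = (BOM + sample).decode("utf-8-sig")

# Plain utf-8 does not strip the BOM; it survives as U+FEFF.
plain = (BOM + sample).decode("utf-8")

if __name__ == "__main__":
    print(decode_ascii_error(sample))
    print(without_bom == with_bom)
    print(plain.startswith("\ufeff"))
```

If this matches what happens in the build, utf-8-sig by itself should not
require a BOM, and the 'ascii' message would point to some code path decoding
with the ASCII codec instead of the configured source encoding.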

Best regards

Heinrich

