[U-Boot] [PATCH 8/8] x86: quark: Optimize MRC execution time

Simon Glass sjg at chromium.org
Mon Aug 31 15:54:46 CEST 2015


Hi Bin,

On 31 August 2015 at 07:43, Bin Meng <bmeng.cn at gmail.com> wrote:
> Hi Simon,
>
> On Mon, Aug 31, 2015 at 9:20 PM, Simon Glass <sjg at chromium.org> wrote:
>> Hi Bin,
>>
>> On 31 August 2015 at 03:52, Bin Meng <bmeng.cn at gmail.com> wrote:
>>> Boot time performance degradation is observed with the conversion
>>> to use dm pci. Intel Quark SoC has a low end x86 processor with
>>> only 400MHz frequency and the most time consuming part is with MRC.
>>> Each MRC register programming requires indirect access via pci bus.
>>> With dm pci, accessing pci configuration space has some overhead.
>>> Unfortunately this per-access overhead accumulates over the whole
>>> MRC process and roughly doubles the boot time (25 seconds versus
>>> 12 seconds before).
>>>
>>> To speed up the boot, create an optimized version of pci config
>>> read/write routines without bothering to go through driver model.
>>> Now it only takes about 3 seconds to finish MRC, which is really
>>> fast (8 times faster than dm pci, or 4 times faster than before).
>>>
>>> Signed-off-by: Bin Meng <bmeng.cn at gmail.com>
>>> ---
>>>
>>>  arch/x86/cpu/quark/msg_port.c | 59 +++++++++++++++++++++++++++----------------
>>>  1 file changed, 37 insertions(+), 22 deletions(-)
>>
>> Before I delve into the patch - with driver model we are using the I/O
>> method - see pci_x86_read_config(). Is that the source of the slowdown
>> or is it just general driver model overhead?
>
> The MRC calls APIs in arch/x86/cpu/quark/msg_port.c to program DDR
> controller. Inside msg_port.c, pci_write_config_dword() and
> pci_read_config_dword() are called.
>
> With driver model, the overhead is:
>
> pci_write_config_dword() -> pci_write_config32() -> pci_write_config()
> -> uclass_get_device_by_seq() then pci_bus_write_config() will finally
> call pci_x86_write_config().
>
> Without driver model, there is still some overhead (so previously the
> MRC time was about 12 seconds)
>
> pci_write_config_dword() -> pci_hose_write_config_dword() ->
> TYPE1_PCI_OP(write, dword, u32, outl, 0)
>
> With my optimized version, pci_write_config_dword() directly calls a
> hardcoded dword-size pci config access, without needing to consider
> offset and mask or to dereference hose->cfg_addr/hose->cfg_data.

What about if we use dm_pci_read_config32()? We should try to move PCI
access to driver model to avoid the uclass_get_device_by_seq()
everywhere.
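
For reference, a hardcoded dword-size config access of the kind
described above can be sketched roughly as follows. This is only an
illustration of PCI configuration mechanism #1 (address port 0xcf8,
data port 0xcfc), not the code from the patch. The fake_outl() and
fake_inl() helpers and the fast_cfg_*() names are hypothetical
stand-ins so that the sketch runs in user space; real code would use
outl()/inl() on the actual ports.

```c
#include <stdint.h>

/* Standard x86 ports for PCI configuration mechanism #1 */
#define PCI_CFG_ADDR_PORT	0xcf8
#define PCI_CFG_DATA_PORT	0xcfc
#define PCI_CFG_EN		0x80000000u

/*
 * Hypothetical stand-ins for outl()/inl(): they model the address
 * latch and 256 bytes of config space for a single function, so the
 * example does not need I/O port privileges.
 */
static uint32_t fake_addr_latch;
static uint32_t fake_cfg_space[64];

static void fake_outl(uint32_t val, uint16_t port)
{
	if (port == PCI_CFG_ADDR_PORT)
		fake_addr_latch = val;
	else if (port == PCI_CFG_DATA_PORT && (fake_addr_latch & PCI_CFG_EN))
		fake_cfg_space[(fake_addr_latch & 0xfc) >> 2] = val;
}

static uint32_t fake_inl(uint16_t port)
{
	if (port == PCI_CFG_ADDR_PORT)
		return fake_addr_latch;
	if (port == PCI_CFG_DATA_PORT && (fake_addr_latch & PCI_CFG_EN))
		return fake_cfg_space[(fake_addr_latch & 0xfc) >> 2];
	return 0xffffffffu;
}

/* Build the value written to the address port (dword-aligned offset) */
static uint32_t cfg_addr(unsigned int bus, unsigned int dev,
			 unsigned int func, unsigned int offset)
{
	return PCI_CFG_EN | ((bus & 0xff) << 16) | ((dev & 0x1f) << 11) |
	       ((func & 0x7) << 8) | (offset & 0xfc);
}

/* Dword-only write: no size switch, no byte mask, no hose lookup */
static void fast_cfg_write32(unsigned int bus, unsigned int dev,
			     unsigned int func, unsigned int offset,
			     uint32_t value)
{
	fake_outl(cfg_addr(bus, dev, func, offset), PCI_CFG_ADDR_PORT);
	fake_outl(value, PCI_CFG_DATA_PORT);
}

/* Dword-only read, mirroring the write path */
static uint32_t fast_cfg_read32(unsigned int bus, unsigned int dev,
				unsigned int func, unsigned int offset)
{
	fake_outl(cfg_addr(bus, dev, func, offset), PCI_CFG_ADDR_PORT);
	return fake_inl(PCI_CFG_DATA_PORT);
}
```

The point of the sketch is that each access reduces to two port
operations, with the address computed inline rather than through a
device lookup.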

>
>>
>> If the former then perhaps we should change this. If the latter then
>> we have work to do...
>>

Regards,
Simon

