[U-Boot] [PATCH 8/8] x86: quark: Optimize MRC execution time

Bin Meng bmeng.cn at gmail.com
Mon Aug 31 16:04:01 CEST 2015


Hi Simon,

On Mon, Aug 31, 2015 at 9:54 PM, Simon Glass <sjg at chromium.org> wrote:
> Hi Bin,
>
> On 31 August 2015 at 07:43, Bin Meng <bmeng.cn at gmail.com> wrote:
>> Hi Simon,
>>
>> On Mon, Aug 31, 2015 at 9:20 PM, Simon Glass <sjg at chromium.org> wrote:
>>> Hi Bin,
>>>
>>> On 31 August 2015 at 03:52, Bin Meng <bmeng.cn at gmail.com> wrote:
>>>> Boot time performance degradation is observed with the conversion
>>>> to use dm pci. Intel Quark SoC has a low end x86 processor with
>>>> only 400MHz frequency and the most time consuming part is with MRC.
>>>> Each MRC register programming requires indirect access via pci bus.
>>>> With dm pci, accessing pci configuration space has some overhead.
>>>> Unfortunately this per-access overhead accumulates over the whole
>>>> MRC process and roughly doubles the boot time (25 seconds, versus
>>>> 12 seconds before).
>>>>
>>>> To speed up the boot, create an optimized version of pci config
>>>> read/write routines without bothering to go through driver model.
>>>> Now it only takes about 3 seconds to finish MRC, which is really
>>>> fast (8 times faster than dm pci, or 4 times faster than before).
>>>>
>>>> Signed-off-by: Bin Meng <bmeng.cn at gmail.com>
>>>> ---
>>>>
>>>>  arch/x86/cpu/quark/msg_port.c | 59 +++++++++++++++++++++++++++----------------
>>>>  1 file changed, 37 insertions(+), 22 deletions(-)
>>>
>>> Before I delve into the patch - with driver model we are using the I/O
>>> method - see pci_x86_read_config(). Is that the source of the slowdown
>>> or is it just general driver model overhead?
>>
>> The MRC calls APIs in arch/x86/cpu/quark/msg_port.c to program DDR
>> controller. Inside msg_port.c, pci_write_config_dword() and
>> pci_read_config_dword() are called.
>>
>> With driver model, the overhead is:
>>
>> pci_write_config_dword() -> pci_write_config32() -> pci_write_config()
>> -> uclass_get_device_by_seq() then pci_bus_write_config() will finally
>> call pci_x86_write_config().
>>
>> Without driver model, there is still some overhead (so previously the
>> MRC time was about 12 seconds)
>>
>> pci_write_config_dword() -> pci_hose_write_config_dword() ->
>> TYPE1_PCI_OP(write, dword, u32, outl, 0)
>>
>> With my optimized version, pci_write_config_dword() directly performs
>> a hardcoded dword-sized pci config access, without needing to compute
>> an offset and mask or to dereference hose->cfg_addr/hose->cfg_data.
>
> What about if we use dm_pci_read_config32()? We should try to move PCI
> access to driver model to avoid the uclass_get_device_by_seq()
> everywhere.

I don't think that helps. dm_pci_read_config32() requires a dm driver.
MRC is just something that programs a bunch of registers with pci
config read/write calls.
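
All MRC needs is a raw dword access through the type 1 configuration
mechanism (ports 0xcf8/0xcfc), which can be done without any device
instance. A minimal sketch of the idea, not the actual patch:
cfg_addr(), qrk_cfg_write32() and qrk_cfg_read32() are hypothetical
names, and the sim_*() helpers stand in for outl()/inl() so that the
sketch can be built and run on a host:

```c
#include <assert.h>
#include <stdint.h>

/* Standard x86 type 1 configuration mechanism ports */
#define PCI_CFG_ADDR	0xcf8
#define PCI_CFG_DATA	0xcfc
#define PCI_CFG_EN	0x80000000u

/*
 * Host-side stand-ins for the port I/O instructions. They model only
 * the config space of device 0:0.0. On the target these would be the
 * real outl()/inl().
 */
static uint32_t sim_addr;	/* last value written to 0xcf8 */
static uint32_t sim_cfg[64];	/* 256 bytes of config space */

static void sim_outl(uint32_t val, uint16_t port)
{
	if (port == PCI_CFG_ADDR)
		sim_addr = val;
	else if (port == PCI_CFG_DATA && (sim_addr & ~0xfcu) == PCI_CFG_EN)
		sim_cfg[(sim_addr & 0xfc) >> 2] = val;
}

static uint32_t sim_inl(uint16_t port)
{
	if (port == PCI_CFG_DATA && (sim_addr & ~0xfcu) == PCI_CFG_EN)
		return sim_cfg[(sim_addr & 0xfc) >> 2];
	return 0xffffffffu;	/* nothing else is modelled */
}

/* Encode bus/device/function and a dword-aligned register offset */
uint32_t cfg_addr(unsigned bus, unsigned dev, unsigned fn, unsigned offset)
{
	assert((offset & 3) == 0 && offset < 256);
	return PCI_CFG_EN | (bus << 16) | (dev << 11) | (fn << 8) | offset;
}

/* Hardcoded dword write: no size switch, no mask, no hose lookup */
void qrk_cfg_write32(uint32_t addr, uint32_t val)
{
	sim_outl(addr, PCI_CFG_ADDR);
	sim_outl(val, PCI_CFG_DATA);
}

/* Hardcoded dword read, same idea as the write above */
uint32_t qrk_cfg_read32(uint32_t addr)
{
	sim_outl(addr, PCI_CFG_ADDR);
	return sim_inl(PCI_CFG_DATA);
}
```

Each access is then two port operations, so nothing has to be looked
up or probed per register write.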

>
>>
>>>
>>> If the former then perhaps we should change this. If the latter then
>>> we have work to do...
>>>

Also, I just realized that this patch not only helps to reduce the
boot time, but also helps to support the PCIe root ports in future
patches.

According to the Quark SoC manual, something needs to be done to get
the two PCIe root ports on the Quark SoC to work. For the PCIe root
ports to show up in the PCI configuration space, we need to program
some registers on the message bus (using the APIs in
arch/x86/cpu/quark/msg_port.c). With driver model, if we still call
pci_write_config_dword() in msg_port.c, we end up triggering the PCI
enumeration process before we actually write anything to the
configuration space. However, the enumeration process hangs when it
scans the PCIe root ports if the message bus registers have not been
properly initialized. This is a chicken-and-egg problem.
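
To make the ordering concrete: a message bus write boils down to a few
config dword writes to the host bridge at 0:0.0, so with a direct
accessor it can run before any enumeration takes place. A rough sketch
only. The register offsets, byte-enable value and write opcode below
are assumptions written from memory and should be checked against
msg_port.h and the Quark datasheet, msg_port_write_sketch() and
hb_write32() are hypothetical names, and the port I/O is replaced by a
log so that the sketch can be run on a host:

```c
#include <assert.h>
#include <stdint.h>

#define PCI_CFG_ADDR	0xcf8
#define PCI_CFG_DATA	0xcfc
#define HOST_BRIDGE	0x80000000u	/* enable bit, BDF 0:0.0 */

/* Assumed values, to be verified against msg_port.h */
#define MSG_CTRL_REG		0xd0
#define MSG_DATA_REG		0xd4
#define MSG_CTRL_EXT_REG	0xd8
#define MSG_BYTE_ENABLE		0xf0
#define MSG_OP_WRITE		0x11

/* Host-side stand-in for outl(): logs config writes, does no I/O */
struct cfg_write {
	uint32_t addr;
	uint32_t val;
};

struct cfg_write cfg_log[8];
unsigned cfg_log_len;
static uint32_t latched_addr;

static void sim_outl(uint32_t val, uint16_t port)
{
	if (port == PCI_CFG_ADDR) {
		latched_addr = val;
	} else if (port == PCI_CFG_DATA) {
		assert(cfg_log_len < 8);
		cfg_log[cfg_log_len].addr = latched_addr;
		cfg_log[cfg_log_len].val = val;
		cfg_log_len++;
	}
}

/* Direct dword write to the host bridge, no driver model involved */
static void hb_write32(unsigned offset, uint32_t val)
{
	sim_outl(HOST_BRIDGE | offset, PCI_CFG_ADDR);
	sim_outl(val, PCI_CFG_DATA);
}

/* Data first, then the upper register bits, then the control word */
void msg_port_write_sketch(uint8_t port, uint32_t reg, uint32_t value)
{
	hb_write32(MSG_DATA_REG, value);
	hb_write32(MSG_CTRL_EXT_REG, reg & 0xffffff00);
	hb_write32(MSG_CTRL_REG, ((uint32_t)MSG_OP_WRITE << 24) |
		   ((uint32_t)port << 16) | ((reg << 8) & 0xff00) |
		   MSG_BYTE_ENABLE);
}
```

Nothing in this path touches a uclass or a bus scan, so it does not
depend on the PCIe root ports already being visible.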

Regards,
Bin

