[PATCH v5 00/13] Add video damage tracking

Alexander Graf agraf at csgraf.de
Wed Aug 30 21:55:00 CEST 2023


On 29.08.23 11:19, Mark Kettenis wrote:
>> Date: Tue, 29 Aug 2023 08:20:49 +0200
>> From: Alexander Graf <agraf at csgraf.de>
>>
>> On 28.08.23 23:54, Heinrich Schuchardt wrote:
>>> On 8/28/23 22:24, Alexander Graf wrote:
>>>> On 28.08.23 19:54, Simon Glass wrote:
>>>>> Hi Alex,
>>>>>
>>>>> On Wed, 23 Aug 2023 at 02:56, Alexander Graf <agraf at csgraf.de> wrote:
>>>>>> Hey Simon,
>>>>>>
>>>>>> On 22.08.23 20:56, Simon Glass wrote:
>>>>>>> Hi Alex,
>>>>>>>
>>>>>>> On Tue, 22 Aug 2023 at 01:47, Alexander Graf <agraf at csgraf.de> wrote:
>>>>>>>> On 22.08.23 01:03, Simon Glass wrote:
>>>>>>>>> Hi Alex,
>>>>>>>>>
>>>>>>>>> On Mon, 21 Aug 2023 at 16:40, Alexander Graf <agraf at csgraf.de>
>>>>>>>>> wrote:
>>>>>>>>>> On 22.08.23 00:10, Simon Glass wrote:
>>>>>>>>>>> Hi Alex,
>>>>>>>>>>>
>>>>>>>>>>> On Mon, 21 Aug 2023 at 14:20, Alexander Graf <agraf at csgraf.de>
>>>>>>>>>>> wrote:
>>>>>>>>>>>> On 21.08.23 21:57, Simon Glass wrote:
>>>>>>>>>>>>> Hi Alex,
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, 21 Aug 2023 at 13:33, Alexander Graf <agraf at csgraf.de>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>> On 21.08.23 21:11, Simon Glass wrote:
>>>>>>>>>>>>>>> Hi Alper,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Mon, 21 Aug 2023 at 07:51, Alper Nebi Yasak
>>>>>>>>>>>>>>> <alpernebiyasak at gmail.com> wrote:
>>>>>>>>>>>>>>>> This is a rebase of Alexander Graf's video damage tracking
>>>>>>>>>>>>>>>> series, with
>>>>>>>>>>>>>>>> some tests and other changes. The original cover letter is
>>>>>>>>>>>>>>>> as follows:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> This patch set speeds up graphics output on ARM by a
>>>>>>>>>>>>>>>>> factor of 60x.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On most ARM SBCs, we keep the frame buffer in DRAM and map
>>>>>>>>>>>>>>>>> it as cached,
>>>>>>>>>>>>>>>>> but need it accessible by the display controller which
>>>>>>>>>>>>>>>>> reads directly
>>>>>>>>>>>>>>>>> from a later point of consistency. Hence, we flush the
>>>>>>>>>>>>>>>>> frame buffer to
>>>>>>>>>>>>>>>>> DRAM on every change. The full frame buffer.
>>>>>>>>>>>>>>> It should not, see below.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Unfortunately, with the advent of 4k displays, we are
>>>>>>>>>>>>>>>>> seeing frame buffers
>>>>>>>>>>>>>>>>> that can take a while to flush out. This was reported by
>>>>>>>>>>>>>>>>> Da Xue with grub,
>>>>>>>>>>>>>>>>> which happily print 1000s of spaces on the screen to draw
>>>>>>>>>>>>>>>>> a menu. Every
>>>>>>>>>>>>>>>>> printed space triggers a cache flush.
>>>>>>>>>>>>>>> That is a bug somewhere in EFI.
>>>>>>>>>>>>>> Unfortunately not :). You may call it a bug in grub: It
>>>>>>>>>>>>>> literally prints
>>>>>>>>>>>>>> over space characters for every character in its menu that it
>>>>>>>>>>>>>> wants
>>>>>>>>>>>>>> cleared. On every text screen draw.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This wouldn't be a big issue if we only flush the reactangle
>>>>>>>>>>>>>> that gets
>>>>>>>>>>>>>> modified. But without this patch set, we're flushing the full
>>>>>>>>>>>>>> DRAM
>>>>>>>>>>>>>> buffer on every u-boot text console character write, which
>>>>>>>>>>>>>> means for
>>>>>>>>>>>>>> every character (as that's the only API UEFI has).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> As a nice side effect, we speed up the normal U-Boot text
>>>>>>>>>>>>>> console as
>>>>>>>>>>>>>> well with this patch set, because even "normal" text prints
>>>>>>>>>>>>>> that write
>>>>>>>>>>>>>> for example a single line of text on the screen today flush
>>>>>>>>>>>>>> the full
>>>>>>>>>>>>>> frame buffer to DRAM.
>>>>>>>>>>>>> No, I mean that it is a bug that U-Boot (apparently) flushes
>>>>>>>>>>>>> the cache
>>>>>>>>>>>>> after every character. It doesn't do that for normal character
>>>>>>>>>>>>> output
>>>>>>>>>>>>> and I don't think it makes sense to do it for EFI either.
>>>>>>>>>>>> I see. Let's trace the calls:
>>>>>>>>>>>>
>>>>>>>>>>>> efi_cout_output_string()
>>>>>>>>>>>> -> fputs()
>>>>>>>>>>>> -> vidconsole_puts()
>>>>>>>>>>>> -> video_sync()
>>>>>>>>>>>> -> flush_dcache_range()
>>>>>>>>>>>>
>>>>>>>>>>>> Unfortunately grub abstracts character backends down to the
>>>>>>>>>>>> "print
>>>>>>>>>>>> character" level, so it calls UEFI's sopisticated
>>>>>>>>>>>> "output_string"
>>>>>>>>>>>> callback with single characters at a time, which means we do a
>>>>>>>>>>>> full
>>>>>>>>>>>> dcache flush for every character that we print:
>>>>>>>>>>>>
>>>>>>>>>>>> https://git.savannah.gnu.org/cgit/grub.git/tree/grub-core/term/efi/console.c#n165
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> This patch set implements the easiest mitigation against
>>>>>>>>>>>>>>>>> this problem:
>>>>>>>>>>>>>>>>> Damage tracking. We remember the lowest common denominator
>>>>>>>>>>>>>>>>> region that was
>>>>>>>>>>>>>>>>> touched since the last video_sync() call and only flush
>>>>>>>>>>>>>>>>> that. The most
>>>>>>>>>>>>>>>>> typical writer to the frame buffer is the video console,
>>>>>>>>>>>>>>>>> which always
>>>>>>>>>>>>>>>>> writes rectangles of characters on the screen and syncs
>>>>>>>>>>>>>>>>> afterwards.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> With this patch set applied, we reduce drawing a large
>>>>>>>>>>>>>>>>> grub menu (with
>>>>>>>>>>>>>>>>> serial console attached for size information) on an
>>>>>>>>>>>>>>>>> RK3399-ROC system
>>>>>>>>>>>>>>>>> at 1440p from 55 seconds to less than 1 second.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Version 2 also implements VIDEO_COPY using this mechanism,
>>>>>>>>>>>>>>>>> reducing its
>>>>>>>>>>>>>>>>> overhead compared to before as well. So even x86 systems
>>>>>>>>>>>>>>>>> should be faster
>>>>>>>>>>>>>>>>> with this now :).
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Alternatives considered:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>          1) Lazy sync - Sandbox does this. It only calls
>>>>>>>>>>>>>>>>> video_sync(true) ever
>>>>>>>>>>>>>>>>>             so often. We are missing timers to do this
>>>>>>>>>>>>>>>>> generically.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>          2) Double buffering - We could try to identify
>>>>>>>>>>>>>>>>> whether anything changed
>>>>>>>>>>>>>>>>>             at all and only draw to the FB if it did. That
>>>>>>>>>>>>>>>>> would require
>>>>>>>>>>>>>>>>>             maintaining a second buffer that we need to
>>>>>>>>>>>>>>>>> scan.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>          3) Text buffer - Maintain a buffer of all text
>>>>>>>>>>>>>>>>> printed on the screen with
>>>>>>>>>>>>>>>>>             respective location. Don't write if the old and
>>>>>>>>>>>>>>>>> new character are
>>>>>>>>>>>>>>>>>             identical. This would limit applicability to
>>>>>>>>>>>>>>>>> text only and is an
>>>>>>>>>>>>>>>>>             optimization on top of this patch set.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>          4) Hash screen lines - Create a hash (sha256?)
>>>>>>>>>>>>>>>>> over every line when it
>>>>>>>>>>>>>>>>>             changes. Only flush when it does. I'm not sure
>>>>>>>>>>>>>>>>> if this would waste
>>>>>>>>>>>>>>>>>             more time, memory and cache than the current
>>>>>>>>>>>>>>>>> approach. It would make
>>>>>>>>>>>>>>>>>             full screen updates much more expensive.
>>>>>>>>>>>>>>> 5) Fix the bug mentioned above?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Changes in v5:
>>>>>>>>>>>>>>>> - Add patch "video: test: Split copy frame buffer check
>>>>>>>>>>>>>>>> into a function"
>>>>>>>>>>>>>>>> - Add patch "video: test: Support checking copy frame
>>>>>>>>>>>>>>>> buffer contents"
>>>>>>>>>>>>>>>> - Add patch "video: test: Test partial updates of hardware
>>>>>>>>>>>>>>>> frame buffer"
>>>>>>>>>>>>>>>> - Use xstart, ystart, xend, yend as names for damage region
>>>>>>>>>>>>>>>> - Document damage struct and fields in struct video_priv
>>>>>>>>>>>>>>>> comment
>>>>>>>>>>>>>>>> - Return void from video_damage()
>>>>>>>>>>>>>>>> - Fix undeclared priv error in video_sync()
>>>>>>>>>>>>>>>> - Drop unused headers from video-uclass.c
>>>>>>>>>>>>>>>> - Use IS_ENABLED() instead of CONFIG_IS_ENABLED()
>>>>>>>>>>>>>>>> - Call video_damage() also in video_fill_part()
>>>>>>>>>>>>>>>> - Use met->baseline instead of priv->baseline
>>>>>>>>>>>>>>>> - Use fontdata->height/width instead of
>>>>>>>>>>>>>>>> VIDEO_FONT_HEIGHT/WIDTH
>>>>>>>>>>>>>>>> - Update console_rotate.c video_damage() calls to pass
>>>>>>>>>>>>>>>> video tests
>>>>>>>>>>>>>>>> - Remove mention about not having minimal damage for
>>>>>>>>>>>>>>>> console_rotate.c
>>>>>>>>>>>>>>>> - Add patch "video: test: Test video damage tracking via
>>>>>>>>>>>>>>>> vidconsole"
>>>>>>>>>>>>>>>> - Document new vdev field in struct efi_gop_obj comment
>>>>>>>>>>>>>>>> - Remove video_sync_copy() also from video_fill(),
>>>>>>>>>>>>>>>> video_fill_part()
>>>>>>>>>>>>>>>> - Fix memmove() calls by removing the extra dev argument
>>>>>>>>>>>>>>>> - Call video_sync() before checking copy_fb in video tests
>>>>>>>>>>>>>>>> - Imply VIDEO_DAMAGE for video drivers instead of
>>>>>>>>>>>>>>>> selecting it
>>>>>>>>>>>>>>>> - Imply VIDEO_DAMAGE also for VIDEO_TIDSS
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> v4:
>>>>>>>>>>>>>>>> https://lore.kernel.org/all/20230103215004.22646-1-agraf@csgraf.de/
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Changes in v4:
>>>>>>>>>>>>>>>> - Move damage clear to patch "dm: video: Add damage
>>>>>>>>>>>>>>>> tracking API"
>>>>>>>>>>>>>>>> - Simplify first damage logic
>>>>>>>>>>>>>>>> - Remove VIDEO_DAMAGE default for ARM
>>>>>>>>>>>>>>>> - Skip damage on EfiBltVideoToBltBuffer
>>>>>>>>>>>>>>>> - Add patch "video: Always compile cache flushing code"
>>>>>>>>>>>>>>>> - Add patch "video: Enable VIDEO_DAMAGE for drivers that
>>>>>>>>>>>>>>>> need it"
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> v3:
>>>>>>>>>>>>>>>> https://lore.kernel.org/all/20221230195828.88134-1-agraf@csgraf.de/
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Changes in v3:
>>>>>>>>>>>>>>>> - Adapt to always assume DM is used
>>>>>>>>>>>>>>>> - Adapt to always assume DM is used
>>>>>>>>>>>>>>>> - Make VIDEO_COPY always select VIDEO_DAMAGE
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> v2:
>>>>>>>>>>>>>>>> https://lore.kernel.org/all/20220609225921.62462-1-agraf@csgraf.de/
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Changes in v2:
>>>>>>>>>>>>>>>> - Remove ifdefs
>>>>>>>>>>>>>>>> - Fix ranges in truetype target
>>>>>>>>>>>>>>>> - Limit rotate to necessary damage
>>>>>>>>>>>>>>>> - Remove ifdefs from gop
>>>>>>>>>>>>>>>> - Fix dcache range; we were flushing too much before
>>>>>>>>>>>>>>>> - Add patch "video: Use VIDEO_DAMAGE for VIDEO_COPY"
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> v1:
>>>>>>>>>>>>>>>> https://lore.kernel.org/all/20220606234336.5021-1-agraf@csgraf.de/
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Alexander Graf (9):
>>>>>>>>>>>>>>>>          dm: video: Add damage tracking API
>>>>>>>>>>>>>>>>          dm: video: Add damage notification on display fills
>>>>>>>>>>>>>>>>          vidconsole: Add damage notifications to all
>>>>>>>>>>>>>>>> vidconsole drivers
>>>>>>>>>>>>>>>>          video: Add damage notification on bmp display
>>>>>>>>>>>>>>>>          efi_loader: GOP: Add damage notification on BLT
>>>>>>>>>>>>>>>>          video: Only dcache flush damaged lines
>>>>>>>>>>>>>>>>          video: Use VIDEO_DAMAGE for VIDEO_COPY
>>>>>>>>>>>>>>>>          video: Always compile cache flushing code
>>>>>>>>>>>>>>>>          video: Enable VIDEO_DAMAGE for drivers that need it
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Alper Nebi Yasak (4):
>>>>>>>>>>>>>>>>          video: test: Split copy frame buffer check into a
>>>>>>>>>>>>>>>> function
>>>>>>>>>>>>>>>>          video: test: Support checking copy frame buffer
>>>>>>>>>>>>>>>> contents
>>>>>>>>>>>>>>>>          video: test: Test partial updates of hardware frame
>>>>>>>>>>>>>>>> buffer
>>>>>>>>>>>>>>>>          video: test: Test video damage tracking via
>>>>>>>>>>>>>>>> vidconsole
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>         arch/arm/mach-omap2/omap3/Kconfig |   1 +
>>>>>>>>>>>>>>>>         arch/arm/mach-sunxi/Kconfig |   1 +
>>>>>>>>>>>>>>>>         drivers/video/Kconfig |  26 +++
>>>>>>>>>>>>>>>>         drivers/video/console_normal.c |  27 ++--
>>>>>>>>>>>>>>>>         drivers/video/console_rotate.c |  94 +++++++----
>>>>>>>>>>>>>>>>         drivers/video/console_truetype.c |  37 +++--
>>>>>>>>>>>>>>>>         drivers/video/exynos/Kconfig |   1 +
>>>>>>>>>>>>>>>>         drivers/video/imx/Kconfig |   1 +
>>>>>>>>>>>>>>>>         drivers/video/meson/Kconfig |   1 +
>>>>>>>>>>>>>>>>         drivers/video/rockchip/Kconfig |   1 +
>>>>>>>>>>>>>>>>         drivers/video/stm32/Kconfig |   1 +
>>>>>>>>>>>>>>>>         drivers/video/tegra20/Kconfig |   1 +
>>>>>>>>>>>>>>>>         drivers/video/tidss/Kconfig |   1 +
>>>>>>>>>>>>>>>>         drivers/video/vidconsole-uclass.c |  16 --
>>>>>>>>>>>>>>>>         drivers/video/video-uclass.c | 190
>>>>>>>>>>>>>>>> ++++++++++++----------
>>>>>>>>>>>>>>>>         drivers/video/video_bmp.c |   7 +-
>>>>>>>>>>>>>>>>         include/video.h |  59 +++----
>>>>>>>>>>>>>>>>         include/video_console.h |  52 ------
>>>>>>>>>>>>>>>>         lib/efi_loader/efi_gop.c |   7 +
>>>>>>>>>>>>>>>>         test/dm/video.c | 256
>>>>>>>>>>>>>>>> ++++++++++++++++++++++++------
>>>>>>>>>>>>>>>>         20 files changed, 483 insertions(+), 297 deletions(-)
>>>>>>>>>>>>>>> It is good to see this tidied up into something that can be
>>>>>>>>>>>>>>> applied!
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I am unsure what is going on with the EFI performance,
>>>>>>>>>>>>>>> though. It
>>>>>>>>>>>>>>> should not flush the cache after every character, only after
>>>>>>>>>>>>>>> a new
>>>>>>>>>>>>>>> line. Is there something wrong in here? If so, we should fix
>>>>>>>>>>>>>>> that bug
>>>>>>>>>>>>>>> first and it should be patch 1 of this series.
>>>>>>>>>>>>>> Before I came up with this series, I was trying to identify
>>>>>>>>>>>>>> the UEFI bug
>>>>>>>>>>>>>> in question as well, because intuition told me surely this is
>>>>>>>>>>>>>> a bug in
>>>>>>>>>>>>>> UEFI :). Turns out it really isn't this time around.
>>>>>>>>>>>>> I don't mean a bug in UEFI, I mean a bug in U-Boot's EFI
>>>>>>>>>>>>> implementation. Where did you look for the bug?
>>>>>>>>>>>> The "real" bug is in grub. But given that it's reasonably
>>>>>>>>>>>> simple to work
>>>>>>>>>>>> around in U-Boot and even with it "fixed" in grub we would
>>>>>>>>>>>> still see
>>>>>>>>>>>> performance benefits from flushing only parts of the screen, I
>>>>>>>>>>>> think
>>>>>>>>>>>> it's worth living with the grub deficiency.
>>>>>>>>>>> OK thanks for digging into it. I suggest we add a param to
>>>>>>>>>>> vidconsole_puts() to tell it whether to sync or not, then the
>>>>>>>>>>> EFI code
>>>>>>>>>>> can indicate this and try to be a bit smarter about it.
>>>>>>>>>> It doesn't know when to sync either. From its point of view, any
>>>>>>>>>> "console output" could be the last one. There is no API in UEFI
>>>>>>>>>> that
>>>>>>>>>> says "please flush console output now".
>>>>>>>>> Yes, I understand. I was not suggesting we were missing an API. But
>>>>>>>>> some sort of heuristic would do, e.g. only flush on a newline,
>>>>>>>>> flush
>>>>>>>>> every 50 chars, etc.
>>>>>>>> I can't think of any heuristic that would reliably work. Relevant
>>>>>>>> for
>>>>>>>> this conversation, UEFI provides 2 calls:
>>>>>>>>
>>>>>>>>       * Write string to screen (efi_cout_output_string)
>>>>>>>>       * Set text cursor position to X, Y
>>>>>>>> (efi_cout_set_cursor_position)
>>>>>>>>
>>>>>>>> It's perfectly legal for a UEFI application to do something like
>>>>>>>>
>>>>>>>> efi_cout_set_cursor_position(10, 10);
>>>>>>>> efi_cout_output_string("f");
>>>>>>>> efi_cout_output_string("o");
>>>>>>>> efi_cout_output_string("o") ;
>>>>>>>>
>>>>>>>> to update contents of a virtual text box on the screen. Where in
>>>>>>>> this
>>>>>>>> chain of events would we call video_sync(), but on every call to
>>>>>>>> efi_cout_output_string()?
>>>>>>> Actually U-Boot has the same problem, but we have managed to work
>>>>>>> out something.
>>>>>> U-Boot as a code base has a much easier stance: It can add APIs
>>>>>> when it
>>>>>> needs them in places that require them. With UEFI (as well as the
>>>>>> U-Boot
>>>>>> native API), we're stuck with what's there.
>>>>>>
>>>>>> I also don't understand what you mean by "we have managed to work out
>>>>>> something". This patch set is not a UEFI fix - it fixes generic U-Boot
>>>>>> behavior and speeds up non-UEFI boots as well. The improvement
>>>>>> there is
>>>>>> just not as impressive as with grub :).
>>>>> We are still not quite on the same page...
>>>>>
>>>>> U-Boot does have video_sync() but it doesn't know when to call it. If
>>>>> it does not call it, then any amount of single-threaded code can run
>>>>> after that, which may update the framebuffer. In other words, U-Boot
>>>>> is in exactly the same boat as UEFI. It has to decide whether to call
>>>>> video_sync() based on some sort of heuristic.
>>>>>
>>>>> That is the only point I am trying to make here. Does that make sense?
>>>>
>>>> Oh, I thought you mentioned above that U-Boot is in a better spot or
>>>> "has it solved already". I agree - it's in the same boat and the only
>>>> safe thing it can really do today that is fully cross-platform
>>>> compatible is to call video_sync() after every character.
>>>>
>>>> I don't understand what you mean by "any amount of single-threaded code
>>>> can run after that, which may update the framebuffer". Any framebuffer
>>>> modification is U-Boot internal code which then again can apply
>>>> video_sync() to tell the system "I want what I wrote to screen actually
>>>> be on screen now". I don't think that's necessarily bad design. A bit
>>>> clunky, but we're in a pre-boot environment after all.
>>>>
>>>> Since we're aligned now: What exactly did you refer to with "but we have
>>>> managed to work out something"?
>>> Should we set PixelBltOnly to indicate to UEFI applications that they
>>> are not allowed to directly write to the framebuffer but always have to
>>> use BitBlt? GRUB seems to be using a shadow buffer by default which it
>>> copies via BitBlt.
>>
>> If we do that, OSs will no longer be able to carry the frame buffer
>> address over and continue to use it with to draw on the screen natively
>> (like Linux's efifb).
>>
>> So no, I don't think we should indicate PixelBltOnly. The frame buffer
>> is usually available to applications, you just need to adhere to the
>> architecture's caching constraints.
> Right.  That would probably kill any reasonable way we can have an
> early framebuffer console in most OSes after ExitBootServices() has
> been called.
>
> I'm late to the game but isn't the real solution to have U-Boot map
> the framebuffer in a cache-coherent way?  This is what typically
> happens on x86 where VRAM is mapped as "write-combining".  That is,
> uncached but going through the store buffer to speed up writes.
> Reading from the framebuffer will be slow in that case (which probably
> is the real reason why grub uses a shadow framebuffer).  So U-Boot
> still needs to some cleverness to make sure it only ever writes to the
> framebuffer.


Yeah, those are the 2 options: WB plus flush or WC plus shadow buffer 
for reads. I don't really see much benefit in doing the latter over the 
former: It occupies more valuable RAM and adds additional complexity on 
FB reads. I believe the main reason x86 went that route was that it had 
no choice: It didn't have a cache line flush instruction for a long time.


Alex



More information about the U-Boot mailing list