Passing boot logs to Linux?

Tue Dec 26 23:38:06 CET 2023

On 2023. 12. 26. 19:09, Dragan Simic wrote:
> Hello,
> 
> On 2023-12-26 10:46, Simon Glass wrote:
>> On Thu, Dec 21, 2023 at 2:23 AM Dragan Simic <dsimic at manjaro.org> wrote:
>>> On 2023-12-21 02:44, Dragan Simic wrote:
>>> > On 2023-12-21 02:37, Dragan Simic wrote:
>>> >> On 2023-12-21 02:03, Daniel Golle wrote:
>>> >>> On Thu, Dec 21, 2023 at 12:55:20AM +0100, Dragan Simic wrote:
>>> >>>> On 2023-12-21 00:27, Csókás Bence wrote:
>>> >>>> > Not every system has eMMC/uSD, and as you said, these arguments
>>> >>>> > don't hold for a 4 MB SPI NAND, for example, one you might find
>>> >>>> > in an OpenWrt router. Whereas RAM is quite cheap nowadays.
>>> >>>>
>>> >>>> I see, but I also wonder how many such OpenWrt routers are still
>>> >>>> used these
>>> >>>> days, and, even more importantly, how many of them are regularly
>>> >>>> updated and
>>> >>>> can be expected to actually use this new feature?
>>> >>>
>>> >>> Avoiding flash writes is a very important matter, even on systems with
>>> >>> 128 MiB of SPI-NAND flash which is by far the most common setup you
>>> >>> find on off-the-shelf plastic routers and access points nowadays.
>>> >>
>>> >> I agree, writing something to the SPI chips all the time, no matter
>>> >> how small the writes are, is a big no-no, which I already clearly
>>> >> expressed in one of my earlier posts.
>>> >>
>>> >>> Especially also as those devices often come without a local console,
>>> >>> having U-Boot's output prepended to dmesg on boot would be a very
>>> >>> big win.
>>> >>
>>> >> I was also thinking about that, but I'm not sure it would be accepted
>>> >> to the Linux kernel.  Maybe we can try getting that accepted later.
>>>
>>> Maybe, but just _maybe,_ it could be possible to add a new command-line
>>> option to dmesg(1) that would display the last recorded console output,
>>> fetched from the pstore.  That _might_ get accepted to util-linux, while
>>> being perfectly fine from the usability standpoint.
>>>
>>> I'm also willing to work on that, and I already contributed a few
>>> dmesg(1) patches to util-linux.
>>>
>>> >>>> Please, don't get me wrong, I still support having both options
>>> >>>> available,
>>> >>>> but I'm also wondering about the target demographic.
>>> >>>>
>>> >>>> > > > Plus, I don't want the console subsystem to depend on any
>>> >>>> > > > file/disk operations/drivers.
>>> >>>> > >
>>> >>>> > > Well, the console would still work as usual even if logging to
>>> >>>> > > disk would fail for any reason, which is similar to the serial
>>> >>>> > > console still working if the graphical console fails.  Moreover,
>>> >>>> > > if the disk fails, the system isn't able to boot, so any
>>> >>>> > > RAM-based console logs would be lost in that case.  All this
>>> >>>> > > makes the RAM-based logging no more resilient to disk failures.
>>> >>>> >
>>> >>>> > Correction: if disk *reads* fail, as well as writes, then the
>>> >>>> > system will not boot. However, the typical failure mode of flash
>>> >>>> > media is that it becomes read-only.
>>> >>>>
>>> >>>> That's a good point, but having a read-only root filesystem usually
>>> >>>> also means having a non-operational system that can only have its
>>> >>>> stored data salvaged.  Unless the system is specifically crafted to
>>> >>>> survive such scenarios, of course.
>>> >>>
>>> >>> ... which holds true for any decent embedded OS, which at least
>>> >>> allows
>>> >>> limited remote access and some kind of recovery even in this
>>> >>> situation.
>>> >>
>>> >> Perhaps.  I'm more into running general-purpose Linux distributions
>>> >> on single-board computers and derived embedded devices, which are on
>>> >> the "thick" end of the embedded device spectrum, so to speak.
>>> >>
>>> >>>> > > I still think that using disk-based pstore is a better option.
>>> >>>> > > Just as you don't want to wear out your flash disks with 30-40 KB
>>> >>>> > > of data, I also don't want to waste 30-40 KB of RAM.
>>> >>>> >
>>> >>>> > As I said, you could just unload the log after you're done
>>> >>>> > processing it. 40 KB of RAM is less than what `sshd` uses, for
>>> >>>> > instance (860k on my laptop, but it can probably be less, maybe
>>> >>>> > even 10x less, so 80-90k?), so you could, in your init, process
>>> >>>> > the in-RAM log, then unload it, then start your other services,
>>> >>>> > thereby reclaiming that RAM.
>>> >>>>
>>> >>>> Using pstore should have that unloading already covered, and the
>>> >>>> already existing systemd service is there to perform the archiving
>>> >>>> to the primary filesystem, if so desired.  It would all need to be
>>> >>>> tested in detail, of course.
>>> >>>
>>> >>> pstore/ramoops uses a statically assigned reserved memory region, so
>>> >>> the moment you want to use that feature you lose that amount of RAM
>>> >>> (a few kB, so it doesn't really matter on modern systems).
>>> >>> As in: there is *no* dynamic allocation.
>>> >>>
>>> >>> Imho using pstore/ramoops (which is a more or less Linux-specific
>>> >>> debugging feature, meant to store one or more timestamped logs of
>>> >>> crashes) might not be the most suitable choice. I understand the
>>> >>> advantages of using existing infrastructure, but on the other hand
>>> >>> we don't need most of the complexity of pstore for the task.
>>> >>
>>> >> Hmm, let me research all that a bit more in the next couple of days,
>>> >> please, and I'll come back with a detailed insight.
>>> >>
>>> >>> What I'd like to see is having a couple of log lines from U-Boot
>>> >>> prepended to Linux' dmesg buffer, and for that we anyway will have to
>>> >>> come up with a way to hand over that buffer. Another reserved-memory
>>> >>> region would be one way, embedding the buffer as a blob into the
>>> >>> /chosen/ section of the device tree would be another way.
>>> >>
>>> >> As I already wrote above, it would be rather neat, but I'm afraid it
>>> >> wouldn't be accepted upstream.
>>> >
>>> > Sorry, I forgot this...
>>> >
>>> > As I already explained in one of my earlier posts, not all devices and
>>> > applications would benefit from having only "in-flight" console data
>>> > available, i.e. made available through a reserved region of the RAM.
>>> > Some devices actually need to have the recorded console outputs stored
>>> > on their eMMC chips or microSD cards, for post-mortem analysis of the
>>> > boot issues.
>>> >
>>> > That's why we actually need both options available, and that's also
>>> > why pstore should fit well, even if it may seem like overkill.  I
>>> > hope you agree;  however, I'll do more research, as already promised.
>>
>> I did send this a while back, in case it is useful:
>>
>> https://www.spinics.net/lists/devicetree/msg573692.html

So, what's the status on that, did it get merged? Or did it also get 
stuck at "maybe we should use pstore"?

> This looks quite informative, thank you.  I'll make sure to read the 
> entire thread carefully.
> 
> Unfortunately, I've contracted some kind of flu that has rendered me 
> incapable of even thinking straight.  I'm hoping to get better in the 
> next few days.

I'm sad to hear that. Get better soon!

In the meantime, I think I'll send an RFC patch (I'll CC y'all) with a 
minimal implementation that just collects the log and writes it out to 
a block of memory, with the struct I described earlier. That way we will 
be in time for the merge window, and then hopefully we can settle on the 
best implementation by the last RC, whether that is pstore, the schema 
proposed by Simon, or the minimal implementation.
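
To make the minimal variant a bit more concrete, here is a rough sketch in 
C of collecting log text into a block of memory. The header layout, the 
`BOOTLOG_MAGIC` value and the `bootlog_*` names are all hypothetical, made 
up for illustration; this is not the struct from the earlier mail and not 
existing U-Boot code. It also assumes the region is at least 4-byte 
aligned.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical magic value ("BLOG" as little-endian ASCII), illustration only. */
#define BOOTLOG_MAGIC 0x474f4c42u

/* Hypothetical header placed at the start of the shared memory block. */
struct bootlog_hdr {
	uint32_t magic; /* lets the reader detect whether a log is present */
	uint32_t size;  /* capacity of the data area that follows, in bytes */
	uint32_t len;   /* number of valid log bytes currently stored */
};

/* Set up an empty log in the region; returns the data capacity, 0 if too small. */
size_t bootlog_init(void *region, size_t region_size)
{
	struct bootlog_hdr *h = region;

	if (region_size <= sizeof(*h))
		return 0;

	h->magic = BOOTLOG_MAGIC;
	h->size = (uint32_t)(region_size - sizeof(*h));
	h->len = 0;
	return h->size;
}

/* Append up to n bytes of text; returns how many bytes were actually stored. */
size_t bootlog_append(void *region, const char *text, size_t n)
{
	struct bootlog_hdr *h = region;
	size_t room;

	if (h->magic != BOOTLOG_MAGIC)
		return 0;

	/* Truncate instead of wrapping: the earliest boot output is kept. */
	room = h->size - h->len;
	if (n > room)
		n = room;

	memcpy((char *)(h + 1) + h->len, text, n);
	h->len += (uint32_t)n;
	return n;
}
```

The sketch truncates once the block is full rather than wrapping around, 
on the assumption that the earliest boot messages are the most valuable; 
a ring buffer would be the obvious alternative.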

In the meantime, feel free to send in pstore writing support, filtering 
of ANSI codes from CONSOLE_RECORD, or whatever else you want me to use.
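
As a starting point for the ANSI filtering, a minimal sketch in C of 
stripping CSI escape sequences (ESC `[` ... final byte) from a recorded 
buffer. The function name `strip_ansi_csi` is hypothetical, and the sketch 
only handles CSI sequences, not other escape types; it is not tied to how 
CONSOLE_RECORD actually stores its data.

```c
#include <stddef.h>

/*
 * Copy src to dst while dropping ANSI CSI sequences (ESC '[' ... final byte).
 * dst is always NUL-terminated if dst_size > 0; output is truncated to fit.
 * Returns the number of bytes written, not counting the terminating NUL.
 */
size_t strip_ansi_csi(const char *src, size_t src_len,
		      char *dst, size_t dst_size)
{
	size_t in = 0, out = 0;

	if (dst_size == 0)
		return 0;

	while (in < src_len && out + 1 < dst_size) {
		unsigned char c;

		if (src[in] == '\x1b' && in + 1 < src_len && src[in + 1] == '[') {
			in += 2;
			/* Skip parameter and intermediate bytes (0x20..0x3f). */
			while (in < src_len) {
				c = (unsigned char)src[in];
				if (c < 0x20 || c > 0x3f)
					break;
				in++;
			}
			/* Skip the final byte (0x40..0x7e), if present. */
			if (in < src_len) {
				c = (unsigned char)src[in];
				if (c >= 0x40 && c <= 0x7e)
					in++;
			}
			continue;
		}
		dst[out++] = src[in++];
	}

	dst[out] = '\0';
	return out;
}
```

Colour codes such as `ESC[1;32m` and cursor movements such as `ESC[2J` 
both match this pattern, so plain text is all that remains in the output.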

Bence