[RFC PATCH 00/28] cli: Add a new shell
Tom Rini
trini at konsulko.com
Thu Jul 1 22:21:55 CEST 2021
On Thu, Jul 01, 2021 at 02:15:43AM -0400, Sean Anderson wrote:
> Well, this has been sitting on my hard drive for too long without feedback
> ("Release early, release often"), so here's the first RFC. This is not ready to
> merge (see the "Future work" section below), but the shell is functional and at
> least partially tested.
>
> The goal is to have 0 bytes gained over Hush. Currently we are around 800 bytes
> over on sandbox.
A good goal, but perhaps slightly too strict?
>
> add/remove: 90/54 grow/shrink: 3/7 up/down: 12834/-12042 (792)
>
> = Getting started
>
> Enable CONFIG_LIL. If you would like to run tests, enable CONFIG_LIL_FULL. Note
> that dm_test_acpi_cmd_dump and setexpr_test_str_oper will fail. CONFIG_LIL_POOLS
> is currently broken (with what appears to be a double free).
>
> For an overview of the language as a whole, refer to the original readme [1].
>
> [1] http://runtimeterror.com/tech/lil/readme.txt
>
> == Key patches
>
> The following patches are particularly significant for reviewing and
> understanding this series:
>
> cli: Add LIL shell
> This contains the LIL shell as originally written by Kostas with some
> major deletions and some minor additions.
> cli: lil: Wire up LIL to the rest of U-Boot
> This allows you to use LIL as a shell just like Hush.
> cli: lil: Document structures
> This adds documentation for the major structures of LIL. It is a good
> place to start looking at the internals.
> test: Add tests for LIL
> This adds some basic integration tests and provides some examples of
> LIL code.
> cli: lil: Add a distinct parsing step
> This adds a parser separate from the interpreter. This patch is the
> largest original work in this series.
> cli: lil: Load procs from the environment
> This allows procedures to be saved and loaded like variables.
>
> = A new shell
>
> This series adds a new shell for U-Boot. The aim is to eventually replace Hush
> as the primary shell for all boards which currently use it. Hush should be
> replaced because it has several major problems:
>
> - It has not had a major update in two decades, resulting in duplication of
> effort in finding bugs. Regarding a bug in variable setting, Wolfgang remarks
>
> So the specific problem has (long) been fixed in upstream, and
> instead of adding a patch to our old version, thus cementing the
> broken behaviour, we should upgrade hush to recent upstream code.
>
> -- Wolfgang Denk [2]
>
> These lack of updates are further compounded by a significant amount of
> ifdef-ing in the Hush code. This makes the shell hard to read and debug.
> Further, the original purpose of such ifdef-ing (upgrading to a newer Hush)
> has never happened.
>
> - It was designed for a preempting OS which supports pipes and processes. This
> fundamentally does not match the computing model of U-Boot where there is
> exactly one thread (and every other CPU is spinning or sleeping). Working
> around these design differences is a significant cause of the aformentioned
> ifdef-ing.
>
> - It lacks many major features expected of even the most basic shells, such
> as functions and command substitution ($() syntax). This makes it difficult
> to script with Hush. While it is desirable to write some code in C, much code
> *must* be written in C because there is no way to express the logic in Hush.
>
> I believe that U-Boot should have a shell which is more featureful, has cleaner
> code, and which is the same size as Hush (or less). The ergonomic advantages
> afforded by a new shell will make U-Boot easier to use and customize.
>
> [2] https://lore.kernel.org/u-boot/872080.1614764732@gemini.denx.de/
First, great! Thanks for doing this. A new shell really is the only
viable path forward here, and I appreciate you taking the time to
evaluate several and implement one.
> = Open questions
>
> While the primary purpose of this series is of course to get feedback on the
> code I have already written, there are several decisions where I am not sure
> what the best course of action is.
>
> - What should be done about 'expr'? The 'expr' command is a significant portion
> of the final code size. It cannot be removed outright, because it is used by
> several builtin functions like 'if', 'while', 'for', etc. The way I see it,
> there are two general approaches to take
>
> - Rewrite expr to parse expressions and then evaluate them. The parsing could
> re-use several of the existing parse functions like how parse_list does.
> This could reduce code, as instead of many functions each with their own
> while/switch statements, we could have two while/switch statements (one to
> parse, and one to evaluate). However, this may end up increasing code size
> (such as when the main language had evaluation split from parsing).
>
> - Don't parse infix expressions, and just make arithmetic operators normal
> functions. This would affect ergonomics a bit. For example, instead of
>
> if {$i < 10} { ... }
>
> one would need to write
>
> if {< $i 10} { ... }
>
> and instead of
>
> if {$some_bool} { ... }
>
> one would need to write
>
> if {quote $some_bool} { ... }
>
> Though, given how much setexpr is used (not much), this may not be such a
> big price to pay. This route is almost certain to reduce code size.
So, this is a question because we have cmd/setexpr.c that provides
"expr" today? Or because this is a likely place to reclaim some of that
800 byte growth?
> - How should LIL functions integrate with the rest of U-Boot? At the moment, lil
> functions and procedures exist in a completely separate world from normal
> commands. I would like to integrate them more closely, but I am not sure the
> best way to go about this. At the very minimum, each LIL builtin function
> needs to get its hands on the LIL interpreter somehow. I'd rather this didn't
> happen through gd_t or similar so that it is easier to unit test.
> Additionally, LIL functions expect an array of lil_values instead of strings.
> We could strip them out, but I worry that might start to impact performance
> (from all the copying).
I might be missing something here. But, given that whenever we have C
code run-around and generate a string to then pass to the interpreter to
run, someone asks why we don't just make API calls directly, perhaps the
answer is that we don't need to?
>
> The other half of this is adding LIL features into regular commands. The most
> important feature here is being able to return a string result. I took an
> initial crack at it [3], but I think with this series there is a stronger
> motivating factor (along with things like [4]).
>
> [3] https://patchwork.ozlabs.org/project/uboot/list/?series=231377
> [4] https://patchwork.ozlabs.org/project/uboot/list/?series=251013
>
> = Future work
>
> The series as presented today is incomplete. The following are the major issues
> I see with it at the moment. I would like to address all of these issues, but
> some of them might be postponed until after first merging this series.
>
> - There is a serious error handling problem. Most original LIL code never
> checked errors. In almost every case, errors were silently ignored, even
> malloc failures! While I have designed new code to handle errors properly,
> there still remains a significant amount of original code which just ignores
> errors. In particular, I would like to ensure that the following categories of
> error conditions are handled:
>
> - Running out of memory.
> - Access to a nonexistant variable.
> - Passing the wrong number of arguments to a function.
> - Interpreting a value as the wrong type (e.g. "foo" should not have a numeric
> representation, instead of just being treated as 1).
>
> - There are many deviations from TCL with no purpose. For example, the list
> indexing function is named "index" and not "lindex". It is perfectly fine to
> drop features or change semantics to reduce code size, make parsing easier,
> or make execution easier. But changing things for the sake of it should be
> avoided.
>
> - The test suite is rather anemic compared with the amount of code this
> series introduces. I would like to expand it significantly. In particular,
> error conditions are not well tested (only the "happy path" is tested).
>
> - While I have documented all new functions I have written, there are many
> existing functions which remain to be documented. In addition, there is no
> user documentation, which is critical in driving adoption of any new
> programming language. Some of this cover letter might be integrated with any
> documentation written.
>
> - Some shell features such as command repetition and secondary shell prompts
> have not been implemented.
>
> - Arguments to native lil functions are incompatible with U-Boot functions. For
> example, the command
>
> foo bar baz
>
> would be passed to a U-Boot command as
>
> { "foo", "bar", "baz", NULL }
>
> but would be passed to a LIL function as
>
> { "bar", "baz" }
>
> This makes it more difficult to use the same function to parse several
> different commands. At the moment this is solved by passing the command name
> in lil->env->proc, but I would like to switch to the U-Boot argument list
> style.
>
> - Several existing tests break when using LIL because they expect no output on
> failure, but LIL produces some output notifying the user of the failure.
>
> - Implement DISTRO_BOOT in LIL. I think this is an important proof-of-concept to
> show what can be done with LIL, and to determine which features should be
> moved to LIL_FULL.
>
> = Why Lil?
>
> When looking for a suitable replacement shell, I evaluated implementations using
> the following criteria:
>
> - It must have a GPLv2-compatible license.
> - It must be written in C, and have no major external dependencies.
> - It must support bare function calls. That is, a script such as 'foo bar'
> should invoke the function 'foo' with the argument 'bar'. This preserves the
> shell-like syntax we expect.
> - It must be small. The eventual target is that it compiles to around 10KiB with
> -Os and -ffunction-sections.
> - There should be good tests. Any tests at all are good, but a functioning suite
> is better.
> - There should be good documentation
> - There should be comments in the source.
> - It should be "finished" or have only slow development. This will hopefully
> make it easier to port changes.
On this last point, I believe this is based on lil20190821 and current
is now lil20210502. With a quick diff between them, I can see that the
changes there are small enough that while you've introduced a number of
changes here, it would be a very easy update.
> Notably absent from the above list is performance. Most scripts in U-Boot will
> be run once on boot. As long as the time spent evaluating scripts is kept under
> a reasonable threshold (a fraction of the time spend initializing hardware or
> reading data from persistant storage), there is no need to optimize for speed.
>
> In addition, I did not consider updating Hush from Busybox. The mismatch in
> computing environment expectations (as noted in the "New shell" section above)
> still applies. IMO, this mismatch is the biggest reason that things like
> functions and command substitution have been excluded from the U-Boot's Hush.
>
> == lil
>
> - zLib
> - TCL
> - Compiles to around 10k with no builtins. To 25k with builtins.
> - Some tests, but not organized into a suite with expected output. Some evidence
> that the author ran APL, but no harness.
> - Some architectural documentation. Some for each functions, but not much.
> - No comments :l
> - 3.5k LoC
>
> == picol
>
> - 2-clause BSD
> - TCL
> - Compiles to around 25k with no builtins. To 80k with builtins.
> - Tests with suite (in-language). No evidence of fuzzing.
> - No documentation :l
> - No comments :l
> - 5k LoC
>
> == jimtcl
>
> - 2-clause BSD
> - TCL
> - Compiles to around 95k with no builtins. To 140k with builtins. Too big...
>
> == boron
>
> - LGPLv3+ (so this is right out)
> - REBOL
> - Compiles to around 125k with no builtins. To 190k with builtins. Too big...
>
> == libmawk
>
> - GPLv2
> - Awk
> - Compiles to around 225k. Too big...
>
> == libfawk
>
> - 3-clause BSD
> - Uses bison+yacc...
> - Awk; As it turns out, this has parentheses for function calls.
> - Compiles to around 24-30k. Not sure how to remove builtins.
> - Test suite (in-language). No fuzzing.
> - Tutorial book. No function reference.
> - No comments
> - Around 2-4k LoC
>
> == MicroPython
>
> - MIT
> - Python (but included for completeness)
> - Compiles to around 300k. Too big...
>
> == mruby/c
>
> - 3-clause BSD
> - Ruby
> - Compiles to around 85k without builtins and 120k with. Too big...
>
> == eLua
>
> - MIT
> - Lua
> - Build system is a royal pain (custom and written in Lua with external deps)
> - Base binary is around 250KiB and I don't want to deal with reducing it
>
> So the interesting/viable ones are
> - lil
> - picol
> - libfawk (maybe)
>
> I started with LIL because it was the smallest. I have found several
> issues with LIL along the way. Some of these are addressed in this series
> already, while others remain unaddressed (see the section "Future Work").
Thanks for the evaluations, of these, lil does make the most sense.
--
Tom
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 659 bytes
Desc: not available
URL: <https://lists.denx.de/pipermail/u-boot/attachments/20210701/cbacdf93/attachment.sig>
More information about the U-Boot
mailing list