[tbot] How to debug acquire_linux

Harald Seiler hws at denx.de
Fri Nov 23 11:14:26 UTC 2018


Hi Stefano,

On Fri, 2018-11-23 at 11:22 +0100, Stefano Babic wrote:
> Hi Harald,
> 
> On 23/11/18 10:46, Harald Seiler wrote:
> > Hello Stefano,
> > 
> > first of all, so we are all on the same page.  The implementation
> > for login is here:
> > 
> > 	https://github.com/Rahix/tbot/blob/master/tbot/machine/board/linux.py#L62
> > 
> > To break it down:
> > 
> > 1. Wait for the login prompt ("login: ") to appear at the END
> >    of the stream (L67)
> 
> Prompt was detected
> 
> > 2. Send the username + "\n" (L68)
> > 3. If a password is configured, wait for the password prompt,
> >    at the END of the stream (L73)
> 
> There is no password
> 
> > 4. Send the password + "\n"
> > 5. Wait for `login_wait` seconds for the shell to come up (L77)
> 
> With 1-2 seconds, it waits indefinetely.
> 
> I set now login_wait=10, it works. It is a long time, but it is still a
> work-around.

Hmm, this is interesting ... It might be something completely different:

Maybe the shell isn't running yet when TBot sends it's initialization strings.
So they end up nowhere and TBot then waits indefinitely (at 6) ...

If it works with busybox, maybe their shell starts quicker?

> 
> > 6. Initialize the shell. Sends a bunch of commands to get the
> >    shell into a known state.  Then wait for the TBot prompt. (L79)
> > 
> > This process has 3 places that can block: 1, 3, and 6.
> > 
> 
> It looks to me that 1,3 are fine
> 
> > I can get behind allowing 1 and 3 to search for there prompt in the middle of
> > the stream as well, to circumvent the kernel-log clobber issue.
> > 
> > Number 6 needs a different solution though.  The reason is that once
> > TBot has acquired the shell, additional input from the serial console might
> > potentially mess up command outputs.  This will lead to a whole load of new issues,
> > so TBot should somehow prevent that in a reasonable way.
> 
> ok - that means it is reasonable to add "loglevel=0" (or at least =2) to
> all boards to be tested ?

Hmm, this isn't a great solution ... Because now, you won't see the log messages at all,
which is bad if something goes wrong ... I'd only add this if nothing else works ...

> > 
> > On Fri, 2018-11-23 at 10:02 +0100, Stefano Babic wrote:
> > > Hi Harald,
> > > 
> > > On 23/11/18 09:48, Harald Seiler wrote:
> > > > Hi Stefano,
> > > > 
> > > > On Thu, 2018-11-22 at 23:47 +0100, Stefano Babic wrote:
> > > > > Hi,
> > > > > 
> > > > > On 22/11/18 19:24, Stefano Babic wrote:
> > > > > > Hi,
> > > > > > 
> > > > > > I have a board that cannot be got in linux. It looks like that
> > > > > > tbot.acquire_linux() hangs forever (can I set a tiimeout ?)
> > > > > > 
> > > > > > I ran a testcase that works fine for another board - I guess that the
> > > > > > issue is due to the fact that this board has a lot of output in part
> > > > > > caused by systemd and in part due to kernel messages that are printed to
> > > > > > the console. When I start the testcase, tbot remains in <board>-linux. I
> > > > > > can understand that board is up and works because software activates
> > > > > > some relays, but it is not recognized by tbot.
> > > > > > 
> > > > > > The weird thing is that "interactive_linux" works. How can I debug this
> > > > > > ? Or do we have to block kernel messages to the console via kernel
> > > > > > parameter ?
> > > > > > 
> > > > > 
> > > > > I have some more details. The not working board has "bash" as shell and
> > > > > TERM is set to "xterm". Running tbot with -vvv, after "root" as user,
> > > > > the bord answers with some terminal control characters. It looks like
> > > > > that tbot gets confused. If I replace "bash" with "busybox", it works.
> > > > 
> > > > Hmm, can you send me the output of -vvv? The only cases I know don't
> > > > work is when something is sent after the password prompt or once the
> > > > shell comes up ...
> > > > 
> > > 
> > > Ouch..I have already done, but I see that the ML was removed in the
> > > answer. I attach Heiko's answer, too.
> > 
> > Hmm, your output doesn't look bad to me at first glance ...
> > 
> > > 
> > > The thing is:
> > > 
> > >    ├─'login: '
> > 
> > The login prompt was successfully detected here, TBot sent the
> > username (so step 1 didn't block in this case):
> 
> Right
> 
> > 
> > > │   │    <> CCU-1 login: root
> > > │   ├─'root\n\x1b7\x1b[r\x1b[999;999H\x1b[6n'
> > > │   ├─'\x1b'
> > > │   ├─'8'
> > 
> > These control sequenced are ignored
> 
> Well, this is the more evident difference with busybox. With busybox,
> just '#' is returned.
> 
> > 
> > > │   ├─'root at CCU-1:~'
> > > │   ├─'# '
> > 
> > This prompt is also ignored ... If tbot hangs here, I can think of two things
> > that might have happened:
> > 
> > 1. You set a password and tbot rightfully waits for the password prompt
> 
> No.
> 
> > 2. Something went wrong with the shell initialization
> 
> Yes.
> 
> > 
> > 	https://github.com/Rahix/tbot/blob/master/tbot/machine/channel/channel.py#L229
> > 
> > The first blocking call happens in line 240, where the prompt is set. `raw_command`
> > sends something like `PS1="SOME_TBOT_PROMPT> "\n` and then waits for the prompt to
> > appear again.  Maybe some kernel message happend just after the prompt returned?
> 
> But then we should see them in the log, or not ?
> 
> But yes, in last tests I remove again loglevel=0, and kernel messages
> can appear.
> 
> This could be a reason why increasing login_wait helps. After some
> seconds, board is surely in a idle state without kernel messages.
> 
> > 
> > What irritates me is that we should definitely see this.  `raw_command` calls
> > `read_until_prompt`, which is the place where the -vvv debug output is generated ...
> 
> Agree.
> 
> > 
> > > So there are some control characters (I have not decoded them) just if
> > > the target runs the bash shell. With busybox, it works and tbot completes.
> > > 
> > > 
> > > > You could try defining `login_wait`[1] in your BoardLinuxMachine
> > > > and see if that helps ...
> > > 
> > > This does not help, I have already tried.
> 
> Note: I tried with login_wait=1 and login_wait=2. login_wait=10 helps.
> > > IMHO the login should be done more resistant - I proposed to try to send
> > > some CR in a loop to check if login was successful in my previous mail.
> > 
> > Yes, it definitely should.  The solution to just send ^C in a loop is not
> > one I like, however.  In my opinion, we should avoid "brute-forcing" the
> > board into a certain state, as this can have lots of uncertainties.  As mentioned
> > before, as soon as we have the shell, we can't allow clobber anyway ...
> 
> ok
> 
> > 
> > > > 
> > > > > The second not working case seems if the board prints some data on the
> > > > > console after the login propt is sent. This is the case when a daemon
> > > > > has the stdout set to the console. Also in this case, it seems that tbot
> > > > > gets confused.
> > > > 
> > > > Yes, this is an open issue:
> > > > 
> > > > 	https://github.com/Rahix/tbot/issues/1
> > > > 
> > > > I don't know how we best handle this case ... Do you have any ideas?
> > > 
> > > I think there are more use cases. One is the kernel messages. Kernel
> > > messages can be turned off at boot time with loglevel=0 (but they can be
> > > read via dmesg). This is a suitable way.
> > > 
> > > My use case is an application (SWUpdate) in a small system (rescue
> > > system). It makes sense to redirect application stdout to the console,
> > > becaus a rescue does not have many options. It should at least possible
> > > to login and to collect the output, while a testcase will push an image
> > > from network (and then, there is no exec on the board).
> > 
> > Is shell and application output interleaved
> 
> They are interleaved.

Well, this is really bad ...

> > in this case or is
> > it in a well defined order?
> 
> Regards,
> Stefano
> 
> 
-- 
Harald

DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: +49-8142-66989-62  Fax: +49-8142-66989-80   Email: hws at denx.de


More information about the tbot mailing list