Issue 0072
- Summary
- invalid literal exception from sercp on some operations
- Owner
- Tim
- Reporter
- Manoj Tiwary
- Status
- open
- Priority
- high
Description
The serial transport is basically non-functional under some conditions. Specifically, sercp will sometimes fail with a Python exception, with a string like the following:
ERROR: invalid literal for int() with base 10: "root@beaglebone:~# [ -d '/home/a/fuego.Functional.fuego_transport/dir2' ] ; echo \r $? \r\n"
You may also see the following from sersh:
ERROR: invalid literal for int() with base 10: '[ [&&serio_cmd_done&&]]'
This should be investigated and fixed.
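As an illustration of the failure mode, both strings above are shell echo or marker text rather than a numeric exit status, so passing them to int() raises the exception shown. The following is a minimal sketch, not serio's actual code: the helper names parse_status_line and first_status are hypothetical, and it assumes the reader receives the serial output as a sequence of lines and that the real status arrives on a later line.

```python
import re


def parse_status_line(line):
    """Return the exit status if the line holds only digits, else None."""
    text = line.strip()
    if re.fullmatch(r"[0-9]+", text):
        return int(text)
    return None


def first_status(lines):
    """Skip echoed commands and markers; return the first numeric status."""
    for line in lines:
        status = parse_status_line(line)
        if status is not None:
            return status
    return None


# The two strings from the errors above (hypothetical replay of the input).
echoed = ("root@beaglebone:~# [ -d '/home/a/fuego.Functional.fuego_transport/dir2' ]"
          " ; echo \r $? \r\n")
marker = "[ [&&serio_cmd_done&&]]"

# int(echoed) and int(marker) both raise ValueError; the tolerant parser
# skips them and picks up the status line that follows.
status = first_status([echoed, marker, "0\r\n"])
```

A parser like this only hides the symptom. The stale echo is still in the stream, so the synchronization problem described in the notes would still need to be addressed.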
One way to trigger this is by running the job: beaglebone-serial.default.Functional.fuego_transport. It appears to reliably trigger on the 5th testcase in that test (which is the testcase for 'test multiple recursive dir put').
This was reproduced by Manoj Tiwary on a Raspberry Pi 3 board on this same testcase in Functional.fuego_transport, as well as on the same board in the deploy phase of Functional.LTP.
Note that Manoj's system uses systemd and default shell 'dash'. On Tim's system, the beaglebone black also uses systemd and 'dash'.
For a full log showing this issue, see: file:fuego_transport-console-log-serial-invalid-literal.txt
Notes
According to /usr/local/bin/serio (inside the docker container), this is caused by some kind of race condition between the board and serio. Search for a comment in that file with the word "literal" in it. So this appears to be a known bug.
Note that this appears to happen when sercp (serio) is used for recursive directory copies. This causes serio to issue a series of put_file operations internally, which are likely happening too fast. A single put_file in isolation usually does not cause the problem. Putting a sleep in Fuego did not solve the problem because the loop is internal to serio, so a delay on the Fuego side never runs between the individual put_file operations.
There must be some state on the serial port that is not flushed before serio reads the next thing (e.g. echoes that are still pending). Possibly adding a delay to the internal put_file loop will solve the problem.
See recursive_put(), especially line 485 in serio.
Maybe a call to prime_before_first_execute() inside the put_file loop will solve the problem. This is used to re-sync serio with the shell on the target side.
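The delay idea from the notes above could be sketched as follows. This is a hedged illustration only, not a patch against serio: drain_pending and put_files_with_resync are hypothetical names, and put_one and read_pending are stand-in callables for whatever serio uses to send one file and to read bytes waiting on the port. With pyserial 3.x, read_pending might be something like `lambda: port.read(port.in_waiting)`, but that is an assumption about how the port is accessed.

```python
import time


def drain_pending(read_pending, quiet_period=0.05, max_wait=1.0):
    """Discard input until the line has been quiet for quiet_period seconds.

    read_pending is a callable returning the bytes currently waiting
    (b"" if none). Returns the discarded bytes, for logging or debugging.
    Gives up after max_wait seconds even if data keeps arriving.
    """
    discarded = bytearray()
    start = time.monotonic()
    quiet_since = start
    while time.monotonic() - start < max_wait:
        chunk = read_pending()
        now = time.monotonic()
        if chunk:
            discarded.extend(chunk)
            quiet_since = now  # activity seen; restart the quiet timer
        elif now - quiet_since >= quiet_period:
            break  # nothing arrived for a full quiet period
        time.sleep(0.005)
    return bytes(discarded)


def put_files_with_resync(paths, put_one, read_pending, quiet_period=0.05):
    """Send each file, then wait for stale echoes to drain before the next."""
    for path in paths:
        put_one(path)
        drain_pending(read_pending, quiet_period=quiet_period)
```

Waiting for a quiet period rather than sleeping a fixed time adapts to slow boards, at the cost of at least quiet_period of extra latency per file. Whether this or a call to prime_before_first_execute() is the better fix would need to be tested against the failing testcase.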