Release 1.2 testing - 2017-08-24
{{TableOfContents}}

Here is some information about testing started on August 24, 2017 for the 1.2 release:

= Tests conducted =
- min1.testplan_ltsi

{{{
Benchmark.Dhrystone      2017-08-24_18:09:14  timdesk:min1  PASS
Benchmark.IOzone         2017-08-24_19:20:19  timdesk:min1  PASS
Benchmark.Java           2017-08-24_18:10:37  timdesk:min1  UNFINISHED
Benchmark.Whetstone      2017-08-24_18:07:07  timdesk:min1  PASS
Benchmark.bonnie         2017-08-24_19:20:50  timdesk:min1  UNFINISHED
Benchmark.dbench         2017-08-24_19:21:09  timdesk:min1  PASS
Benchmark.ffsb           2017-08-24_18:07:59  timdesk:min1  PASS
Benchmark.fio            2017-08-24_18:10:49  timdesk:min1  PASS
Benchmark.hackbench      2017-08-24_18:07:39  timdesk:min1  PASS
Benchmark.linpack        2017-08-24_18:16:56  timdesk:min1  PASS
Benchmark.netperf        2017-08-24_18:31:37  timdesk:min1  PASS
Benchmark.tiobench       2017-08-24_18:05:06  timdesk:min1  PASS
Functional.LTP           2017-08-24_18:32:37  timdesk:min1  PASS
Functional.aiostress     2017-08-24_18:17:35  timdesk:min1  UNFINISHED
Functional.bc            2017-08-24_18:10:14  timdesk:min1  PASS
Functional.bc            2017-08-24_18:17:54  timdesk:min1  PASS
Functional.bzip2         2017-08-24_18:18:16  timdesk:min1  PASS
Functional.crashme       2017-08-24_18:05:52  timdesk:min1  PASS
Functional.expat         2017-08-24_18:18:35  timdesk:min1  UNFINISHED
Functional.ft2demos      2017-08-24_18:19:05  timdesk:min1  PASS
Functional.glib          2017-08-24_18:20:11  timdesk:min1  UNFINISHED
Functional.hello_world   2017-08-24_18:14:26  timdesk:bbb   PASS
Functional.hello_world   2017-08-24_18:04:41  timdesk:min1  PASS
Functional.hello_world   2017-08-24_18:06:49  timdesk:min1  PASS
Functional.ipv6connect   2017-08-24_18:08:53  timdesk:min1  PASS
Functional.jpeg          2017-08-24_18:06:22  timdesk:min1  PASS
Functional.kernel_build  2017-08-24_18:22:57  timdesk:min1  PASS
Functional.linus_stress  2017-08-24_18:27:33  timdesk:min1  PASS
Functional.netperf       2017-08-24_19:18:06  timdesk:min1  PASS
Functional.rmaptest      2017-08-24_18:28:33  timdesk:min1  PASS
Functional.stress        2017-08-24_18:29:05  timdesk:min1  PASS
Functional.synctest      2017-08-24_18:30:31  timdesk:min1  PASS
Functional.zlib          2017-08-24_18:31:15  timdesk:min1  PASS
}}}

= Issues found =
- testplan_ltsi
  - uses the noroot spec for bonnie - this is probably wrong
- tests
  - min1.default.Benchmark.Java - aborted
    - error message in the console log is not human friendly
      - error is an abort from assert_define, from fuego_test.sh:test_pre_check()
      - this is not using need_check
    - error is not put into the run.json file
  - min1.default.Functional.aiostress - failed
    - test execution error - error loading shared libraries: libaio.so.1
  - min1.default.Functional.expat - failed
    - build failure
    - directory (empty) was left on the target
    - Warning: deprecated conversion from string constant to 'char*' [-Wwrite-strings]
      - youtube says this can be fixed with a line like this:
        - #pragma GCC diagnostic ignored "-Wwrite-strings"
  - min1.default.Functional.glib - failed
    - runtime error with the glib test
    - execution of the bash script returns exit code 143
    - Fuego has a bug handling the return code:
      - TRB: RETURN_VALUE=''
      - generic_parser.py, line 19: invalid literal for int() with base 10: ''
        - value = "PASS" if int(RETURN_VALUE) == 0 else "FAIL"
      - result of "report" is not being handed to generic_parser.py
  - min1.default.Functional.kernel_build - failed
    - fuego_test.sh uses 'report "cat build.log"', but build.log is generated on the host, not the target. You can't 'report' that.
    - need to add build.log to the target-side test log, if that's what's wanted
    - it's not using the right kernel config
  - min1.docker.Functional.LTP - takes a very long time (45 minutes)
    - build duration was 584 seconds (~10 minutes)
    - subtest fork13 takes 337 seconds (~5 minutes)
    - subtest getrandom02 takes 296 seconds (~5 minutes)
  - min1.noroot.Benchmark.bonnie - failed
    - Fuego uses the root account on min1
    - should make bonnie recognize the situation and run correctly
  - min1.default.Functional.LTP
  - min1.selectionwithrt.Functional.LTP
    - run.json (4.4) does not have posix results
    - run.json (4.4) does not have realtime results
    - status of the 'syscalls' testset is PASS, even though there are failures in individual testcases (e.g. sysctl01)
  - min1.default.Benchmark.IOzone
    - there's no run.json link in the Jenkins interface
    - there are lots of HTML tables with SKIP in the measure area (not just the status area)
    - the test doesn't actually perform anything but:
      - 2048_Kb_Read_Write.ReWrite
      - 2048_Kb_Read_Write.Write
    - that's odd - are the parameters to iozone correct for the default spec?
- Jenkins interface
  - results on the Jenkins job page:
    - HTML table
      - test_plan is always "none"
      - start_time is in seconds since the epoch (not human friendly)
      - duration is in milliseconds (not human friendly)
      - measures are reported as PASS instead of a valid value
        - see Dhrystone Score, or linpack.linpack
        - ftc gen-report gets this right
      - Functional.LTP has too much data!!
    - plots for benchmarks are completely missing
- ftc
  - gen-report
    - columns are messy (columns don't auto-adjust to long field entries)
  - ftc gen-report --where test=.*LTP
    - tguid:status for syscalls are all None (grep fork13)
    - run.json has the correct status (PASS)
- general
  - failures are very difficult to diagnose
    - some tests (like LTP) hide the test output
    - interpreting the console log is very difficult

== Issues table ==
{{{#!Table:test1 show_edit_links=0 show_sort_links=0
||Test||Problem||Notes||
||Benchmark.Java||abort not recorded||should put abort message in run.json file||
||Functional.aiostress||missing libaio.so.1||is a toolchain/target mismatch||
||Functional.expat||build error - warning about const char conversion||Looks like maybe a warning that turned into an error?||
||Functional.glib||runtime error - gtester returns failure, and not the expected output||.||
||Functional.kernel_build||error in fuego_test.sh - build.log is not on the target||Need to change how build.log is reported||
||Functional.LTP||slow, results are too big for Jenkins page||.||
||Functional.LTP||run.json test_set status is wrong||Should be FAIL, given the missing criteria.json file||
||Functional.LTP||run.json missing posix results||.||
||Functional.LTP||run.json missing realtime results||.||
||Functional.LTP||gen-report shows wrong values for tguids (None instead of PASS/FAIL)||.||
||Benchmark.bonnie||noroot spec didn't work||switched to default spec, should handle root or not in the test||
||Benchmark.Dhrystone||Dhrystone.Score has PASS in Jenkins HTML table||.||
||Benchmark.IOzone||only has values for 2 measures||.||
}}}

== Functional.glib problem details ==
- on min1 and docker, gtester returns with exit code 142, which causes Fuego to mishandle the result
  - fuego_test.sh test_run calls "report"
    - it returns 143, which is an error
    - the signal handler is called
- fuego_test.sh test_processing is empty (normally it should have a log_compare)
  - the log_compares in this test don't match the test output
  - has this test ever worked???
- post_test calls:
  - processing, which calls
    - test_processing(), which just returns 'true'
    - then it calls parser.py or generic_parser.py
  - if there were no errors, then RETURN_VALUE is undefined (see the sketch after this list)
- on bbb, the build fails, trying to compile gatomic.c
  - the assembler says:
    - Error: selected processor does not support Thumb mode 'swp r3,r5,[r4]'
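Here is a minimal sketch (not the actual fuego-core code) of the kind of guard that would avoid the "invalid literal for int()" crash when RETURN_VALUE comes through empty. How RETURN_VALUE actually reaches generic_parser.py is an assumption here (an environment variable is used just for illustration); the point is the empty-value check before calling int().

{{{
# Hedged sketch of a defensive fix for the generic_parser.py crash described
# above.  How RETURN_VALUE reaches the parser is an assumption (os.environ is
# used only for illustration); the point is the guard around int().
import os

def result_from_return_value(raw):
    """Map a shell return value to PASS/FAIL, tolerating a missing value."""
    if raw is None or raw.strip() == "":
        # An empty string is what produced:
        #   invalid literal for int() with base 10: ''
        # Treat a missing return value as a failure instead of crashing.
        return "FAIL"
    return "PASS" if int(raw) == 0 else "FAIL"

if __name__ == "__main__":
    # Prints FAIL when RETURN_VALUE is unset or empty, instead of raising.
    print(result_from_return_value(os.environ.get("RETURN_VALUE")))
}}}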
== Functional.LTP problem details ==

=== run.json: no posix results ===
fuego-core/engine/tests/Functional.LTP/parser.py does not even parse posix test results. Should look at JTA or Fuego 1.0 and see what Functional.LTP-posix did. Maybe just logcompare?

=== run.json: no realtime results in file ===

=== run.json: incorrect status for test_set and test_suite ===
run.json has an overall status of PASS, and test_set syscalls has a status of PASS, even though some testcases under syscalls failed (and there's no criteria.json file - the default should be max_fail=0). A sketch of the intended default behavior follows the log excerpt below.

Output from min1.selectionwithrt.Functional.LTP consolelog.txt (side note: need to not show parsed_results in the console log - it's too big with LTP):

{{{
Warning: missing or faulty criteria.json file, looking for reference.log
Applying criterion {'max_fail': 0, 'tguid': 'syscalls'}
    ^^^^^ this should result in FAIL ^^^^^
Applying criterion {'max_fail': 0, 'tguid': 'mm'}
Applying criterion {'max_fail': 0, 'tguid': 'timers'}
Applying criterion {'max_fail': 0, 'tguid': 'fs'}
Applying criterion {'max_fail': 0, 'tguid': 'pty'}
Applying criterion {'max_fail': 0, 'tguid': 'pipes'}
Applying criterion {'max_fail': 0, 'tguid': 'sched'}
Applying criterion {'max_fail': 0, 'tguid': 'dio'}
Applying criterion {'max_fail': 0, 'tguid': 'ipc'}
Applying criterion {'max_fail': 0, 'tguid': 'Functional.LTP'}
Writing run data to /fuego-rw/logs/Functional.LTP/min1.selectionwithrt.4.4/run.json
reference.json not available
}}}
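For reference, here is a small sketch of how a default max_fail=0 criterion ought to resolve a test_set status from its child testcase results. This is not Fuego's actual implementation, and the testcase dictionary below is invented purely to illustrate the expected behavior (only the sysctl01 failure is taken from the run above).

{{{
# Minimal sketch of applying a default max_fail criterion to a test_set.
# Not fuego-core code; the data layout and most statuses are illustrative.

def apply_max_fail(testcase_results, max_fail=0):
    """PASS only if the number of failing child testcases is within max_fail."""
    failures = sum(1 for status in testcase_results.values() if status == "FAIL")
    return "PASS" if failures <= max_fail else "FAIL"

# syscalls had at least one failing testcase (e.g. sysctl01), so with the
# default criterion it should be reported as FAIL, not PASS:
syscalls = {"fork13": "PASS", "getrandom02": "PASS", "sysctl01": "FAIL"}
print(apply_max_fail(syscalls))   # -> FAIL, not the PASS recorded in run.json
}}}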
= Things to check for =
- failures are easy to diagnose
  - no, it requires a ton of Fuego knowledge
- I can skip LTP tests that cause problems
- I can save results to establish a new baseline for benchmark thresholds
  - save as a criteria file?

= Ideas for fixes =
- Benchmark.Java should put the abort message into the run.json file
- for ftc gen-report column sloppiness:
  - do two passes over the data to get field lengths for the column widths
    - need to collect all the data first, and run over it twice
- Benchmark.bonnie should autodetect whether it is running as root, and use the correct arguments to the bonnie++ program
- should add a "duration_hint" to the spec.json file, so each spec can specify a hint (a small sketch of this calculation is at the bottom of this page)
  - the idea is that this number could be multiplied by a board variable (timeout_factor?) to yield an approximate timeout for the test with this spec. The number in the testplan would override this guess, but this would be better than nothing.

= Issues fixed =
 * ftc list-runs does not process the '!=' where operator
 * ftc list-runs help does not document the '!=' where operator
 * ftc gen-report help does not document the '!=' where operator
 * pass a smaller number to the fork13 command line for LTP time reduction
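Sketch for the duration_hint idea from "Ideas for fixes" above. The field names (duration_hint, timeout_factor) and the sample numbers are only the proposal from this page, not an existing Fuego interface:

{{{
# Rough sketch of the proposed duration_hint calculation.  Field names and
# values are hypothetical - nothing here exists in spec.json today.

def guess_timeout(spec, board, testplan_timeout=None):
    """Prefer an explicit testplan timeout; otherwise scale the spec hint by the board factor."""
    if testplan_timeout is not None:
        return testplan_timeout
    hint = spec.get("duration_hint")
    if hint is None:
        return None   # no hint: fall back to whatever the framework does today
    return hint * board.get("timeout_factor", 1)

spec = {"testName": "Benchmark.IOzone", "duration_hint": 120}   # seconds (hypothetical)
board = {"timeout_factor": 3}                                    # slow board (hypothetical)
print(guess_timeout(spec, board))   # -> 360, used as an approximate timeout in seconds
}}}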