Unified Results Format Project
It is desirable to produce test results in a unified format, so that they can be easily compared and exported to other formats (like HTML, XML, Excel, and PDF).
For this we should use the JSON format. It is simple to produce, especially from Python, which is one of the core languages in Fuego.
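For example, a test script that has collected its results into a Python dictionary can emit them with just the standard library. A minimal sketch (the field names and output path here are illustrative only, not a proposed schema):

import json

# Hypothetical results gathered by a test; the field names are
# illustrative only, not a proposed Fuego schema.
results = {
    "test_name": "Functional.example",
    "result": "PASS",
    "duration": 42,
}

# Writing JSON from Python is a one-liner, which makes it easy to
# convert or export the data later.
with open("output.json", "w") as f:
    json.dump(results, f, indent=4)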
Existing work
Benchmark parser.py
Benchmark parser.py already produces results in the file res.json.
AGL create_xml...
The AGL-JTA group produced unified output for their tests, using a per-test script named create_xml_<test_name>.py.
Functional parser.py (prototype)
Tim created a prototype script (based on this), called 'parser.py', to produce fres.json instead. Problems encountered:
- Some global values were parsed from the AGL console log, that are not in the Fuego log
- items:
- BOARD_VER
- DEVICE_TEST
- DEVICE_FS
- report -> for test_dir and command_line
- the current patterns expect 'set -x'
- they should work whether 'set -x' is used or not (see the regex sketch after this list)
- it should be pretty easy to add these items to Fuego console log output
- There was a lot of duplication.
- it would be better to use a parser library (like the benchmark parser does)
- and use many smaller scripts, instead of duplicating the whole script everywhere
- can we just copy the model of benchmark parser.py?
- actually, that one could also be simplified.
- there are parts that could be in common that are not.
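For the 'set -x' issue above: when tracing is on, the shell echoes each command with a '+ ' prefix, so making that prefix optional lets one pattern serve both cases. A minimal sketch (the 'curl' command line is just an illustration):

import re

# With 'set -x', the shell echoes commands with a '+ ' prefix
# (nested shells add more '+' characters); without it, the command
# appears bare. An optional prefix group matches both.
pattern = re.compile(r"^(?:\++ )?(curl\s.*)$")

for line in ("+ curl -o /dev/null http://example.com/",
             "curl -o /dev/null http://example.com/"):
    m = pattern.match(line)
    if m:
        print("matched command: %s" % m.group(1))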
logrun files for JTA reports
This was a JSON file created when a batch program was run. It collected all the logrun entries for the sub-tests run during a batch job, and then produced a single PDF report (converting the JSON to XML, then TeX, then PDF). See:
Current schema
run.json
Located in the log directory. Created by the Fuego core scripts in post_test. Here is an example:
{ "built_on": "docker", "cause": "unknown", "description": "unknown", "device": "docker", "duration": 4331, "files": [ "devlog.txt", "syslog.before.txt", "syslog.after.txt", "testlog.txt", "consolelog.txt" ], "host": "", "keep_log": "true", "num": "2", "reboot": "false", "rebuild": "true", "result": "SUCCESS", "start_time": 1491954848129, "target": "docker", "target_precleanup": "true", "test_name": "Benchmark.Whetstone", "testplan": "", "timestamp": "2017-04-11_23-54-08", "workspace": "/fuego-rw/buildzone" }
res.json
Located in the log directory. Created by Benchmark parser.py. res.json has the following schema:
- a single dictionary with a set of metric names and values.
Here is an example for Benchmark.ffsb (more complicated than most):
{ "Main.append_fsync.TPercent": "1.393", "Main.create_fsync.TPercent": "22.452", "Main.delete.OpWeight": "9.302", "Main.create_fsync.Transactions": "15477", "Syscall_latency.write.Min": "0.000000", "Syscall_latency.open.Max": "55.584999", "Main.open_close.Transactions": "91", "Main.create_fsync.OpWeight": "9.725", "Main.stat.TPS": "8.11", "Main.delete.TPercent": "0.128", "Main.open_close.TPS": "8.03", "Syscall_latency.close.Avg": "0.002518", "Main.delete.TPS": "7.76", "Main.append.TPS": "89.08", "Main.append.Transactions": "1010", "Main.writeall.OpWeight": "10.888", "Main.writeall.Transactions": "13750", "Syscall_latency.read.Avg": "0.001175", "Main.create.OpWeight": "10.782", "Syscall_latency.open.Min": "0.002000", "Syscall_latency.unlink.Min": "0.016000", "Syscall_latency.write.Avg": "0.002091", "Main.writeall_fsync.Transactions": "13976", "Main.stat.TPercent": "0.133", "Main.stat.OpWeight": "9.725", "Main.stat.Transactions": "92", "Main.readall.OpWeight": "9.619", "Syscall_latency.unlink.Max": "3.539000", "Main.writeall.TPS": "1212.78", "Main.append.TPercent": "1.465", "Syscall_latency.read.Max": "0.440000", "Main.create.TPercent": "25.646", "Main.create.TPS": "1559.32", "Main.create.Transactions": "17679", "Syscall_latency.read.Min": "0.000000", "Main.readall.Transactions": "5812", "Syscall_latency.write.Max": "10.133000", "Throughput.Read": "2000.00", "Syscall_latency.unlink.Avg": "0.188614", "Main.writeall_fsync.OpWeight": "9.514", "Main.open_close.OpWeight": "9.619", "Syscall_latency.stat.Avg": "0.005761", "Throughput.Write": "21700.00", "Syscall_latency.open.Avg": "0.172316", "Main.writeall_fsync.TPS": "1232.71", "Syscall_latency.close.Max": "0.014000", "Main.append.OpWeight": "10.677", "Syscall_latency.stat.Max": "0.019000", "Syscall_latency.close.Min": "0.000000", "Main.readall.TPercent": "8.431", "Main.readall.TPS": "512.63", "Syscall_latency.stat.Min": "0.002000", "Main.writeall_fsync.TPercent": "20.274", "Main.delete.Transactions": "88", "Main.writeall.TPercent": "19.946", "Main.create_fsync.TPS": "1365.10", "Main.append_fsync.OpWeight": "10.148", "Main.open_close.TPercent": "0.132", "Main.append_fsync.Transactions": "960", "Main.append_fsync.TPS": "84.67" }
fres.json
Located in the log directory. Produced by Functional parser.py. Currently the only one that exists is for Functional.curl, as a quick prototype based on create_xml_curl.py.
Schema looks like this:
- report
- name
- starttime
- end
- result
- board_version (missing)
- device (missing)
- filesystem (missing)
- test_dir (missing)
- command_line (missing)
- items = list of:
- name
- result: [PASS|FAIL]
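No example exists yet on this page; a hypothetical instance matching the schema above (with the missing fields omitted, and all values invented) might look like:

{
    "report": {
        "name": "Functional.curl",
        "starttime": "2017-04-11_23-54-08",
        "end": "2017-04-11_23-55-10",
        "result": "PASS",
        "items": [
            { "name": "curl_001", "result": "PASS" },
            { "name": "curl_002", "result": "PASS" }
        ]
    }
}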
test_result.xml
Located in the build directory.
- report
- starttime
- endtime
- result
- board_version
- device
- filesystem
- test_dir
- command_line
- items = list of:
- name
- result: [PASS|FAIL]
To Do list
- add AGL items to Fuego log
- add parser.py to all tests
- decide whether to use the same name or different one?
- if they produce similar output, keep the same name
- add AGL items to res.json output, to match fres.json?
- refactor common.py and dataload.py to share common functions better
- convert programs that use test_result.xml to use test_result.json? (a conversion sketch appears after this list)
- compare our formats with other formats in the industry
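For the test_result.xml conversion item, something along these lines could work. The tag names follow the field list in the test_result.xml section above, but the exact XML nesting is an assumption:

import json
import xml.etree.ElementTree as ET

def xml_report_to_json(xml_path, json_path):
    # Assumed layout: a <report> root whose children are the scalar
    # fields (starttime, endtime, result, ...) plus <item> elements,
    # each holding <name> and <result>.
    report = ET.parse(xml_path).getroot()

    out = {child.tag: child.text for child in report if child.tag != "item"}
    out["items"] = [
        {"name": item.findtext("name"), "result": item.findtext("result")}
        for item in report.iter("item")
    ]

    with open(json_path, "w") as f:
        json.dump({"report": out}, f, indent=4)

xml_report_to_json("test_result.xml", "test_result.json")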
Desired reports
What report does AGL produce now with their XML? What format is it in?
Meeting notes - 2017-04-13
Here are meeting notes:
- Song has already refactored the functional results scripts, and produces JSON
- these are for individual test results
- Daniel is already working on a results.json file that replaces plot.data.
- you can specify to ignore a field in the results
- Tim is going on vacation for a week
- Song's code should be available next week for review
- We agreed to continue work in our respective areas, and try to coordinate the code development on the mailing list
- We talked a little bit about the additions to ftc to produce reports
- Daniel plans to add a button to flot to produce an HTML report for a view of the data.
- Can add support to ftc to produce PDF as well
Tim's view:
- Song's code is useful for parsing the individual functional test results
- Each functional test needs to specify the patterns to parse from the test program logs
- this is good for producing the results for a single run
- Daniel's code is useful for collating the data from multiple runs into a single file (for plotting and reports)
results.json
Proposal by Daniel. We can reuse it for Functional tests.
In the case of a functional test, we can use "groupname" to define a group of tests, and the "test" property to define the specific test. The "data" array would contain the values 0 (passed) or 1 (failed), and the "ref" array would contain the expected result (0 or 1). The rest is just metadata, such as timestamps, the board name, the platform (I have to fix this), the spec used, and the firmware version (i.e. the kernel version).
{ "docker-nopread-4.4.0-70-generic-2048_Kb_Record_Write-Write":{ "groupname":"2048_Kb_Record_Write", "platform":"fake", "board":"docker", "timestamp":[ "2017-04-10_02-38-22", "2017-04-12_06-22-29", "2017-04-12_06-32-26", "2017-04-12_06-33-28", "2017-04-12_06-34-14", "2017-04-12_07-59-13", "2017-04-13_02-36-36", "2017-04-13_03-01-20", "2017-04-13_03-01-43", "2017-04-13_03-02-27" ], "test":"Write", "ref":[ "1", "1", "1", "1", "1", "1", "1", "1", "1", "1" ], "data":[ "114339", "106576", "178903", "162865", "108242", "233584", "196771", "128195", "104828", "140435" ], "spec":"nopread", "fwver":"4.4.0-70-generic" }, "zynq-nopread-4.4.0-xilinx-gdfb97bc-2048_Kb_Record_Write-Write":{ "groupname":"2048_Kb_Record_Write", "platform":"fake", "board":"zynq", "timestamp":[ "2017-04-10_02-46-03" ], "test":"Write", "ref":[ "1" ], "data":[ "21635" ], "spec":"nopread", "fwver":"4.4.0-xilinx-gdfb97bc" }, "zynq-nopread-4.4.0-xilinx-gdfb97bc-2048_Kb_Record_Write-ReWrite":{ "groupname":"2048_Kb_Record_Write", "platform":"fake", "board":"zynq", "timestamp":[ "2017-04-10_02-46-03" ], "test":"ReWrite", "ref":[ "1" ], "data":[ "44629" ], "spec":"nopread", "fwver":"4.4.0-xilinx-gdfb97bc" }, "docker-nopread-4.4.0-70-generic-2048_Kb_Record_Write-ReWrite":{ "groupname":"2048_Kb_Record_Write", "platform":"fake", "board":"docker", "timestamp":[ "2017-04-10_02-38-22", "2017-04-12_06-22-29", "2017-04-12_06-32-26", "2017-04-12_06-33-28", "2017-04-12_06-34-14", "2017-04-12_07-59-13", "2017-04-13_02-36-36", "2017-04-13_03-01-20", "2017-04-13_03-01-43", "2017-04-13_03-02-27" ], "test":"ReWrite", "ref":[ "1", "1", "1", "1", "1", "1", "1", "1", "1", "1" ], "data":[ "313702", "246395", "419790", "669913", "311879", "269188", "228319", "299025", "241953", "235479" ], "spec":"nopread", "fwver":"4.4.0-70-generic" } }