An important part of benchmarking is to draw graphs. A graph can reveal results you wouldn't have spotted just by looking at raw numbers. By the way, the process of massaging the raw numbers into graphs will often reveal things too.
Sysbench output tends to be quite wordy, especially when you have a script that runs 1, 2, 4, 8... threads with the same test. To manually copy paste the numbers into a spreadsheet is tiresome. So I came up with this monster shell one-liner to condense the output into a csv file. I'm posting it here so I will find it the next time I need it:
cat sysbench.log | egrep " cat|threads:|transactions|deadlocks|read/write|min:|avg:|max:|percentile:" | tr -d "\n" | sed 's/Number of threads: /\n/g' | sed 's/\[/\n/g' | sed 's/[A-Za-z\/]\{1,\}://g'| sed 's/ \.//g' | sed -e 's/read\/write//g' -e 's/approx\. 95//g' -e 's/per sec.)//g' -e 's/ms//g' -e 's/(//g' -e 's/^.*cat //g' | sed 's/ \{1,\}/\t/g' > sysbench.csv
Example input:
[sqlclust@esitbi126lab sysbench]$ cat mysql51
sysbench 0.5: multi-threaded system evaluation benchmark
Running the test with following options:
Number of threads: 2
Random number generator seed is 0 and will be ignored
Threads started!
OLTP test statistics:
queries performed:
read: 702226
write: 200636
other: 100318
total: 1003180
transactions: 50159 (334.38 per sec.)
deadlocks: 0 (0.00 per sec.)
read/write requests: 902862 (6018.90 per sec.)
other operations: 100318 (668.77 per sec.)
...
Example output:
mysql51,
2,50159,334.38,0,0.00,902862,6018.90,4.48,5.98,24.08,6.88
8,159546,1063.60,0,0.00,2871828,19144.76,5.25,7.52,24.88,9.74
16,243102,1620.59,0,0.00,4375836,29170.55,6.04,9.87,646.63,13.62
...
Explanation:
egrep " cat|threads:|transactions|deadlocks|read/write|min:|avg:|max:|percentile:"
List of words for the statistics I want to filter. "cat" is to find the line which is a delimiter between runs. In my case I produce one big logfile of sysbench output, where different runs are delimited with a shell prompt and commands "cat mysql51" or "cat mysql55". The filenames mysql51 and mysql55 are kept as descriptive names for the csv tables. If you collect statistics in some other format, you could replace cat with some other keyword.
tr -d "\n" | sed 's/Number of threads: /\n/g' | sed 's/\[/\n/g'
Delete all newlines. Add a newline in front of the number of threads. Add a newline in front of the shell command prompt. (The last regexp could really also be just "cat", I suppose.)
sed 's/[A-Za-z\/]\{1,\}://g'| sed 's/ \.//g'
Remove all text followed by colon, such as "deadlocks:". Remove a few stray dots that are not a decimal dot but just floating around amidst white space.
sed -e 's/read\/write//g' -e 's/approx\. 95//g' -e 's/per sec.)//g' -e 's/ms//g' -e 's/(//g' -e 's/^.*cat //g'
Remove more text not caught by the previous.
sed 's/ \{1,\}/\t/g'
Replace whitespace with a tab "\t". If you want your csv delimiter to be something else, use that character here instead of \t.
- Log in to post comments
- 28486 views
I had a little awk one-liner
I had a little awk one-liner somewhere that would do something similar (I think awk is easier to use than grep/sed for this). I don't like sysbench's output. Maybe we should file a bug report to improve it.
The other thing you always
The other thing you always need to do for sysbench is to write a script that will run it in a loop for an increasing number of threads. If you combine that and a pretty output formatter, I could see this in maatkit (or aspersa, depends on if you see sysbench as a Linux or MySQL tool...)
Thank you for sharing!
Thank you very much for for sharing this, I had a very very lengthy time compiling sysbench results for charting months ago. This will make the job alot easier :)
Of course as Baron said, having better or alternative output formats for sysbench would be great as well.
sysbench testing framework
Henrik, good one-liner there. In regard to your statement "The other thing you always need to do for sysbench is to write a script that will run it in a loop for an increasing number of threads." - this has already been taken care of and will even generate graphs from the sysbench output which you can then load as HTML (it generates CSV and the required HTML+JS to view the graphs). See the google code here: http://code.google.com/p/quadrant-framework/
There are a lot of features for loop control, including the ability to run os-system commands in between loops, control threads increments, change mysql dynamic variables in between loops, and you can save your tests as templates for later use.
This sounds like something I
This sounds like something I need to look at. Including drawing the graphs directly. Thanks for the tip!
Very nice
Thanks for sharing this oneliner. Just started investigating sysbench and was going to start working on a solution. Glad you and Google beat me to it.
Also note the quadrant
Also note the quadrant framework in previous comment. Didn't try it myself yet, but if it even draws the graphs for you, sounds like heaven!
Try this one.
Thanks, for inspiring, modified a bit as per my use case, try and let me know.
cat sysbench.log | egrep " cat|threads:|transactions|queries:|deadlocks|read:|write:|ignored errors:|reconnects:|total time:|min:|avg:|max:|percentile:"|awk -F':' '{print $2}'|sed 's/[a-zA-Z(]//g'|sed 's/.)//g'|xargs -n16|cut -d' ' -f1-8,11-|sed 's/ /,/g'|sed '1s/^/Threads,Reads,Writes,Transactions,TPS,Queries,QPS,Ignored Errors,Reconnects,Total Time,Min Lat,Avg Lat,Max La,95th Pct Lat\n/'