Ruby 1.8 vs Ruby 1.9 - the slightly different benchmark
I think there are about one million trillion benchmarks out there that compare ruby 1.8 with ruby 1.9. And about any one of them has a different judgement on things. So I figured hey I’m going to give it a look too.
First of all, what I’m looking at? Nope not the speed of how things execute today, but what it execute. It has to do with some nifty tool I found called SytemTap it offers some utilities to trace what a application is doing. For example, on what I’m looking here, what system calls it performs. This can be quite nifty to see where bottlenecks are.
Before we go into some details lets first think about what that means for the results. They are completely independent from the environment, that is good, meaning a second process running along side won’t influence the results. Then again at times the meaning of them can be quite questionable, admitted I have no clue what about 90% of them even do. A third point is that even a process with no system calls at all can be slow as hell.
Non the less there are some things that, at least for me have a quite clear meaning. so lets take a look at some simple examples.
I took the bm_app_mandelbrot.rb file distributed with ruby 19, jus because well I liked it, I know it’s not the most valid example but it’s output is nice and short and the results had been the same with about any other test I ran.
Ruby 1.9:
sys_munmap calls: 4 avg time (us): 27 total(us): 109 sys_mprotect calls: 12 avg time (us): 21 total(us): 257 sys_mmap calls: 39 avg time (us): 7 total(us): 297 sys_rt_sigaction calls: 32 avg time (us): 5 total(us): 178 sys_rt_sigprocmask calls: 11 avg time (us): 5 total(us): 57 sys_read calls: 19 avg time (us): 36 total(us): 690 sys_rt_sigtimedwait calls: 1 avg time (us): 7 total(us): 7 sys_brk calls: 17 avg time (us): 6 total(us): 105 sys_open calls: 18 avg time (us): 15 total(us): 276 sys_getcwd calls: 1 avg time (us): 8 total(us): 8 sys_getdents64 calls: 2 avg time (us): 69 total(us): 139 sys_getrlimit calls: 6 avg time (us): 5 total(us): 32 sys_getpgrp calls: 1 avg time (us): 5 total(us): 5 sys_getppid calls: 1 avg time (us): 5 total(us): 5 sys_getpid calls: 1 avg time (us): 5 total(us): 5 sys_getegid calls: 3 avg time (us): 4 total(us): 14 sys_geteuid calls: 5 avg time (us): 5 total(us): 25 sys_getgid calls: 2 avg time (us): 5 total(us): 10 sys_getuid calls: 2 avg time (us): 5 total(us): 11 sys_lseek calls: 3 avg time (us): 5 total(us): 16 sys_ioctl calls: 5 avg time (us): 7 total(us): 38 sys_arch_prctl calls: 2 avg time (us): 5 total(us): 10 sys_newstat calls: 15 avg time (us): 46 total(us): 698 sys_newfstat calls: 15 avg time (us): 5 total(us): 80 sys_futex calls: 1 avg time (us):13408 total(us): 13408 sys_fcntl calls: 1 avg time (us): 5 total(us): 5 sys_faccessat calls: 13 avg time (us): 6 total(us): 86 sys_select calls: 5 avg time (us): 7 total(us): 36 sys_set_tid_address calls: 1 avg time (us): 5 total(us): 5 sys_clone calls: 1 avg time (us): 19 total(us): 19 sys_close calls: 21 avg time (us): 5 total(us): 125 sys_uname calls: 3 avg time (us): 5 total(us): 17
Lets have a look, there are a hell of a number of words that are hard to understand, but lets just focus on 2 things,
All calls are pretty much even, in the number of calls, okay even with a difference of 38 calls. It is not per se a good thing but once we look ahead we’ll see that, at least it’s not a bad thing.
The second interesting thing is that the call sys_futex took the most time, which makes sense as it is a method used to wait for a memory access so I’d say it’s nothing unusual.
Now it gets interesting, lets have a look at ruby 1.8:
sys_munmap calls: 6 avg time (us): 30 total(us): 182 sys_mprotect calls: 11 avg time (us): 20 total(us): 221 sys_mmap calls: 40 avg time (us): 7 total(us): 295 sys_rt_sigaction calls: 30 avg time (us): 5 total(us): 151 sys_rt_sigprocmask calls:2022790 avg time (us): 5 total(us):10252567 sys_read calls: 20 avg time (us): 11 total(us): 227 sys_brk calls: 22 avg time (us): 6 total(us): 148 sys_open calls: 21 avg time (us): 10 total(us): 222 sys_getrlimit calls: 2 avg time (us): 5 total(us): 10 sys_getpgrp calls: 1 avg time (us): 5 total(us): 5 sys_getppid calls: 1 avg time (us): 5 total(us): 5 sys_getpid calls: 1 avg time (us): 5 total(us): 5 sys_getegid calls: 3 avg time (us): 4 total(us): 14 sys_geteuid calls: 3 avg time (us): 5 total(us): 15 sys_getgid calls: 2 avg time (us): 5 total(us): 10 sys_getuid calls: 2 avg time (us): 5 total(us): 10 sys_lseek calls: 2 avg time (us): 5 total(us): 10 sys_arch_prctl calls: 2 avg time (us): 5 total(us): 11 sys_newstat calls: 21 avg time (us): 11 total(us): 249 sys_newfstat calls: 15 avg time (us): 5 total(us): 81 sys_faccessat calls: 13 avg time (us): 6 total(us): 87 sys_set_tid_address calls: 1 avg time (us): 5 total(us): 5 sys_close calls: 24 avg time (us): 5 total(us): 136 sys_uname calls: 3 avg time (us): 5 total(us): 16
Okay here sys_rt_sigprocmask is most interesting, alone cause it is called like 2 million times o.O I think we can agree that there is a difference to ruby 1.9 even without exactly knowing what the heck it is doing. So I did a bit of research cause that kind of jumped in my attention, it sets some kind of flags for the process.
Doesn’t sound so bad does it? Now you’ll love to hear this: about most of them do exactly the same, and when I understood it correctly the call should not do anything in that case.
A little quote:
rt_sigprocmask changes the list of currently blocked signals. The set value stores the signal mask of the pending signals. The previous ac- tion on the signal is saved in oact. The value of how indicates how the call should behave; its values are as follows: SIG_BLOCK The set of blocked signals is the union of the current set and the set argument.Now I modified the SystemTap script a bit to show me what kind of sys_rt_sigprocmask calls are made and it reveals that the majority of them use SIG_BLOCK and a null pointer for the new set. Now when I understand this call correct this is a completely useless call to remember the man page, ‘the set of blocked signals is a union’ and some basic math one might get the feeling ‘a union with an empty set is the set itself’
What does that say? Honesty in full extend that is pretty much open to interpretation. If you’d ask me, it says that there is no question a huge improvement in the code from 1.8 to 1.9.
If someone has a nice long script that runs out of the box and is not some kind of serve that will run forever, please let me know. I’m still looking for other tests on this.
Trackbacks
Use the following link to trackback from your own site:
http://blog.licenser.net/trackbacks?article_id=42
