Benchmarking the Roster

In my previous post I mentioned degraded performance when using Treectrl's own tag mechanism instead of keeping an external array. Since the roster widget is the most time-critical one, perhaps next to the disco tree, I was worried about this.

The following benchmark tests were performed on my ancient PowerBook G4/667MHz using a simple script:


set t0 [clock clicks -milliseconds]
for {set i 0} {$i < 1000} {incr i} {
    ::RosterPlain::CreateItem test${i}@localhost available
}
puts "milliseconds: [expr {[clock clicks -milliseconds] - $t0}]"

For any roster style you can use ::RosterTree::StyleCreateItem instead of the RosterPlain call. I ran the script after logging on with just four users, so I could get rid of the fake users simply by logging out.

There are a lot of numbers here, so hold on. Three factors affect performance: the Treectrl version (2.1 or 2.2); whether tags are kept in Treectrl itself ("Treectrl tags") or in a separate array that maps tags to items ("array tags"); and whether the count "(#)" in the directory title is set for each item being added or batched as an idle command (called just once for 1000 users). Note that the script above adds 1000 users. This is not the complete story, since it leaves out XML parsing into the tree structure, presence handling, all the hooks that are run, and sounds and other alerts, but it should cover pretty much all the GUI parts.
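To make the batching idea concrete, here is a minimal sketch of coalescing the "(#)" update into a single idle callback. All names in the ::Sketch namespace are hypothetical and are not the roster's actual procedures; the title refresh is only counted here, not drawn.

```tcl
# Hypothetical sketch; these are not the real roster procedures.
namespace eval ::Sketch {
    variable pending 0     ;# 1 while a title refresh is already scheduled
    variable count 0       ;# number of items added so far
    variable refreshes 0   ;# how many times the title was refreshed
}

proc ::Sketch::AddItem {jid} {
    variable pending
    variable count
    incr count
    # Schedule at most one refresh, however many items arrive in a burst.
    if {!$pending} {
        set pending 1
        after idle ::Sketch::RefreshTitle
    }
}

proc ::Sketch::RefreshTitle {} {
    variable pending
    variable refreshes
    set pending 0
    incr refreshes
    # A real roster would rewrite the "(#)" text of the directory item here.
}

for {set i 0} {$i < 1000} {incr i} {
    ::Sketch::AddItem test${i}@localhost
}
update idletasks   ;# let the pending idle callback run
puts "items: $::Sketch::count, title refreshes: $::Sketch::refreshes"
```

With this pattern the 1000 additions should trigger a single title refresh instead of 1000, which is the effect described above.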

So I started out with the then-current state (0.95.17) with Treectrl 2.2, tags from Treectrl, and batched (#): 3.7 secs. Next I got the 0.95.16 sources, which have Treectrl 2.1, array tags, and no batching of (#): 24.3 secs! So I took 0.95.17 again and switched off the batched (#): 53.9 secs! Seems I'm tracking down an "issue" here. The next step was to use 0.95.17 again, add back the batched (#), and port the array tags from 0.95.16 while keeping the Treectrl tags as backup code: 3.1 secs.
The gain is not as big here as in my previous post, but I haven't separated out the difference between Treectrl 2.1 and 2.2. The latter needs much less memory, I've read, and it is common to have some trade-off between memory and CPU.
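For readers unfamiliar with the "array tags" approach, the following is a rough sketch of what an external tag-to-item map can look like in plain Tcl. The ::TagMap namespace and its procedures are hypothetical, and the item ids are plain integers standing in for whatever ids the tree widget hands out; the actual roster code may differ.

```tcl
# Hypothetical sketch of an external tag -> item map kept in a Tcl array.
namespace eval ::TagMap {
    variable tag2item          ;# array: tag -> list of item ids
    array set tag2item {}
}

# Record that the given item carries the given tag.
proc ::TagMap::Add {tag item} {
    variable tag2item
    lappend tag2item($tag) $item
}

# Return the items carrying the tag, or an empty list if there are none.
proc ::TagMap::Find {tag} {
    variable tag2item
    if {[info exists tag2item($tag)]} {
        return $tag2item($tag)
    }
    return {}
}

# Forget a tag, e.g. when its items are deleted on logout.
proc ::TagMap::Remove {tag} {
    variable tag2item
    unset -nocomplain tag2item($tag)
}

::TagMap::Add {jid test1@localhost} 17
::TagMap::Add {jid test1@localhost} 42
puts [::TagMap::Find {jid test1@localhost}]
```

The lookup is a single hash access on the array, which costs some extra memory for the map but avoids asking the widget to search its own tags.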

A bit of a mystery is that if I repeat the process, just logging out to clear the roster, logging in again, the numbers increase each time: 3.1, 4.1, 4.8 secs. I don't know why this happens.

In any case, I've reached a speedup of a factor of approximately 7.8 (from 24.3 secs with 0.95.16 down to 3.1 secs)!

It would be really interesting to see what happens in real situations with real users. XML parsing is a bottleneck I'm aware of.