After I posted my technique for using the Flash bitmap drawing methods to perform calculations on arrays, Tom Carden started posting a few test of his own that tried to optimize some code-only algorithms. In his test he got an algorithm that used a hash table lookup method to be almost as fast as the bitmap method (but in his tests the hash table method was still slower than using bitmaps). I’ve created a slightly modified version of a hashmap algorithm that squeaks out just ahead of the bitmap test.
The algorithm basically just uses a very simple Array that holds boolean values that represent whether or not a particular item exists in the filtered array. So this is just like using the bitmap approach, except instead of creating different colored pixels, you just set the proper index in the array to true or false. Then once you have those two array of true/false values for both your filtered arrays, it’s just a matter of looping over the full array and checking the hash table arrays to see if a particular item exists in one, both, or neither of the filtered arrays.
And it turns out this is faster (but not by that much) than the bitmap method. But if you’re looking for the fastest way to take two arrays and find the unique values in either or the intersection or union, using this hash table method is the fastest way I’ve found so far. Too bad the bitmap stuff didn’t win, I was rooting for it… oh well, so it goes.
I’ve updated the test application (source) to include this new algorithm too, and here are the performance test results on my machine These stats were generated by taking the average of about 10 runs of each algorithm, running in the standalone Flash Player (which seems significantly faster than in any browser):
# of items | Hashmap test | Bitmap test |
---|---|---|
10 items | 3 ms | 4 ms |
100 items | 4 ms | 4 ms |
1,000 items | 5 ms | 6 ms |
10,000 items | 26 ms | 26 ms |
100,000 items | 225 ms | 237 ms |
1 million items | 2340 ms | 2418 ms |
2 million items | 4874 ms | 5705 ms |
5 million items | 13389 ms | 14220 ms |
And here’s the full class that I created for testing this has table algorithm:
package
{
import flash.utils.Dictionary;
public class ArrayHashProcessor
{
private var fullArray:Array;
public function setFullArray(value:Array):void {
fullArray = value;
reset();
}
public function reset():void {
hashes = new Dictionary(true);
}
private var hashes:Dictionary;
public function filter(filterFunction:Function):Array {
var newHash:Array = [];
var filtered:Array = fullArray.filter( filterWithHash(filterFunction, newHash) );
hashes[filtered] = newHash;
return filtered;
}
private function filterWithHash(filterFunction:Function, hash:Object):Function
{
return function(item:Object, index:int, array:Array):Boolean {
var b:Boolean = filterFunction(item, index, array);
hash[index] = b;
return b;
};
}
public function process(array1:Array, array2:Array):Array {
var unique1:Array = [];
var unique2:Array = [];
var intersection:Array = [];
var union:Array;
var hash1:Array = hashes[array1];
var hash2:Array = hashes[array2];
fullArray.forEach(function(item:Object, i:int, a:Array):void {
if (hash1[i]) {
if (hash2[i]) {
intersection.push(item);
}
else {
unique1.push(item);
}
}
else if (hash2[i]) {
unique2.push(item);
}
});
union = unique1.concat(unique2).concat(intersection);
return [unique1, unique2, intersection, union];
}
}
}
There’s nothing real complicated here (and certainly nothing as visually interesting as using bitmaps). Basically you need to take advantage of using the filtering of the array as the place where you store the true/false value in the hash table (using the bitmap method this was where we called setPixel() as well). Doing that extra step in the filter function doesn’t add very much time to the filter function (but sticking a boolean into an Array is faster than calling setPixel on a bitmap). Then when you want to compare the arrays we just loop over each item in the full array and check the two hash tables.
So I’m a little sad to see my bitmap method gets surpassed by something so simple, but hey, that’s just how it works out sometimes. I suppose I should’ve tried this method first 🙂 but then I wouldn’t have gotten to play with bitmaps.
If you know of an even faster way to accomplish the same thing, let me know in the comments. I’m thinking about trying out using a ByteArray as an alternative to storing the hash lookups in a normal Array, but my gut tells me that using writeByte and readByte will be slower than simply setting array[index] = true, but who knows.
Overall I’m happy I tried out the bitmap approach (even if I won’t be using it). I think it’s a pretty interesting way to approach the problem and it helps illustrate that sometimes thinking outside the box can have some pretty cool results.
Interesting couple of articles.
I’d be keen to see how the new vector class performs against the array. it’s the same only its restricted to one datatype so it’s supposed to be more efficient.
Presumably this is one of the reasons bytearray and bitmapdata are quicker.
Don’t give up on the bitmaps, there are some very low-level operations there that can’t be quickly emulated any other way.
Here’s an example using three different filters to emulate cellular automata:
http://www.5etdemi.com/blog/archives/2005/12/flash-8-experiment-the-game-of-life-and-cellular-automata-using-convolutionfilter/
Hey Doug,
This is really interesting. It begs me to use the cliché thinking outside the box. I’d love it if you chimed in on benchmarking on this post : http://dispatchevent.org/mims/discussion-how-best-to-benchmark-flash/