{"id":336,"date":"2008-09-14T14:24:28","date_gmt":"2008-09-14T22:24:28","guid":{"rendered":"http:\/\/dougmccune.com\/blog\/2008\/09\/14\/followup-for-using-bitmapdata-for-array-manipulation-using-a-hashmap-is-a-little-faster\/"},"modified":"2008-09-14T18:59:21","modified_gmt":"2008-09-15T02:59:21","slug":"followup-for-using-bitmapdata-for-array-manipulation-using-a-hashmap-is-a-little-faster","status":"publish","type":"post","link":"https:\/\/dougmccune.com\/blog\/2008\/09\/14\/followup-for-using-bitmapdata-for-array-manipulation-using-a-hashmap-is-a-little-faster\/","title":{"rendered":"Followup for Using BitmapData for Array manipulation: using a hashmap is (a little) faster"},"content":{"rendered":"<div style=\"border:1px solid red; background-color:#efefef; padding:10px;\">\n<strong>UPDATE: <\/strong>The <a href=\"http:\/\/dougmccune.com\/flex\/bitmapArrayTests\/ArrayTests.html\">test app<\/a> has been updated to include a third test that uses a ByteArray approach. Looks like the ByteArray approach is the fastest, followed by the hash table, followed by the bitmap method. They&#8217;re all pretty close when you&#8217;re working with 10k or 100k records, but once you get into a million or more items then the ByteArray approach pulls ahead. Thanks to <a href=\"http:\/\/thebackbutton.com\/\">Alex Bustin<\/a> for sharing his code for the ByteArray test.\n<\/div>\n<p>After I <a href=\"https:\/\/dougmccune.com\/blog\/2008\/09\/13\/using-bitmapdata-for-array-manipulation-in-as3\">posted my technique<\/a> for using the Flash bitmap drawing methods to perform calculations on arrays, <a href=\"http:\/\/www.tom-carden.co.uk\/\">Tom Carden<\/a> started posting a few test of his own that tried to optimize some code-only algorithms. In his test he got an algorithm that used a hash table lookup method to be almost as fast as the bitmap method (but in his tests the hash table method was still slower than using bitmaps). I&#8217;ve created a slightly modified version of a hashmap algorithm that squeaks out just ahead of the bitmap test.<\/p>\n<p>The algorithm basically just uses a very simple Array that holds boolean values that represent whether or not a particular item exists in the filtered array. So this is just like using the bitmap approach, except instead of creating different colored pixels, you just set the proper index in the array to true or false. Then once you have those two array of true\/false values for both your filtered arrays, it&#8217;s just a matter of looping over the full array and checking the hash table arrays to see if a particular item exists in one, both, or neither of the filtered arrays.<\/p>\n<p>And it turns out this is faster (but not by that much) than the bitmap method. But if you&#8217;re looking for the fastest way to take two arrays and find the unique values in either or the intersection or union, using this hash table method is the fastest way I&#8217;ve found so far. Too bad the bitmap stuff didn&#8217;t win, I was rooting for it&#8230; oh well, so it goes.<\/p>\n<p>I&#8217;ve updated the <a href=\"http:\/\/dougmccune.com\/flex\/bitmapArrayTests\/ArrayTests.html\">test application<\/a> (<a href=\"http:\/\/dougmccune.com\/flex\/bitmapArrayTests\/srcview\/index.html\">source<\/a>) to include this new algorithm too, and here are the performance test results on my machine These stats were generated by taking the average of about 10 runs of each algorithm, running in the standalone Flash Player (which seems significantly faster than in any browser):<\/p>\n<table border=\"1\">\n<tr>\n<th># of items<\/th>\n<th>Hashmap test<\/th>\n<th>Bitmap test<\/th>\n<\/tr>\n<tr>\n<td>10 items<\/td>\n<td>3 ms<\/td>\n<td>4 ms<\/td>\n<\/tr>\n<tr>\n<td>100 items<\/td>\n<td>4 ms<\/td>\n<td>4 ms<\/td>\n<\/tr>\n<tr>\n<td>1,000 items<\/td>\n<td>5 ms<\/td>\n<td>6 ms<\/td>\n<\/tr>\n<tr>\n<td>10,000 items<\/td>\n<td>26 ms<\/td>\n<td>26 ms<\/td>\n<\/tr>\n<tr>\n<td>100,000 items<\/td>\n<td>225 ms<\/td>\n<td>237 ms<\/td>\n<\/tr>\n<tr>\n<td>1 million items<\/td>\n<td>2340 ms<\/td>\n<td>2418 ms<\/td>\n<\/tr>\n<tr>\n<td>2 million items<\/td>\n<td>4874 ms<\/td>\n<td>5705 ms<\/td>\n<\/tr>\n<tr>\n<td>5 million items<\/td>\n<td>13389 ms<\/td>\n<td>14220 ms<\/td>\n<\/tr>\n<\/table>\n<p>And here&#8217;s the full class that I created for testing this has table algorithm:<\/p>\n<pre><code>\r\npackage\r\n{\r\nimport flash.utils.Dictionary;\r\n\r\npublic class ArrayHashProcessor\r\n{\r\nprivate var fullArray:Array;\r\npublic function setFullArray(value:Array):void {\r\n\tfullArray = value;\r\n\treset();\r\n}\r\n\r\npublic function reset():void {\r\n\thashes = new Dictionary(true);\r\n}\r\n\r\nprivate var hashes:Dictionary;\r\n\r\npublic function filter(filterFunction:Function):Array {\r\n\tvar newHash:Array = [];\r\n\t\r\n\tvar filtered:Array = fullArray.filter( filterWithHash(filterFunction, newHash) );\r\n\thashes[filtered] = newHash;\r\n\t\r\n\treturn filtered;\r\n}\r\n\r\nprivate function filterWithHash(filterFunction:Function, hash:Object):Function\r\n{\r\n\treturn function(item:Object, index:int, array:Array):Boolean {\r\n\t\tvar b:Boolean = filterFunction(item, index, array);\r\n\t\thash[index] = b;\r\n\t\treturn b;\r\n\t};\r\n}\r\n\r\npublic function process(array1:Array, array2:Array):Array {\r\n\r\n    var unique1:Array = [];\r\n    var unique2:Array = [];\r\n    var intersection:Array = [];\r\n    var union:Array;\r\n\r\n\tvar hash1:Array = hashes[array1];\r\n\tvar hash2:Array = hashes[array2];\r\n\t\r\n    fullArray.forEach(function(item:Object, i:int, a:Array):void {\r\n        if (hash1[i]) {\r\n            if (hash2[i]) {\r\n                intersection.push(item);\r\n            }\r\n            else {\r\n                unique1.push(item);\r\n            }\r\n        }\r\n        else if (hash2[i]) {\r\n            unique2.push(item);\r\n        }\r\n    });\r\n                \r\n    union = unique1.concat(unique2).concat(intersection);\r\n    return [unique1, unique2, intersection, union];\r\n}        \r\n\r\n}\r\n}\r\n<\/code><\/pre>\n<p>There&#8217;s nothing real complicated here (and certainly nothing as visually interesting as using bitmaps). Basically you need to take advantage of using the filtering of the array as the place where you store the true\/false value in the hash table (using the bitmap method this was where we called setPixel() as well). Doing that extra step in the filter function doesn&#8217;t add very much time to the filter function (but sticking a boolean into an Array is faster than calling setPixel on a bitmap). Then when you want to compare the arrays we just loop over each item in the full array and check the two hash tables.<\/p>\n<p>So I&#8217;m a little sad to see my bitmap method gets surpassed by something so simple, but hey, that&#8217;s just how it works out sometimes. I suppose I should&#8217;ve tried this method first \ud83d\ude42 but then I wouldn&#8217;t have gotten to play with bitmaps.<\/p>\n<p>If you know of an even faster way to accomplish the same thing, let me know in the comments. I&#8217;m thinking about trying out using a ByteArray as an alternative to storing the hash lookups in a normal Array, but my gut tells me that using writeByte and readByte will be slower than simply setting array[index] = true, but who knows.<\/p>\n<p>Overall I&#8217;m happy I tried out <a href=\"https:\/\/dougmccune.com\/blog\/2008\/09\/13\/using-bitmapdata-for-array-manipulation-in-as3\">the bitmap approach<\/a> (even if I won&#8217;t be using it). I think it&#8217;s a pretty interesting way to approach the problem and it helps illustrate that sometimes thinking outside the box can have some pretty cool results.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>UPDATE: The test app has been updated to include a third test that uses a ByteArray approach. Looks like the ByteArray approach is the fastest, followed by the hash table, followed by the bitmap method. They&#8217;re all pretty close when you&#8217;re working with 10k or 100k records, but once you get into a million or [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[2],"tags":[7],"class_list":["post-336","post","type-post","status-publish","format-standard","hentry","category-flex","tag-actionscript"],"aioseo_notices":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/dougmccune.com\/blog\/wp-json\/wp\/v2\/posts\/336","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/dougmccune.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/dougmccune.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/dougmccune.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/dougmccune.com\/blog\/wp-json\/wp\/v2\/comments?post=336"}],"version-history":[{"count":0,"href":"https:\/\/dougmccune.com\/blog\/wp-json\/wp\/v2\/posts\/336\/revisions"}],"wp:attachment":[{"href":"https:\/\/dougmccune.com\/blog\/wp-json\/wp\/v2\/media?parent=336"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/dougmccune.com\/blog\/wp-json\/wp\/v2\/categories?post=336"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/dougmccune.com\/blog\/wp-json\/wp\/v2\/tags?post=336"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}