Topamax Online Buy Prozac Aldactone Online Buy Toprol XL Stromectol Online Buy Amoxil Glucotrol Online Buy Stromectol Clarinex Online Buy Nexium
UPDATE: The test app has been updated to include two more tests: one for a hash table based approach, and one that uses a ByteArray approach. Looks like the ByteArray approach is the fastest, followed by the hash table, followed by the bitmap method. They’re all pretty close when you’re working with 10k or 100k records, but once you get into a million or more items then the ByteArray approach pulls ahead. Thanks to Alex Bustin for sharing his code for the ByteArray test.

Recently I was thinking about how to do some complex Array manipulation, and how to do it fast. I needed to take a large source Array and filter it two different ways, producing two new filtered arrays, each of which is made up of a subset of the values in the source array. Then I wanted to know which values of the first filtered array were unique to that array (did not appear in filtered array #2) and vice versa. I came up with a technique that uses the bitmap manipulation of the Flash player to accomplish these goals.

An example of the problem:
Let’s say I have a source array that looks like this:

Source Array
Item 1
Item 2
Item 3
Item 4
Item 5

Then I filter that array to produce a new array, Filtered Array #1, that contains some but not all of the items. I do the same thing again, except this time using a different filter that produces Filtered Array #2. Now my data looks like this:

Source Array Filtered Array #1 Filtered Array #2
Item 1 Item 2 Item 1
Item 2 Item 3 Item 2
Item 3 Item 5 Item 3
Item 4 Item 4
Item 5

I want to know the unique values in Filtered Array #1 that do not appear in #2 (Item 5) and the unique values in Filtered Array #2 that do not appear in #1 (Item 1, Item 4). An added bonus would be some extra stuff like being able to know the union of both arrays (items that appear in either array 1 or 2) or the intersection (items that only appear in both filtered arrays).

The slow-ass way
So the first thing that comes to mind when trying to solve this is that you can simply do a for loop over the source array and check if each item appears in Filtered Array #1 and Filtered Array #2. This means you have one full loop over every item in the data set, and then you need to do a check for whether the item exists in each of the filtered arrays. The most basic way would be something like this:

var unique1:Array = [];
var unique2:Array = [];

for(var i:int=0; i<sourceArray.length; i++) {
	var item:Object = sourceArray[i];
	var inArray1:Boolean = array1.indexOf(item) != -1;
	var inArray2:Boolean = array2.indexOf(item) != -1;

	if(inArray1 && !inArray2)
		unique1.push(item);
	else if(inArray2 && !inArray1)
		unique2.push(item);
}

So that’s the easiest way and it certainly works. I’m sure there are countless ways to do this in a more optimized way, like not looping over the source array but instead looping over the first filtered array, somehow marking all the items in the second filtered array if they were found in the first filtered array, then looping over the remaining items in the second filtered array. I’m sure someone will tell me that I should use a binary tree search since I know the source index, or that I should use a hashtable to lookup the existence of items faster. I bet all those different approaches would beat this simple example speed-wise, but I was still worried no matter how optimized an approach like this can get, it would still suck. (Side note: If you have an alternative approach that will handle many items with really fast performance, especially if you have an AS3 implementation, I’d love to hear about it.) So anyway, I decided I wanted to try something different.

Using bitmaps
Check this out. Say you have a source array of 10,000 items. Then you filter that down to a filtered array that might have a subset of about 4,000 of those items. The relationship of the items in the filtered array to the source array can be drawn using a bitmap like this:
screenshot024.png

This image is a 100 pixel by 100 pixel square, and each pixel within that square represents whether or not the item in the source array appears in the filtered array. So if the pixel at 0,0 is black that means that the first item in the source array is also in the filtered array. If a pixel is red, that means that that item does not appear in the filtered array.

Why did I choose to make a square image? I originally just did a single line 1px high. But Flash has a bitmap limitation of a max width or height of 2,880 pixels. So a single line means that you would be limited to arrays with only 2,880 items, and that’s not the kind of problem I was trying to solve. The performance need arose from having to do this with a shitload of items. So if we use a square instead (easy to calculate the dimensions, just take the square root of the # of items), then we have a theoretical maximum number of items of 2,880 x 2,880 = about 8.3 million. I’m not going to try to filter more than 8 million items client-side, so I’m ok with that limitation.

OK, cool. Now we do the same thing for the second filtered array. Below is the generated image of both filtered arrays side by side. The red image represents Filtered Array #1 and the green represents Filtered Array #2:
screenshot026.png

But how do you create these images in an optimized way? Since the arrays are already being filtered, that means I’m using Array.filter() with a callback function to decide whether or not the item passes the filter test. This filter function is going to be run on every item in the array. So I use that function to also do my bitmap drawing. And check this out, the callback for a filterFunction tells you the index of the item in the array! So that gives you all the information you need to draw these images. You know that the function will get called one time for each item, and it tells you the index. So just do a simple BitmapData.setPixelAt() and once your filtering is complete your image has been generated. Dope.

So now we start with the real bitmap magic. We’re going to use the compare() method of the BitmapData class to generate a third image that represents a pixel by pixel comparison of these two images. To use the compare() method you call it on one BitmapData and pass in a second BitmapData. The function will return a new BitmapData object that contains pixel comparisons. If the pixel in bitmap 1 is the same as the pixel in bitmap 2 then the new comparison bitmap will have a transparent pixel (0×00000000). If they are not the same, then it will return the “difference pixel”, which contains the difference between each channel (RGB). So what happens when we run compare() on these two images? We get a new image something like this:
screenshot027.png

There are only a few possible pixel values now, and each unique pixel value tells us something special. So by looping over the pixels we can know whether the item occurs in both arrays, one array or the other, or neither array. And just like that, we got our answer.

Try out the demo application I created here. And you can view the full source here. Be forgiving, the source code is dirty as hell… no comments, no nothin’… I basically just cranked this out and decided to put it out there without taking the time to clean it up because a) I think it’s a pretty cool approach and b) it’s dinner time and I’m hungry.

Performance
So what about some benchmarks? I’ve put together a little demo app that you can play with to try out this method. It lets you run two algorithms, one that is a very simple for loop that loops over each item and calls array1.indexOf(item) and array2.indexOf(item) to see if the item is present in both filtered arrays. Then you can also run the bitmap test that I’ve described. Here are the results that I had on my machine when running both techniques on various numbers of items:

# of items For loop text Bitmap test
10 items 4 ms 4 ms
100 items 4 ms 4 ms
1,000 items 19 ms 7 ms
10,000 items 1300 ms 32 ms
100,000 items timeout 350 ms
1 million items timeout 3700 ms

These aren’t meant to be scientific results. There are some pretty big factors that might influence speed. For the tests I have a filter function that just chooses at random (with 50% probability) which items to include in the two filtered arrays. If you change the number of items in either array the speeds will fluctuate. But the basic idea is that doing a for loop and checking for an item using indexOf fails miserably when you start getting over the tens of thousands of items. The bitmap method works even on a million records, albeit slowly at 3-4 seconds, but at least it works without timing out or crashing the browser.

And like I said before, I’m sure there are other algorithms that would be faster than looping over every item in the full array and checking using indexOf. If you have some implementations that solve the same problem but with better performance, please let me know in the comments.

So what?
I don’t really know if I’m going to actually use this technique anywhere, but I might. I’m also thinking about performing other calculations on large datasets using similar methods. Essentially if you have two or more datasets, you can use this graphical technique to perform all kinda of other calculations. I’m only using the compare() function to test for a few very distinct possibilities, but you could imagine doing something much more detailed, like performing calculations based on the aggregate color difference of multiple array images. The bitmap manipulation in Flash is so damn fast, that it opens a whole new door if you try to think about using those methods for number crunching and not just pretty graphics.

P.S. I know it’s been a long time since I did a blog post with code. I think this one geeks out to a pretty extreme level, so for anyone reading who was getting fed up with my non-technical ramblings, I hope this makes up for it.

del.icio.us:Using BitmapData for Array manipulation in AS3 digg:Using BitmapData for Array manipulation in AS3 spurl:Using BitmapData for Array manipulation in AS3 wists:Using BitmapData for Array manipulation in AS3 simpy:Using BitmapData for Array manipulation in AS3 newsvine:Using BitmapData for Array manipulation in AS3 blinklist:Using BitmapData for Array manipulation in AS3 furl:Using BitmapData for Array manipulation in AS3 reddit:Using BitmapData for Array manipulation in AS3 fark:Using BitmapData for Array manipulation in AS3 blogmarks:Using BitmapData for Array manipulation in AS3 Y!:Using BitmapData for Array manipulation in AS3 smarking:Using BitmapData for Array manipulation in AS3 magnolia:Using BitmapData for Array manipulation in AS3 segnalo:Using BitmapData for Array manipulation in AS3 gifttagging:Using BitmapData for Array manipulation in AS3
del.icio.us:SpatialKey featured on the cover of ComputerWorld digg:SpatialKey featured on the cover of ComputerWorld spurl:SpatialKey featured on the cover of ComputerWorld wists:SpatialKey featured on the cover of ComputerWorld simpy:SpatialKey featured on the cover of ComputerWorld newsvine:SpatialKey featured on the cover of ComputerWorld blinklist:SpatialKey featured on the cover of ComputerWorld furl:SpatialKey featured on the cover of ComputerWorld reddit:SpatialKey featured on the cover of ComputerWorld fark:SpatialKey featured on the cover of ComputerWorld blogmarks:SpatialKey featured on the cover of ComputerWorld Y!:SpatialKey featured on the cover of ComputerWorld smarking:SpatialKey featured on the cover of ComputerWorld magnolia:SpatialKey featured on the cover of ComputerWorld segnalo:SpatialKey featured on the cover of ComputerWorld gifttagging:SpatialKey featured on the cover of ComputerWorld
del.icio.us:Code samples from Adobe Flex 3.0 For Dummies digg:Code samples from Adobe Flex 3.0 For Dummies spurl:Code samples from Adobe Flex 3.0 For Dummies wists:Code samples from Adobe Flex 3.0 For Dummies simpy:Code samples from Adobe Flex 3.0 For Dummies newsvine:Code samples from Adobe Flex 3.0 For Dummies blinklist:Code samples from Adobe Flex 3.0 For Dummies furl:Code samples from Adobe Flex 3.0 For Dummies reddit:Code samples from Adobe Flex 3.0 For Dummies fark:Code samples from Adobe Flex 3.0 For Dummies blogmarks:Code samples from Adobe Flex 3.0 For Dummies Y!:Code samples from Adobe Flex 3.0 For Dummies smarking:Code samples from Adobe Flex 3.0 For Dummies magnolia:Code samples from Adobe Flex 3.0 For Dummies segnalo:Code samples from Adobe Flex 3.0 For Dummies gifttagging:Code samples from Adobe Flex 3.0 For Dummies
del.icio.us:Flex spotted on TV digg:Flex spotted on TV spurl:Flex spotted on TV wists:Flex spotted on TV simpy:Flex spotted on TV newsvine:Flex spotted on TV blinklist:Flex spotted on TV furl:Flex spotted on TV reddit:Flex spotted on TV fark:Flex spotted on TV blogmarks:Flex spotted on TV Y!:Flex spotted on TV smarking:Flex spotted on TV magnolia:Flex spotted on TV segnalo:Flex spotted on TV gifttagging:Flex spotted on TV
del.icio.us:Announcing SpatialKey - Geographic Information Without Limits digg:Announcing SpatialKey - Geographic Information Without Limits spurl:Announcing SpatialKey - Geographic Information Without Limits wists:Announcing SpatialKey - Geographic Information Without Limits simpy:Announcing SpatialKey - Geographic Information Without Limits newsvine:Announcing SpatialKey - Geographic Information Without Limits blinklist:Announcing SpatialKey - Geographic Information Without Limits furl:Announcing SpatialKey - Geographic Information Without Limits reddit:Announcing SpatialKey - Geographic Information Without Limits fark:Announcing SpatialKey - Geographic Information Without Limits blogmarks:Announcing SpatialKey - Geographic Information Without Limits Y!:Announcing SpatialKey - Geographic Information Without Limits smarking:Announcing SpatialKey - Geographic Information Without Limits magnolia:Announcing SpatialKey - Geographic Information Without Limits segnalo:Announcing SpatialKey - Geographic Information Without Limits gifttagging:Announcing SpatialKey - Geographic Information Without Limits
del.icio.us:Flex for Dummies arrived! w00t!! digg:Flex for Dummies arrived! w00t!! spurl:Flex for Dummies arrived! w00t!! wists:Flex for Dummies arrived! w00t!! simpy:Flex for Dummies arrived! w00t!! newsvine:Flex for Dummies arrived! w00t!! blinklist:Flex for Dummies arrived! w00t!! furl:Flex for Dummies arrived! w00t!! reddit:Flex for Dummies arrived! w00t!! fark:Flex for Dummies arrived! w00t!! blogmarks:Flex for Dummies arrived! w00t!! Y!:Flex for Dummies arrived! w00t!! smarking:Flex for Dummies arrived! w00t!! magnolia:Flex for Dummies arrived! w00t!! segnalo:Flex for Dummies arrived! w00t!! gifttagging:Flex for Dummies arrived! w00t!!
del.icio.us:I just bought Flex 3 For Dummies! digg:I just bought Flex 3 For Dummies! spurl:I just bought Flex 3 For Dummies! wists:I just bought Flex 3 For Dummies! simpy:I just bought Flex 3 For Dummies! newsvine:I just bought Flex 3 For Dummies! blinklist:I just bought Flex 3 For Dummies! furl:I just bought Flex 3 For Dummies! reddit:I just bought Flex 3 For Dummies! fark:I just bought Flex 3 For Dummies! blogmarks:I just bought Flex 3 For Dummies! Y!:I just bought Flex 3 For Dummies! smarking:I just bought Flex 3 For Dummies! magnolia:I just bought Flex 3 For Dummies! segnalo:I just bought Flex 3 For Dummies! gifttagging:I just bought Flex 3 For Dummies!
del.icio.us:Why did I stop blogging? digg:Why did I stop blogging? spurl:Why did I stop blogging? wists:Why did I stop blogging? simpy:Why did I stop blogging? newsvine:Why did I stop blogging? blinklist:Why did I stop blogging? furl:Why did I stop blogging? reddit:Why did I stop blogging? fark:Why did I stop blogging? blogmarks:Why did I stop blogging? Y!:Why did I stop blogging? smarking:Why did I stop blogging? magnolia:Why did I stop blogging? segnalo:Why did I stop blogging? gifttagging:Why did I stop blogging?
del.icio.us:Description of my Flash on the Beach session: Decompiling Flex and Flash digg:Description of my Flash on the Beach session: Decompiling Flex and Flash spurl:Description of my Flash on the Beach session: Decompiling Flex and Flash wists:Description of my Flash on the Beach session: Decompiling Flex and Flash simpy:Description of my Flash on the Beach session: Decompiling Flex and Flash newsvine:Description of my Flash on the Beach session: Decompiling Flex and Flash blinklist:Description of my Flash on the Beach session: Decompiling Flex and Flash furl:Description of my Flash on the Beach session: Decompiling Flex and Flash reddit:Description of my Flash on the Beach session: Decompiling Flex and Flash fark:Description of my Flash on the Beach session: Decompiling Flex and Flash blogmarks:Description of my Flash on the Beach session: Decompiling Flex and Flash Y!:Description of my Flash on the Beach session: Decompiling Flex and Flash smarking:Description of my Flash on the Beach session: Decompiling Flex and Flash magnolia:Description of my Flash on the Beach session: Decompiling Flex and Flash segnalo:Description of my Flash on the Beach session: Decompiling Flex and Flash gifttagging:Description of my Flash on the Beach session: Decompiling Flex and Flash
del.icio.us:How Adobe's special search indexing Flash Player works digg:How Adobe's special search indexing Flash Player works spurl:How Adobe's special search indexing Flash Player works wists:How Adobe's special search indexing Flash Player works simpy:How Adobe's special search indexing Flash Player works newsvine:How Adobe's special search indexing Flash Player works blinklist:How Adobe's special search indexing Flash Player works furl:How Adobe's special search indexing Flash Player works reddit:How Adobe's special search indexing Flash Player works fark:How Adobe's special search indexing Flash Player works blogmarks:How Adobe's special search indexing Flash Player works Y!:How Adobe's special search indexing Flash Player works smarking:How Adobe's special search indexing Flash Player works magnolia:How Adobe's special search indexing Flash Player works segnalo:How Adobe's special search indexing Flash Player works gifttagging:How Adobe's special search indexing Flash Player works