Art, Maps

Hurricane Coasters

I’ve been creating maps of hurricanes for the last few years. My job involves writing software to assess the impact of an approaching storm, so during an active hurricane season I become a bit obsessed. When you watch these things so closely and so constantly they get burned into your memory. They almost enter your subconscious. They become inkblots on a Rorschach test. Irene, Isaac, Sandy. I started seeing hurricanes everywhere.


I decided to create something from these shapes that I was so familiar with, these shapes that were burned into my head. They had become such a constant presence in my mind that I figured it was only fitting to make them a constant presence in my home as well, so I created a set of coasters.



Art, Data Visualization, Maps

Physical Maps – My 360|intersect Presentation

I recently had the privilege of presenting at 360|intersect in Seattle. My talk explored creating physical maps using various techniques like 3D printing, laser cutting, etc. This was a much more artistic exploration than a lot of my previous work, and I am incredibly proud of the pieces I produced. The full video of my presentation is embedded below. I’ll be documenting each of the projects I created in future blog posts, but for now here’s the 45-minute-long presentation:

And if you just want to see an overview of the various projects before watching the video, here’s a quick preview of all the projects that I talk about in the presentation:



Hackers and Depression: Inform Yourselves About CBT

My wife is a clinical psychologist. Over the past week we’ve had long discussions about Cognitive Behavioral Therapy (CBT), which is a certain type of therapy that is focused on using evidence-based methods (read: there have been studies showing effectiveness), with a particular emphasis on rational reasoning and pragmatic ways to tackle issues like depression and anxiety. The overlap with programming in terms of the way of thinking is astounding.

As a community, we rarely talk about mental illness. It takes high profile cases like Aaron Swartz’s suicide to get us to even bring up the subject, but more than likely we’ll revert back to our isolation and pretend like depression isn’t a serious issue in the tech world. We need to face depression, not sweep it under the rug.

If you or someone you care about is dealing with depression, please take a look at CBT. This article is a joint effort between a programmer (me) and a psychologist (my wife). I bet it’s the first article about therapy you’ve seen that uses code snippets to illustrate points.

CBT is for Hackers:

The tragedy to me is this: one of the most effective and scientifically-backed treatments for depression appears to be a stunning fit for hackers, and yet few people know about it. It’s called Cognitive Behavioral Therapy (CBT), and it has some of its origins in computer science.

Born out of the cognitive revolution of the 1950s, a key idea within cognitive psychology is that by studying successful functions in computer science, it becomes possible to make testable inferences about human psychological processes. Cognitive behavioral therapists mirror hackers in how they see the world and approach problems. They share the same core values: an emphasis on problem solving as efficiently and effectively as possible, using logic to debug a system, gathering data to test out what works and what doesn’t, and implementing transparent methods that others can understand and replicate as opposed to simply putting your faith in a “magic black box”. CBT and hackers are long lost kindred spirits, yearning to be reunited.

Read the full CBT is for Hackers article.

Data Visualization

Drug Side-Effect Warnings as Word Clouds

I’m always amazed at how happy the voice on TV commercials sounds when describing a litany of horrible-sounding side-effects from a prescription medication. I was just watching the nice cartoon lady telling me about Abilify when I overheard this:

<cue birds chirping> <nice music playing> Contact your doctor if you have uncontrollable muscle movements, as these could become permanent. Other risks include decreases in white blood cells, which can be serious, dizziness upon standing, seizures...

Errr, wait, what the fuck? Can we not have the nice guitar strumming along with a quaint melody in the background while you tell me that if I take this pill I might not be able to control my muscles and might start having seizures? Your soothing, monotonous tone is both putting me to sleep and freaking me out at the same time.

We’ve all heard these side-effect warnings in commercials, or seen them on packaging in the tiniest tiny print. It’s not uncommon to hear some soothing voice say something like “Side effects include headache, drowsiness, sore throat, and death.” Uhhhh. I’m ok with most of those, but one of these things is not like the other.

And yet the commercials or fine print don’t really tell you what’s likely to be a side effect versus what’s unlikely. Turns out, though, that if you do some research, the frequency of each side effect for various prescription pills is available online. You just have to dig. For example, here’s the product sheet for Zoloft. And it has a section about side effects that looks like this:

Now we’re getting some real numbers. If only there were a way to see the most common side effects of various drugs at a glance.

Here’s my take on redesigning the information presentation. We’ll start off with a fun one, which is the popular anti-depressant Zoloft:

Ain’t that a bitch? I guess the good news is if you’re nauseous, then ejaculation failure might not be that big a concern. The side-effects are sized by computing the difference in the percentages between the placebo group and the group taking the medication. In this case 14% of patients taking Zoloft experienced ejaculation failure, versus only 1% in the control group.
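The sizing rule is simple enough to sketch in a few lines of Python. This is only an illustration, not the script behind the images: the function name, the dictionary layout, and the choice to drop side effects that are no more common than placebo are all assumptions. The only real numbers here are the Zoloft figures quoted above.

```python
def side_effect_weights(drug_pct, placebo_pct):
    """Return the excess incidence (drug % minus placebo %) per side effect.

    Side effects that are no more common on the drug than on placebo are
    dropped, which is an assumption made for this sketch.
    """
    weights = {}
    for effect, pct in drug_pct.items():
        excess = pct - placebo_pct.get(effect, 0)
        if excess > 0:
            weights[effect] = excess
    return weights


# The one pair of figures quoted in the post: 14% on Zoloft vs 1% on placebo.
weights = side_effect_weights(
    {"ejaculation failure": 14},
    {"ejaculation failure": 1},
)
print(weights)  # {'ejaculation failure': 13}
```

The resulting numbers would then be handed to whatever word cloud tool you like as the word weights.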

Here’s another anti-depressant, Abilify (source data):

And now of course we have other drugs to counteract some of these side-effects, so why not try to counteract the negative Zoloft effects by popping a Viagra? Here are the new side effects you get to enjoy (source data):

And once that Viagra’s worn off you might be looking for a cigarette. But try Nicotrol (details) instead and you’ll get to take your chances with the following side effects:

Now at a glance you can see what you need to worry about and what you don’t. I imagine these beautiful labels on the side of the prescription boxes 🙂 Well, at least I can dream.


RIP Aaron Swartz

I’m in academic publishing. My grandparents founded a publishing company. My father ran it for a decade. I sit on the board of directors. You could say academic publishing is in my blood.

Today I am nauseous. Aaron Swartz is dead. I don’t know whether or not he would be alive today if he wasn’t prosecuted so aggressively for “stealing” academic journal articles. But what I do know is that this is a dark day in our history. It is a stain on the entire academic publishing industry.

I fiercely believe that as academic publishers we make the world a better place. We do good. I also believe there is a place for publishers in the Internet age. We’re working hard to figure out how to navigate these times. But everyone involved in this industry should be ashamed today.

We lost a genius. We lost a rebel.

I’m proud to be in publishing. But today I am nauseous. Today I am deeply sad. Today I am ashamed.


My Apache Flex Logo Contest Submissions

Now that Flex is being moved to the Apache Software Foundation, it’s time for a new logo. A logo contest is currently underway (ends today I think). Here are my two submissions. Each one has more detailed variations and explanations of the thought process if you view the full submission.

Logo 1

Main Themes: Cross-platform, progress, advancing forward, new beginnings

This logo is meant to combine the symbols of an arrow and an X. The arrow means “moving forward”, which has a number of connotations (moving forward with Apache, a fresh start for the project, advancing the state of the art in web/desktop/mobile development). The X means “cross platform”, which should be pretty self-explanatory to anyone who uses Flex. The combination of the two symbols means “Advancing cross-platform development.”

See the full treatment with explanation.

Logo 2

Main Themes: Stability, strength, enterprise

The second logo tries to capture the enterprise story. Flex is the foundation of many enterprise applications. It provides a core set of components and tools, on top of which we build stable, powerful, robust applications that drive real businesses. This logo has Flex as a strong base. Built on top we have a symbolic chart, but this symbol is also meant to represent a skyline of skyscrapers. Our apps power large enterprises and drive business. Flex is the foundation of enterprise development.

See the full treatment with explanation.

Data Visualization

What would you call this chart?

I’m working on a visualization of people logging into SpatialKey, and I’ve come up with the following table/chart.

(larger version)

Each row represents one customer, and each cell is one day. If the cell is blue that means the customer logged in that day. Otherwise the cell is gray (lighter gray for weekends to give some context to the timeline). So cells are either on or off, I’m not trying to show how much usage there was on a particular day, just that there was some usage.

The only other thing beyond the on/off blue/gray state is I’m trying to highlight customers who have stopped logging in. So if a customer has previously been logging in regularly and then they stop for a long time, I highlight the row in red to show how long it’s been since we last saw the customer.
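For anyone who wants to play with the same idea, here is a rough Python sketch of the data behind such a grid. It is not the d3.js code used for the chart: the function name, the state labels, and the 14-day cutoff for "stopped logging in" are all hypothetical, and the sketch does not check whether a customer was previously logging in regularly.

```python
from datetime import date, timedelta


def build_rows(logins, start, days, inactive_after=14):
    """Build one row of cell states per customer.

    logins maps a customer name to the set of dates with at least one login.
    inactive_after is a made-up threshold (in days) for flagging a row.
    """
    timeline = [start + timedelta(days=i) for i in range(days)]
    rows = {}
    for customer, login_days in logins.items():
        cells = []
        for day in timeline:
            if day in login_days:
                cells.append("login")    # drawn blue
            elif day.weekday() >= 5:
                cells.append("weekend")  # drawn lighter gray
            else:
                cells.append("weekday")  # drawn gray
        seen = [day for day in timeline if day in login_days]
        # Days between the last login in the window and the end of the window.
        idle_days = (timeline[-1] - seen[-1]).days if seen else None
        rows[customer] = {
            "cells": cells,
            "idle_days": idle_days,
            "flagged": idle_days is not None and idle_days >= inactive_after,
        }
    return rows


# Made-up customers and dates, purely to exercise the function.
rows = build_rows(
    {
        "acme": {date(2024, 1, 1), date(2024, 1, 2), date(2024, 1, 3)},
        "globex": {date(2024, 1, 19), date(2024, 1, 21)},
    },
    start=date(2024, 1, 1),
    days=21,
)
```

Each "cells" list maps directly onto one row of rectangles, and a flagged row is the one that would get the red highlight.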

What kind of chart is this? There’s got to be a name, but I don’t even know what to google to figure it out. I’ve created these before in Excel using conditional formatting of cell background color. This one is being created with d3.js using SVG (more posts on d3 will likely be coming).

The first thing I was reminded of was DNA sequencing using gel electrophoresis.

It also reminded me of the great Lite Brite that I used to play with as a kid.

So I’m currently going back and forth between “DNA Sequence chart” and “Lite Brite chart”, but there’s got to be a better term…

Art, Data Visualization, Maps

Printing Hurricanes as Gifts

We had a very busy month of August at SpatialKey as Hurricane Irene tore through the east coast. Our insurance customers were constantly watching Irene as it built up and approached land, then as it swept through parts of North Carolina, Vermont, New York, etc, and then as it died out quietly. We were writing software to visualize hurricane forecasts in real-time, as the storm was approaching, and getting immediate feedback from our customers. It was all a bit stressful, but exhilarating.

I wanted to have some kind of gift of thanks to give to our most helpful customers, who worked closely with us, helping us develop our hurricane product. To be honest, it felt like we all weathered a storm together during that hectic week in August. We brainstormed on sending out shirts, or bags, or some other standard corporate gear, but none of it really felt like “us”. So I came up with a more unique gift that I think captures our culture.

This is a 3D model of Hurricane Irene. The height of the model represents the wind speed at that location. You can see there are 3 bands of different wind speeds. The outer band represents where wind speeds hit 39 mph, the next band represents 58 mph, and the third band represents 74 mph (hurricane force winds). Then running through the middle we have the path of the eye of the storm, and the height of that track represents the exact speed at that point in time (Irene got up to 120 mph).
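The three bands are just thresholds on wind speed, which can be written down as a tiny lookup. This Python sketch is only an illustration (the model itself was built by hand), and the function and constant names are made up.

```python
from bisect import bisect_right

# Wind-speed thresholds (mph) for the three bands described above.
BAND_THRESHOLDS_MPH = [39, 58, 74]


def wind_band(speed_mph):
    """Return the highest band threshold the speed reaches, or None below 39 mph."""
    index = bisect_right(BAND_THRESHOLDS_MPH, speed_mph)
    return BAND_THRESHOLDS_MPH[index - 1] if index else None


print(wind_band(120))  # 74
print(wind_band(45))   # 39
print(wind_band(30))   # None
```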

I created the model by taking the GIS data straight from NOAA and using that to build up the 3D model by hand. Then I sent the 3D model off to Shapeways for printing. The printed version you see in the photos is made out of alumide, which is a nylon plastic blended with aluminum powder.

For our customers who were working with us while Irene passed through, we hope this will be a nice reminder of the work we did. It’s just a little paperweight to sit on your desk, but for those who were watching Irene as it developed and keeping a very close eye on the footprint of the storm, I think it’s a nice memento.

A hurricane can be a difficult concept to understand. For those affected in its path, it’s an incredibly tangible, visceral thing. But for those watching from afar (like me, sitting in California), it’s less “real”. We hear the overly-dramatic news reports and the doom-and-gloom predictions, but it’s a purely theoretical experience. Having a little paperweight of the storm on my desk doesn’t really help me understand the true impact Irene had on all those folks along the east coast, but at least I can touch it.

Data Visualization

Visualizing Time with the Infinity Hour Chart

This is another experiment in visualizing 24-hour cyclical data. My last post explored a method of linear representation (the Double-Time Bar Chart). Linear representations have problems when it comes to showing the cyclical nature of time data (i.e., there is no start or end of a 24-hour cycle).


When trying to think of visual representations of never-ending cycles I was inspired by the infinity symbol. It’s a great symbol to show a continuous cycle, while at the same time being more visually interesting than a simple circle (fun fact: the infinity symbol dates back to 1655). The other iconography that came to mind when thinking about infinity is the hour glass. An hour glass not only represents time, but it also looks similar to a vertical infinity symbol.

My thought was that maybe I could combine the two to create a vertical infinity symbol that evokes the metaphor of an hour glass.

Back of the napkin

My original sketch of this concept was done on the back of a napkin. This is the first sketch, which shows how I was originally working with a horizontal infinity symbol.

I experimented with a few different options for how to show the data using fills. One of the sketches (if turned vertically) looks like an hour glass filling up with water on the bottom, reminiscent of the Wikileaks logo.

Drawing Infinity

The mathematical name for the infinity symbol is lemniscate, and more specifically the lemniscate of Bernoulli. With some good Googling you can find algorithms to draw the lemniscate of Bernoulli, which is what I did.
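As a point of reference, here is a minimal Python sketch that samples the curve using the standard parametric form of the lemniscate of Bernoulli. It is not necessarily the algorithm used for these charts, and the function name and defaults are assumptions. Note that stepping the parameter evenly does not give segments of equal length along the curve.

```python
import math


def lemniscate_points(n=24, a=1.0):
    """Sample n points on the lemniscate of Bernoulli with half-width a.

    Uses the standard parametrization
        x = a*cos(t) / (1 + sin(t)**2)
        y = a*sin(t)*cos(t) / (1 + sin(t)**2)
    with t stepped evenly from 0 up to (but not including) 2*pi.
    """
    points = []
    for i in range(n):
        t = 2 * math.pi * i / n
        denom = 1 + math.sin(t) ** 2
        x = a * math.cos(t) / denom
        y = a * math.sin(t) * math.cos(t) / denom
        points.append((x, y))
    return points


points = lemniscate_points()
```

This draws the familiar horizontal figure eight; swapping x and y stands it upright.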

To start I divided the lemniscate into 24 segments, one for each of the hours of the day. My initial plot of the lemniscate in 24 parts looked like this:

I mapped the hours of the day onto this form, with 12pm noon at the very top and the infinity symbol crossing itself at 6pm/6am.

You follow the time by working your way around the infinity. If you start at the top of the symbol at noon, you would start moving around clockwise to 1pm, then 2pm, etc. You’ll reach the center at 6pm, at which point the symbol crosses itself and you then read it counter-clockwise around the bottom.

What you end up with is a way of dividing up the times of day into quadrants. The top-left quarter of the image is the morning, from 6am-12pm. Then the top-right is the afternoon, from 12pm-6pm. Then you have the evening in the bottom-left (6pm-midnight) and then late-night is in the bottom-right (12am-6am). These quarters match well with how I mentally categorize times of day.
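The hour layout described above can be reproduced with a short Python sketch. It assumes the standard parametric form of the lemniscate of Bernoulli turned on its end, with the parameter stepped evenly by hour. The function names are hypothetical and this is not necessarily how the charts here were drawn.

```python
import math


def hour_to_point(hour, a=1.0):
    """Place an hour (0-23, fractions allowed) on a vertical lemniscate.

    Noon sits at the top, midnight at the bottom, and 6am/6pm at the
    crossing. y points up, so flip its sign for screen coordinates like SVG.
    """
    t = (hour - 12) * math.pi / 12
    denom = 1 + math.sin(t) ** 2
    x = a * math.sin(t) * math.cos(t) / denom
    y = a * math.cos(t) / denom
    return x, y


def quadrant(hour):
    """Name the quarter of the chart that an hour falls in."""
    hour = hour % 24
    if 6 <= hour < 12:
        return "morning (top-left)"
    if 12 <= hour < 18:
        return "afternoon (top-right)"
    if 18 <= hour < 24:
        return "evening (bottom-left)"
    return "late night (bottom-right)"


for hour in (9, 15, 21, 3):
    print(hour, quadrant(hour), hour_to_point(hour))
```

With this orientation 3pm lands in the top-right and 9pm in the bottom-left, matching the reading order of clockwise around the top and counter-clockwise around the bottom.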

Because the form crosses over itself you can actually read the chart almost in a left-to-right way for both the day (top) and night (bottom).

Drawing Data

The next step is to try to use this form to represent real data. Here’s an example that shows the distribution of driving under the influence arrests in San Francisco:

We can see that this particular crime is primarily a night-time activity that surges around midnight and starts falling off after about 2am. I’ve colored the range of 6am-6pm in orange to show day-time and the range of 6pm-6am in blue to indicate night-time.

For comparison here’s apartment burglary, which is mostly a day-time activity:

Once the viewer understands how to read the chart we can remove the labels and simply show the pattern. Here’s a comparison of a few night-time crimes:

Here’s a comparison of different types of burglary, some of which occur mostly during the daytime (residential burglary) and some of which occur in the afternoon and late at night (burglary of a store).

Small Multiples

Here’s a final example of many different crime types represented side by side to try to see how this chart works for comparisons.


I’m not very happy with this chart in terms of the viewer’s ability to accurately read the chart. I also don’t think it highlights changes between hours enough. Often there are changes and trends that are easy to spot in the linear charts of my last post, but that are very difficult to see in these charts. Each hour is at quite a different angle than the hours on either side, which makes it difficult to compare two hours. You still get the big picture trends, like if a crime is a night-time or day-time crime, but the smaller trends are much harder to spot.

On the flip side, I really like the metaphors of the infinity symbol and the hourglass. On an artistic and philosophical level I think those metaphors make this a really beautiful visualization. Too bad it’s not also effective 🙂

Data Visualization

Visualizing Time with the Double-Time Bar Chart

In my last post I described some of the issues with visualizing cyclical data by hour of day and covered a few examples of different visualization methods that are typically used. This post is more of a visualization experiment.

The Context

To start with a little context, for my day job I create a software product called SpatialKey, which is a business intelligence/data visualization tool. We can visualize all sorts of data all sorts of ways, but one of the things we do is show you a histogram of the occurrence of your data by the hour of day. The chart looks something like this:

The Problem

That’s about as simple as you can get, with a single series of data displayed as a bar chart. The section on line charts in my previous post covered some of the problems with these visualizations. I have two issues with this chart:

  • the break in the data between 11pm and midnight
  • the difficulty understanding the context of the time

To summarize, the first problem has to do with being able to understand the trends that occur around midnight (where this chart breaks the data). In this example we can see that data in the evening peaks at 9pm and then declines, but it’s difficult to accurately assess that declining pattern because you have to try to follow the data as it ends on the right edge of the chart and then continues all the way over on the left edge. This is only problematic when something interesting is happening around midnight (or whenever you choose to have your chart begin/end).

The second point about context has to do with the fact that I don’t think about my days as starting at midnight and ending at 11:59pm. A more accurate representation of how I think of my days is that they start sometime when I wake up, usually around 7am, and they are broken up into “day-time” and “night-time”, and they end more or less when I go to sleep. Within “day-time” my day is broken up into other categorizations, like “working hours”, “afternoon”, “lunch-time”, etc. And depending on the data in question, these contextual relationships might be incredibly important. For this post I’ll be looking at crime data. When you’re investigating crime data, the contextual relationship to the time of day can be incredibly relevant. I don’t just want to know about when people are assaulted, I want to know the rate of assaults on the street when I’m going to be walking on the street (typically right after work on my way home, or later at night going out to dinner, bars, etc).

The simple bar chart doesn’t solve these problems well. It presents a hard break in the data, forcing the viewer to mentally connect the end of the chart with the beginning. And it also forces the viewer to think about the days in the context of midnight – 11pm, which is not the natural categorization system we have for the hours of the day.

The Double-Time Bar Chart

My first attempt to address some of these problems is something I’m tentatively calling the Double-Time Bar Chart. The goal is to put the time in context a bit more for the viewer, and to always show a relevant, continuous visualization of all times of the day.

The chart still uses simple bars in a linear chart. But the data is actually shown twice in the chart. The top part of the chart is the exact same histogram chart with 24 bars that we had before, going from midnight to 11pm. The bottom part is the same data (upside down), but it starts instead at noon and goes to 11am. It’s shifted by 12 hours compared to the top chart. Imagine taking the top chart, flipping it upside down, then shifting it over to the right by 12 bars.

There’s a single x-axis for both the top and bottom charts, which is labelled with the hours of the day. But the hours are either AM for the top chart or PM for the bottom chart.

The highlighted regions represent 6am-5pm on the top and 6pm-5am on the bottom. That means there are 24 highlighted bars, so the highlighted bars represent one unduplicated set of 24 hours of data. The highlight is used to draw attention to day-time and night-time activities. A very rough color categorization is used to color 6am-5pm in a lighter yellow, representing day-time, and 6pm-5am in a darker color, representing night. I realize this doesn’t match up with actual sunlight/darkness times in most cities, but I think the 6am-6pm time range is close enough to how many people think about “day” vs “night” that it works.
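The layout described above boils down to rotating the same 24 values by 12 hours. Here is a minimal Python sketch of that idea. It only prepares the data rather than drawing anything, and the function name and return shape are assumptions for illustration.

```python
def double_time_series(counts):
    """Split 24 hourly counts into the top and bottom series of the chart.

    counts[h] is the value for hour h (0 = midnight). The top series runs
    midnight-11pm; the bottom series is the same data rotated 12 hours so
    that it runs noon-11am.
    """
    if len(counts) != 24:
        raise ValueError("expected 24 hourly values")
    top = list(counts)
    bottom = list(counts[12:]) + list(counts[:12])
    # Columns 6-17 are highlighted in both halves: that is 6am-5pm on the
    # top and 6pm-5am on the bottom.
    highlighted = [6 <= column <= 17 for column in range(24)]
    return top, bottom, highlighted


# Use the hour number itself as a stand-in value so the layout is easy to check.
top, bottom, highlighted = double_time_series(list(range(24)))
day_hours = [h for h, on in zip(top, highlighted) if on]
night_hours = [h for h, on in zip(bottom, highlighted) if on]
print(day_hours)    # hours 6 through 17
print(night_hours)  # hours 18 through 23, then 0 through 5
```

A nice side effect of the 12-hour shift is that the highlighted bars occupy the same columns in both halves, and together they cover each hour exactly once.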

The duplicated (but shifted) data in the top and bottom allows me to see a continuous, unbroken series of data that can show day-time activity (top) or night-time activity (bottom). There is no hour of the day that forces me to read the chart to the end and then continue on by moving my attention back to the beginning. If I’m interested in the trends during the day (say around lunchtime, so 11am-1pm) then I can read the top chart. But if I’m interested in night-time activity (say 11pm-1am) then I can read the bottom chart. In both cases I get a continuous chart that shows the full context of all the data around the range in which I’m interested.

The highlighted regions serve to draw attention to daytime versus nighttime, but we still keep the rest of the 24 hours visible in each chart (the unhighlighted bars) so you can always get the full context of the data. This allows you to follow the data from 4pm-8pm without forcing your eyes to jump from the top to the bottom.


For these examples I’ll be visualizing crime data from the city of San Francisco. I’m using two full years of crime, 2009 and 2010. You can download the crime data yourself if you want to play with it.

One note about these charts: there are no y-axis labels and each chart is relative to itself. I was interested in exploring the problem of visualizing the hourly patterns, not necessarily being able to know exactly how many crimes occurred at a certain hour. The highest bar in each chart does not always mean the same value. It simply means that’s the hour with the most crimes for that particular crime type.
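To make the "relative to itself" point concrete, here is a small Python sketch of how hourly counts could be tallied and then scaled per chart. The function names are hypothetical, the sample timestamps are made up, and the real crime data may need different parsing.

```python
from collections import Counter
from datetime import datetime


def hourly_counts(timestamps):
    """Count events per hour of day (index 0 = midnight, 23 = 11pm)."""
    tally = Counter(ts.hour for ts in timestamps)
    return [tally.get(hour, 0) for hour in range(24)]


def relative_heights(counts):
    """Scale counts so the busiest hour has height 1.0."""
    peak = max(counts)
    if peak == 0:
        return [0.0] * len(counts)
    return [count / peak for count in counts]


# Made-up timestamps purely to exercise the functions; not real crime data.
events = [
    datetime(2010, 1, 1, 23, 15),
    datetime(2010, 1, 2, 23, 40),
    datetime(2010, 1, 2, 9, 5),
]
heights = relative_heights(hourly_counts(events))
```

Because each crime type is divided by its own peak, bar heights are only comparable within a chart, not across charts.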

Here’s an example of a crime that has an interesting day-time pattern, burglary. Notice the nice peak right when everyone leaves their homes unguarded as they go off to work.

And here’s a contrasting example of a crime that’s primarily a night-time activity, public intoxication.

Notice the nice nearly-linear build up all the way from about 9am up to the peak at midnight, then the dropoff after 2am (when the bars close in San Francisco).

There are a few crimes that are even more polarized. Arrests for driving under the influence have a nice distribution curve that peaks at midnight.

And prostitution is also primarily a night-time activity in San Francisco. There are two peaks, one just after work around 6-7pm, and then another a bit later in the evening at 11pm.

Small Multiples for Comparison

One way to compare different kinds of data is to use small multiples, which relies on small charts all laid out together to make it easy for your eyes to scan. These Double-Time charts work well in small multiples because you can quickly scan to see the difference between predominantly daytime crimes (large yellow areas in the top half) versus night-time crimes (blue areas in the bottom half). For instance, to get a better view of burglary, we can look at the sub-categorizations.

We can see that residential burglaries occur in the morning when people leave for work, whereas burglaries of a store are either late-afternoon or evening crimes.

The same approach can be used to compare many different types of crimes:

Or we can remove the x-axis and strip down the extra whitespace in the charts to get an even more compact view:

Summary/Revisiting the Goals

Now to circle back around to what I was trying to accomplish with this type of chart. There were two main goals: preserving the continuity of the data and putting the data into the context of your day.

To preserve continuity I’ve duplicated the data, which allows for a nice continuous linear chart that covers any important time range. If you’re interested in day-time trends you can look at the top chart. If you’re interested in night-time trends you can look at the bottom chart. But in either case you get a full, continuous range to put the trend in context.

To further put the data in context I’ve added some simple coloring to highlight the day-time vs night-time ranges. The x-axis labels (showing 6am, noon, 6pm, etc) give you some further context that helps you categorize the data. If you split the chart in quadrants you get rough categories for morning (top-left), afternoon (top-right), evening (bottom-left) and night (bottom-right).

The Big Caveat

This is just a simple design experiment. I’m making no claims about the efficacy of this chart. I have not run any studies to validate that this chart is clear to viewers or is any better (or worse) than any other visualization. I don’t even know if I myself think this is an effective chart. I’m just trying to spur a bit of a discussion and some experimentation around the problem of visualizing cyclical hourly data. So what do you think?

I’ll be posting another (even more experimental) take on this same topic shortly.