Ethics and the use of DUI data

I do a lot of work with San Francisco crime data, and one of the things that I’ve been struggling with is one particular dataset: the locations of all the driving under the influence (DUI) arrests in the city. Just yesterday there was an article about US Senators asking Apple to remove DUI checkpoint applications from the app store.

San Francisco publishes a huge amount of crime data, going all the way back to 2003. You can grab a single CSV file with all the data. Over a million crimes. It’s beautiful.

If you look at just the DUI records, you start seeing patterns. Here are about a thousand DUIs from the past two years (2009-2010).

If we look at a density map, individual streets start lighting up. Specific intersections stand out.

Here’s a representation that assigns the number of DUIs to the street segment they occurred on and colors the data like a typical traffic map.
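For the curious, the per-segment coloring can be sketched in a few lines of Python. This is only an illustration, not how the maps above were made: the segment IDs, the sample records, and the color thresholds are all made up, and it assumes each arrest has already been matched to a street segment.

```python
from collections import Counter

# Hypothetical arrest records, each already matched to a street segment ID.
arrests = ["geary_01", "geary_01", "geary_01", "mission_16th", "mission_16th", "oak_03"]

def traffic_color(count, yellow_at=2, red_at=3):
    """Map a per-segment count to a traffic-map style color (thresholds are arbitrary)."""
    if count >= red_at:
        return "red"
    if count >= yellow_at:
        return "yellow"
    return "green"

# Count arrests per segment, then pick a color for each segment.
segment_counts = Counter(arrests)
segment_colors = {seg: traffic_color(n) for seg, n in segment_counts.items()}
print(segment_colors)
```

In a real map the thresholds would probably come from the distribution of counts (quantiles, say) rather than fixed numbers.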

And finally just for fun, here’s a 3D rendering of the same 2 years of data:

It’s compelling data, and it’s fairly easy to tell an interesting story with it. But is there an ethical issue around visualizing or using this data? There’s a lot that you can do with the data; visualizations like this are obviously just scratching the surface.

An idea that crosses the line

Following one train of thought to its logical conclusion leads me to a mobile app idea. It’s a simple app, essentially just a routing application. You type in where you’re going and you get directions from your current location, just like any other mapping or GPS routing application. Except we can give you directions that avoid known DUI hotspots.

In a very simplified sense, routing algorithms give each street a score, usually based on factors like speed limit, road size, and distance. The path with the lowest total score wins, and that’s what you end up getting for your directions. All you’d have to do to route around common DUI locations is make the number of historical DUIs along a street segment count in the routing algorithm’s calculation. Streets with lots of historical DUIs would be avoided in favor of side streets with fewer arrests. You’d avoid Geary Blvd and intersections like 16th St and Mission St.
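To make the "score" idea concrete, here is a minimal sketch of lowest-score routing on a toy graph. It uses Dijkstra's algorithm, which is my choice for the illustration and not necessarily what any real routing product does. The node names, segment lengths, speeds, incident counts, and the `incident_weight` parameter are all hypothetical.

```python
import heapq

def segment_score(length_m, speed_kmh, incidents, incident_weight=0.0):
    """Travel time in seconds plus a penalty for each historical incident."""
    travel_time = length_m / (speed_kmh * 1000 / 3600)
    return travel_time + incident_weight * incidents

def best_route(graph, start, goal, incident_weight=0.0):
    """Dijkstra over {node: [(neighbor, length_m, speed_kmh, incidents), ...]}.

    Returns (total_score, path), or None if the goal is unreachable.
    """
    queue = [(0.0, start, [start])]
    settled = set()
    while queue:
        score, node, path = heapq.heappop(queue)
        if node == goal:
            return score, path
        if node in settled:
            continue
        settled.add(node)
        for neighbor, length_m, speed_kmh, incidents in graph.get(node, []):
            if neighbor not in settled:
                step = segment_score(length_m, speed_kmh, incidents, incident_weight)
                heapq.heappush(queue, (score + step, neighbor, path + [neighbor]))
    return None

# Toy network: A -> B -> D is a fast main road with many incidents,
# A -> C -> D is a slower side street with almost none.
toy_graph = {
    "A": [("B", 500, 50, 12), ("C", 500, 30, 0)],
    "B": [("D", 500, 50, 9)],
    "C": [("D", 500, 30, 1)],
}

print(best_route(toy_graph, "A", "D")[1])                       # ['A', 'B', 'D']
print(best_route(toy_graph, "A", "D", incident_weight=5.0)[1])  # ['A', 'C', 'D']
```

With a weight of zero the faster main road wins. Once each incident adds a few seconds of penalty, the side street has the lower score, which is the whole trick.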

It’s an easy app and the data is there for the taking. I’ll leave aside the question of whether the idea would work in terms of being effective at making drunk drivers avoid actual arrest. For argument’s sake, let’s assume that it would work, or that some other similar type of app could. It’s not an app I’d build, and I assume pretty much everyone understands the moral objection.

I don’t have any big moral takeaway or conclusion. On the one hand there are arguments that data and knowledge can never inherently be bad. On the other hand there are arguments that this particular data (or at least specifically a DUI-avoiding directions app) would only be used to encourage drunk driving. I’m not going to make the DUI-avoiding mobile app; that goes way too far down the path of encouraging bad behavior. But it brings up a lot of interesting questions we need to think about as we’re working with data like this.

If San Francisco Crime were Elevation

I’ve been playing with different ways of representing data (see my previous night lights example) and I decided to venture into 3D representations. I’ve used a full year of crime data for San Francisco from 2009 to create these maps. The full dataset can be downloaded from the city’s DataSF website.

A view from above

This view shows different types of crime in San Francisco viewed directly from above. The sun is shining from the east, as it would during sunrise.


I love how some of the features in these maps are pretty consistent across all the crime types, like the mountain ridge along Mission St., and how some of the features only crop up in one or two of the maps. The most unique map by far is the one for prostitution (more on that further down).

An alternate view

Here’s the same data but from a different angle, which helps show some of the differences.

UPDATE: Whoops, I screwed up originally and had a duplicate image. The original graphic showed the same map for Vandalism and Assault (both were the Vandalism map). This updated graphic has the correct map for Assault.


Many of the maps have peaks in the Tenderloin, which is that high area in the northeast-central part of the city. Some are extremely concentrated (narcotics) and some are far more spread out (vehicle theft).

My favorite map is the one for prostitution (maybe “favorite” is the wrong choice of words there). Nearly all the arrests for prostitution in San Francisco occur along what I’m calling the “Mission Mountain Ridge”, which runs up Mission St between 24th and 16th.

EDIT: I’ve been corrected. Upon closer inspection the prostitution arrests are peaking on Shotwell St. at the intersections of 19th and 17th. I’m sure the number of colorful euphemisms you can come up with that include the words “shot” and “well” is endless.

I love the way the mountain range casts a shadow over much of the city. There’s also a second peak in the Tenderloin (which I’m dubbing Mt. Loin).


Drug crimes are also interesting to look at, since so much of the drug activity in San Francisco is centered in a few distinct areas. We can see Mt. Loin rising high above all the other small peaks. The second highest peak is the 16th St. BART peak.


There are other consistent features in these maps, in addition to Mt. Loin and the Mission Range. There’s a valley that separates the peaks in the Mission and the peaks in the Tenderloin, which is where the freeway runs (Valley 101). You’ll also notice a division in many of the maps that separates the southeast corner. That’s the Hunter’s Point Riverbed (aka the 280 freeway).


These maps were generated from real data, but please don’t take them as being accurate. The data was aggregated geographically and artistically rendered. This is meant more as an art piece than an informative visualization.
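If you want a feel for what "aggregated geographically" can mean, one simple approach is to count points per grid cell and treat the counts as heights. This is a rough sketch of that idea, not the process behind the renderings above. The function names, bounding box, grid size, and sample points are all made up.

```python
def bin_to_grid(points, bounds, rows, cols):
    """Count (lon, lat) points in each cell of a rows x cols grid covering bounds."""
    min_lon, min_lat, max_lon, max_lat = bounds
    grid = [[0] * cols for _ in range(rows)]
    for lon, lat in points:
        if not (min_lon <= lon <= max_lon and min_lat <= lat <= max_lat):
            continue  # skip points outside the bounding box
        # Points exactly on the max edge are clamped into the last cell.
        col = min(int((lon - min_lon) / (max_lon - min_lon) * cols), cols - 1)
        row = min(int((lat - min_lat) / (max_lat - min_lat) * rows), rows - 1)
        grid[row][col] += 1
    return grid

def normalize(grid):
    """Scale counts to the range 0..1 so the busiest cell becomes the tallest peak."""
    peak = max(max(row) for row in grid)
    return [[cell / peak if peak else 0.0 for cell in row] for row in grid]

# Made-up points on a 2 x 2 grid; the last point falls outside the bounds.
sample_points = [(0.5, 0.5), (0.6, 0.4), (1.5, 1.5), (5.0, 5.0)]
counts = bin_to_grid(sample_points, bounds=(0.0, 0.0, 2.0, 2.0), rows=2, cols=2)
heights = normalize(counts)
print(counts)   # [[2, 0], [0, 1]]
print(heights)  # [[1.0, 0.0], [0.0, 0.5]]
```

A grid like `heights` could then be handed to any 3D tool as an elevation map. Smoothing and lighting are where the "artistically rendered" part comes in.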

Data Visualized as City Lights at Night

Images courtesy of the Image Science & Analysis Laboratory, NASA Johnson Space Center

As I was flying back home into San Francisco airport I was watching the city lights out the window and got struck by a bit of inspiration. I find cities beautiful, from the graffiti to the neon signs to the line of headlights on the highway. A city viewed from above at night is captivating. I wanted to try to recreate that same look, but by visualizing data (in one sense you can say that the real view of a city from above is already a visualization of population data).

I started searching for images of cities at night, and found these amazing images from NASA. All those images were taken from a space shuttle orbiting the earth. These images tell you a lot about the city, the layout, urban density, planning (or lack thereof). I wanted to take other meaningful data and create similar images.

All the visualizations below have been created with SpatialKey. However, this is some experimental work I’ve been playing with to generate the “night light” images, so it’s not released (and might not ever be). Basically this is a peek behind some of the R&D work I do for fun (yes, for a dataviz dork like me making fake “cities at night” images is my idea of fun).

Crime in San Francisco

This image is all crime in San Francisco for a 3-month period. You can see some of the same features that you can see in the NASA space image, such as Golden Gate Park and the Presidio (the area on the north-west edge of the city). All in all it’s interesting how similar the crime image looks compared to the NASA image. Downtown is the brightest spot in both images, which means that it’s literally the brightest area of the city (the most streetlights), and also has the most crime.


And here are breakdowns for a few different crime types. Notice how different the distributions are. Narcotics crimes are heavily clustered and can be found downtown (in the Tenderloin), in the Mission (near the 16th St BART station), and along Haight Street near Golden Gate Park, whereas vehicle theft is scattered fairly evenly throughout the city.

Graffiti Reports in San Francisco and New York

Both San Francisco and New York publish their 311 data, which records citizens’ calls for city services. One category of 311 calls is to report graffiti. Graffiti is interesting in that it often follows specific city streets. When we look at the graffiti data for both cities we see specific streets that have far more graffiti than others. I love these images (particularly the one of SF) because they really look like a view of street lights from a plane.


Trees planted in San Francisco

Another one of my favorites of this set is data for all the trees that the city of San Francisco has planted since 1990 (all this SF data is available at You can see the heavy planting along Market St (which cuts diagonally through downtown), as well as along streets like Sunset Blvd (the street running north/south on the western side of the city).


Street lights (or SF as a giant Lite-Brite)

One final image of San Francisco we have is the locations of every street light in the city. I liked this image because it reminded me of playing with a Lite-Brite when I was a kid. It almost makes city planning feel like a grown-up version of playing with little plastic lights.


New SpatialKey Crime Example for San Francisco

We just posted a new example of using SpatialKey to visualize crime in San Francisco. We load in 90 days of crime data from the city, then filter down to only include sales of heroin, crack cocaine, and methamphetamine within 1,000 feet of a school. Why those particular crimes around schools? The SFPD just launched a new initiative called “Operation Safe Schools” that specifically targets these drug crimes. If you’re caught dealing crack, heroin, or meth around a school while the school is in session you can get extra prison time.
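The filtering step can be approximated outside SpatialKey too. Here is a small sketch of one way to do it, using the haversine formula for distance. It is not how SpatialKey does it internally; the record fields (`description`, `lat`, `lon`), the description strings, and the coordinates are invented for the example.

```python
import math

FEET_PER_METER = 3.28084
EARTH_RADIUS_M = 6371000

def distance_feet(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance between two lat/lon points, in feet."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a)) * FEET_PER_METER

TARGET_DRUGS = ("heroin", "crack", "methamphetamine")

def is_target_sale(crime):
    """True if the (hypothetical) description mentions a sale of a targeted drug."""
    desc = crime["description"].lower()
    return "sale" in desc and any(drug in desc for drug in TARGET_DRUGS)

def near_any_school(crime, schools, max_feet=1000):
    """True if the crime is within max_feet of at least one school."""
    return any(
        distance_feet(crime["lat"], crime["lon"], s["lat"], s["lon"]) <= max_feet
        for s in schools
    )

# Made-up sample data.
schools = [{"lat": 37.7600, "lon": -122.4200}]
crimes = [
    {"description": "SALE OF HEROIN", "lat": 37.7605, "lon": -122.4200},
    {"description": "SALE OF CRACK COCAINE", "lat": 37.7700, "lon": -122.4200},
    {"description": "VEHICLE THEFT", "lat": 37.7601, "lon": -122.4200},
]

matches = [c for c in crimes if is_target_sale(c) and near_any_school(c, schools)]
print([c["description"] for c in matches])  # ['SALE OF HEROIN']
```

Checking every crime against every school is fine for 90 days of data. A spatial index would be the next step for anything much bigger.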

Read the full article on the SpatialKey blog to see how we put this together and learn more about the SFPD’s “Operation Safe Schools.” You can also watch the full-resolution video on YouTube.