Data can often tell you far more about people than you originally think. In my previous post I presented some of the data from the history of the FlexCoders mailing list. I showed some of the details of the individual usage patterns for different people. One of those people was the Flex product manager, Matt Chotin. Matt’s involvement with FlexCoders is pretty interesting if you start to dig into the data. In this post I’ll try to identify some changing trends in his usage patterns and we’ll see if we can do some detective work to figure out why his behavior changed.
A little background: Matt has been involved in Flex since basically forever. He was an engineer at Macromedia and is now the product manager for Flex. Matt has been quite prolific on flexcoders over the years (in the overall ranking he’s #3). So to start I was interested in his overall post volume on the list. Take a look at the timeline showing his posts per month and you’ll notice there’s a distinct drop-off:
Here’s a closeup of the period around that drop:
See that big drop from April to May of 2006? Well, in May Matt changed jobs to become the product manager of Flex. On his blog he noted:
So if you notice the number of flexcoders posts going down it’s because my brain will be slowly atrophying as I move away from the details of our vast offering.
And that’s exactly what happened.
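If you want to reproduce a posts-per-month timeline like this yourself, here is a minimal sketch in Python. It assumes you have already parsed each message's send date into a `datetime` (I'm not assuming any particular CSV column layout here), and the function names and sample dates are made up for illustration, not taken from the real data.

```python
from collections import Counter
from datetime import datetime


def posts_per_month(timestamps):
    """Count posts per (year, month) from an iterable of datetimes."""
    return Counter((ts.year, ts.month) for ts in timestamps)


def biggest_drop(monthly):
    """Return the (previous, current) month pair with the largest fall in posts.

    Only months that have at least one post appear in the counter, so
    "consecutive" here means consecutive months that have data.
    """
    months = sorted(monthly)
    pairs = zip(months, months[1:])
    return max(pairs, key=lambda pair: monthly[pair[0]] - monthly[pair[1]])


# Made-up sample data, not the real FlexCoders numbers.
sample = (
    [datetime(2006, 3, day) for day in range(1, 11)]    # 10 posts in March
    + [datetime(2006, 4, day) for day in range(1, 13)]  # 12 posts in April
    + [datetime(2006, 5, day) for day in range(1, 4)]   # 3 posts in May
)

monthly = posts_per_month(sample)
drop = biggest_drop(monthly)
print(drop)
```

With the sample above, the largest fall is between April and May 2006, which is the kind of cliff the timeline shows.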
Seeing the correlation between a change in professional life and a drop in activity is cool, but we can dig deeper. The data doesn’t just tell us when Matt’s posting volume changed over the years; it also tells us something about his daily routine and how that changed as well. I started looking at when (as in what time of day) Matt was posting to the list.
Here’s a chart that shows the distribution of posts by hour of day and day of week. Each post is bucketed by the day of the week and the hour of the day it was sent.
So you can see that Matt posted the most on weekday mornings (around 9-11am on Monday-Friday) and weekday evenings (around 8-10pm Monday-Thursday, note that he rarely posts on Friday nights).
This pattern is actually very similar to Alex Harui’s activity, although Alex’s activity is weighted more toward work hours than toward evenings (except for Sunday night!).
I found the evening hotspots interesting (in both Matt’s and Alex’s cases). Clearly Matt was answering a lot of people’s questions from home, after work hours.
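The day-and-hour grouping behind a chart like that can be sketched in a few lines of Python. This is only an illustration under some assumptions: the timestamps are already `datetime` objects in the poster's local time zone (mailing list dates often aren't, so you would need to convert first), and the helper names and sample posts are hypothetical.

```python
from collections import Counter
from datetime import datetime

DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def day_hour_counts(timestamps):
    """Count posts in each (weekday, hour) cell; weekday 0 is Monday."""
    return Counter((ts.weekday(), ts.hour) for ts in timestamps)


def hotspots(counts, top=3):
    """Return the busiest cells as (day name, hour, post count) tuples."""
    return [(DAYS[day], hour, n) for (day, hour), n in counts.most_common(top)]


# Made-up sample posts, assumed to be in the poster's local time.
sample = [
    datetime(2006, 4, 3, 9, 15),    # Monday, 9am
    datetime(2006, 4, 3, 21, 5),    # Monday, 9pm
    datetime(2006, 4, 10, 21, 40),  # the following Monday, 9pm
    datetime(2006, 4, 5, 10, 0),    # Wednesday, 10am
]

counts = day_hour_counts(sample)
print(hotspots(counts, top=1))
```

Each cell of the chart is just one of these counts, so the hotspots are the cells with the largest values.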
I dug a bit further into Matt’s trends. Here’s the graph of his activity by hour of day for 2005:
We can see that in 2005 he actually answered more questions in the evening than in the morning. Taking a look at 2006, this became even more pronounced: almost all his activity was at night (I wasn’t the only one who noticed this, see Ryan Stewart’s post about Matt posting at 9pm):
And then there was a change in 2007. The graph for 2007 shows that he started answering more questions during the workday. That shift continued into 2008 and 2009, by which time almost all of Matt’s activity was during work hours.
If you dig even deeper into the data, you find that the transition from mainly evening activity to workday activity happened mostly between April and June 2007. After about July 2007 Matt posts almost exclusively during the day. Taking a look at the release history of Flex, we see that the beta of Flex 3 came out in June 2007. So my guess is that Matt changed to a management role in May of 2006 but had far too much work to do getting Flex 3 ready and out the door between then and June 2007, meaning his devotion to flexcoders had to be relegated to the evening hours. Finally, once the Flex 3 beta was out the door, he could devote some actual work hours to being involved in the community, instead of having to do it all from home.
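One rough way to spot a transition like this is to compute, month by month, what share of posts were sent during the workday. The sketch below assumes a workday of 9am to 6pm, Monday to Friday, and local-time `datetime` values. Those cutoffs, the function name, and the sample data are my own assumptions for illustration.

```python
from collections import defaultdict
from datetime import datetime

# Assumed workday boundaries (local time); adjust to taste.
WORK_START, WORK_END = 9, 18


def workday_share_by_month(timestamps):
    """Map (year, month) to the fraction of posts sent on weekdays in work hours."""
    totals = defaultdict(int)
    daytime = defaultdict(int)
    for ts in timestamps:
        key = (ts.year, ts.month)
        totals[key] += 1
        # weekday() < 5 means Monday through Friday.
        if ts.weekday() < 5 and WORK_START <= ts.hour < WORK_END:
            daytime[key] += 1
    return {key: daytime[key] / totals[key] for key in sorted(totals)}


# Made-up sample: mostly evening posts in March, mostly daytime posts in July.
sample = [
    datetime(2007, 3, 5, 21, 0),
    datetime(2007, 3, 6, 21, 30),
    datetime(2007, 3, 7, 22, 0),
    datetime(2007, 3, 8, 10, 0),
    datetime(2007, 7, 2, 10, 0),
    datetime(2007, 7, 3, 11, 0),
    datetime(2007, 7, 3, 21, 0),
    datetime(2007, 7, 5, 14, 0),
]

shares = workday_share_by_month(sample)
print(shares)
```

A share that climbs from well under half to well over half across a few months is the kind of shift described above.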
As if knowing the intimate details of Matt’s daily routine isn’t enough, we can learn something about his vacation history as well. Matt’s impressive in that he’s never gone a month without posting. If you go even more granular there are actually very few weeks that he missed (as his overall activity declined in 2009 this became more common). So if we look at Matt’s activity around the holidays, something interesting pops out (well, it’s only interesting if you’re a total stalker, but if you’ve read this far then you probably are). Here are a few timelines of different years, showing columns grouped by week. In 2004 we see Matt was posting pretty regularly through the holidays. There actually was a 5-day stretch with no posts, but that was it (due to the way the weeks are grouped, that gap doesn’t show in this chart).
2005 is similar:
But then 2006 has a big gap:
And being the stalker that I am, I noticed that and then went to investigate further. Turns out Matt wrote about taking a vacation that year.
I’ll be on vacation until mid-January so emails to me will go unanswered as will responses to various forums and blog comments Happy Holidays to all!
The data never lies.
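Because weekly columns can hide a gap that straddles two weeks, a more direct check is to look for the longest stretch between consecutive posts. Here is a small sketch of that idea, again assuming the send dates are already `datetime` objects; the function name and the sample dates are invented for the example.

```python
from datetime import datetime


def longest_gap(timestamps):
    """Return (gap, last_post_before, first_post_after) for the longest silence."""
    ordered = sorted(timestamps)
    if len(ordered) < 2:
        raise ValueError("need at least two posts to measure a gap")
    gaps = (
        (later - earlier, earlier, later)
        for earlier, later in zip(ordered, ordered[1:])
    )
    return max(gaps, key=lambda item: item[0])


# Made-up sample with a silence over the holidays.
sample = [
    datetime(2006, 12, 15),
    datetime(2006, 12, 20),
    datetime(2007, 1, 16),
    datetime(2007, 1, 17),
]

gap, before, after = longest_gap(sample)
print(gap.days, before.date(), after.date())
```

Running this per holiday season would surface a vacation-sized hole regardless of how the weeks happen to be grouped.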
Looks like a long vacation over the holidays didn’t turn into a regular thing though, since he was right back at it the following year:
I’m not a total nut job
I know it seems like I’m obsessed with Matt Chotin. And regardless of whether that’s true or not, I do want to assure people I’m not totally off my rocker. This little experiment in data mining and analysis isn’t really about Matt. It’s about the stories data tells about all of us. There are mountains of public information out there about us all, and the tiny little bits that we put out there, even if those are just little Facebook or Twitter status messages, can say a lot about us. Sure, a single Facebook status message doesn’t tell anyone much, but when you look at all of them over a multi-year period you can start learning a lot about a person. And often what the aggregate data reveals about us isn’t something we’re aware of. From this data experiment I know when Matt eats dinner (pretty typical range of 6-8pm), when he goes to bed (around midnight), and when he gets to work (again pretty normal, between 8 and 9am). And this is all from only about 4,000 data points. With social networking and microblogging sites we’re starting to create thousands of little data points like this all the time.
Thanks to Matt Chotin
I ran this post by Matt first, since I know it’s a bit creepy. He was cool with me posting it, so thanks Matt! And thanks for all the years of hard work answering questions on flexcoders, we’re a stronger community because of it.
Read more about the data here. This is five and a half years of mailing list activity, comprising 148,826 individual email messages. Matt himself posted about 4,000 messages. You can download the full CSV dataset here.