Last night we had a great meetup of the Montreal R User Group. I got things started with a short presentation asking the question “What does R do?” (slides). I made the presentation using Slidify, a package by Montreal R User Group member Ramnath Vaidyanathan. Slidify lets you generate rather handsome HTML5 slides directly from R Markdown.
We were then treated to a great workshop by Etienne Low-Decarie, who gave us a flyover of some of the most powerful R packages for wrangling data, namely plyr, reshape and ggplot.
Here are Etienne’s slides.
You can also follow along with the code posted here.
We met a lot of people who are doing very cool things using R. I’m looking forward to our next meetup!
I’m no shutterbug – drop me a note if you came and have any better pictures.
We had clementines!
Also, thanks to Notman House for hosting us. The haunted house feeling wasn’t enough to scare off this hardy group of data geeks.
At last weekend’s Hack Ta Ville event here in Montreal, I joined up with some talented urban planners and web devs to build Vélobstacles. The idea of the project is to crowdsource information on cycling conditions around the city. As with any crowdsourcing project, we had to seed the site with some data to attract the first users and get the ball rolling.
Fortunately, we had access to a data set of all reported cycling accidents from 2006 to 2010. Once we seeded Vélobstacles with this data, the web devs went to town adding features to the site, and I had outlived my usefulness as a data geek. So I decided to play with the accident data a little and produce some visualizations. I plotted all the accidents on a map and animated the map through time. I also calculated the monthly accident rate and plotted it using a moving average.
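For readers who have not met a moving average before, here is a minimal sketch of the idea in R. It is not the code behind the animation: the monthly counts are made up, and the helper name moving_average and the three-month window are assumptions chosen for illustration.

```r
# Hypothetical monthly accident counts for two years (made-up numbers,
# not the Montreal data), with a summer peak and a mid-summer dip.
counts <- c(1, 2, 5, 12, 25, 38, 30, 36, 24, 11, 4, 1,
            2, 2, 6, 14, 27, 40, 31, 39, 26, 12, 5, 2)

# Centred moving average over a window of k months (k must be odd).
# The first and last (k - 1) / 2 values are NA because the window
# runs off the ends of the series there.
moving_average <- function(x, k = 3) {
  stopifnot(k %% 2 == 1)
  as.numeric(stats::filter(x, rep(1 / k, k), sides = 2))
}

smoothed <- moving_average(counts, k = 3)

# Raw counts as vertical bars, smoothed series as a line on top
plot(counts, type = "h", xlab = "Month index", ylab = "Accidents")
lines(smoothed, lwd = 2)
```

A wider window gives a smoother line but would also flatten a short feature like a two-week dip, so the window size is a judgment call.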
Be sure to select HD quality:
Not surprisingly, the accident rate goes way up in the summer months, as Montreal winters are braved on two wheels by only a rarefied few. What is interesting is the mid-summer dip in the accident rate. This dip coincides with Montreal’s much beloved construction holiday, though the causal relationship is unclear. If you have any alternative explanations, or an idea about how to test the construction holiday hypothesis, drop a note in the comments.
As always, you can get the code on my GitHub page.
As September draws nearer, my mind inevitably turns away from my lofty (and largely unmet) summer research goals, and toward teaching. This semester I will be trying out a teaching technique that uses live data collection and analysis to encourage student engagement. The idea is based on the electronic polling technology known as ‘clickers’. The technology allows you to get instant feedback from students and check for understanding, and when used appropriately it can facilitate active engagement and peer learning.
Because I will be teaching in a computer lab, where every student will be sitting at a computer, I can bypass the little devices and instead gather student responses through a web-based interface. The advantages, as I see them, are:
- Students can enter more complex input than the 1-9 provided by clickers. Instead, students can enter any number or character vector response.
- Students can instantly download, plot, and analyze the class data. This step is facilitated by the read.csv("http://data_url.csv") function in R, which allows data import directly from the web.
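As a rough sketch of what the import step could look like for a student, the example below reads a small CSV and summarizes it. So that it runs without a network connection, it reads from a temporary local file standing in for the class data URL; the column name height_cm and the values are made up.

```r
# Stand-in for the class data: a small CSV written to a temporary file.
# The column name and values are hypothetical, for illustration only.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("height_cm", "162", "175", "181", "158", "170"), csv_path)

# read.csv() accepts either a local path or a URL, so in class
# data_source would be the address of the shared data file instead.
data_source <- csv_path
class_data <- read.csv(data_source)

# Quick look at what was imported
str(class_data)
summary(class_data$height_cm)
```

Because read.csv() returns an ordinary data frame, everything students already know about plotting and summarizing data applies to the class responses right away.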
The first exercise I have planned using this technology is to have students enter their height, then plot a histogram of the class data as an introduction to the normal distribution. With the simple online interface I have created, this exercise can be done very quickly. I am calling the tool “I am one of n”.
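Here is one way the height exercise might look in R, as a sketch only. The heights are simulated with rnorm() rather than collected from students, and the class size of 40, the mean of 170 cm and the standard deviation of 9 cm are arbitrary assumptions.

```r
# Simulated heights in cm for a hypothetical class of 40 students
set.seed(1)
heights <- rnorm(40, mean = 170, sd = 9)

# Histogram on the density scale so a density curve can be drawn over it
h <- hist(heights, freq = FALSE,
          main = "Class heights (simulated)", xlab = "Height (cm)")

# Normal curve using the sample mean and standard deviation
curve(dnorm(x, mean = mean(heights), sd = sd(heights)),
      add = TRUE, lwd = 2)
```

With only a few dozen responses the histogram will look lumpy next to the curve, which is itself a useful talking point about sample size.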
If you have any suggestions for learning activities that could make effective use of this technology in an undergraduate Biostatistics (or other) course, drop me a note!