Back in 2013, David had done analysis of bicycle trips across Seattle’s Fremont bridge. More recently, Jake Vanderplas (creator of Python’s
very popular Scikit-learn package) wrote a nice blog post on
“Learning Seattle Work habits from bicycle counts” at Fremont bridge.
I wanted to work through Jake’s analysis using R since I am learning R. Please read the original article by Jake to get the full context and thinking behind the analysis. For folks interested in Python, Jake has provided link in the blog post to iPython notebook where you can work through the analysis in Python (and learn some key Python modules: pandas, matplotlib, sklearn along the way)
The R code that I used to work through the analysis is in the following link.
Below are some key results/graphs.
1. Doing a PCA analysis on bicycle count data (where each row is a day and columns are 24 hr bicycle counts from East and West side) shows that 2 components can explain 90% of variance. The scores plot of first 2 principal components indicates 2 clusters. Coloring the scores by day of week suggest a weekday cluster and a weekend cluster
- The average bike counts for each cluster and side (East/West) better shows the patterns for weekday and weekend commute (weekday commute peaks in the morning and evening)
Thanks again to Jake Vanderplas for the analysis and illustrating how lot of insights could be gathered from data.