Last updated: 2017-07-05
Code version: 5e53297
I begin by loading a few packages, as well as some additional functions I wrote for this project, into the R environment.
library(data.table)
source("../code/functions.R")
I wrote a function, read.divvy.data, that reads in the trip and station data from the CSV files downloaded from the Divvy website. This function uses fread from the data.table package to quickly read in the data (it is much faster than read.table). This function also prepares the data, notably the dates and times, so that they are easier to work with.
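To give a rough idea of what this kind of preparation can look like, here is a minimal sketch. It is not the code in functions.R: the helper name read.trips.sketch, the assumed starttime format (month/day/year hour:minute), the time zone, and the use of "%U" for the week number are all assumptions made for illustration, and the toy CSV stands in for a real Divvy file.

```r
library(data.table)

# Hypothetical helper (not the author's read.divvy.data): read one
# trips CSV with fread, then add date and time columns.
read.trips.sketch <- function (file, tz = "America/Chicago") {
  trips <- as.data.frame(fread(file, stringsAsFactors = FALSE))

  # Assumption: the raw start times look like "3/31/2016 23:53".
  trips$starttime <- as.POSIXct(trips$starttime,
                                format = "%m/%d/%Y %H:%M", tz = tz)

  # Derived columns that make it easy to group trips by time.
  # "%U" (week of the year, Sunday first) is an assumed convention.
  trips$start.week <- as.numeric(format(trips$starttime, "%U"))
  trips$start.day  <- weekdays(trips$starttime)
  trips$start.hour <- as.numeric(format(trips$starttime, "%H"))
  trips
}

# Toy file standing in for a Divvy trips CSV.
tmp <- tempfile(fileext = ".csv")
writeLines(c("trip_id,starttime,tripduration",
             "9080551,3/31/2016 23:53,841",
             "9080550,3/31/2016 23:46,649"), tmp)
toy <- read.trips.sketch(tmp)
print(toy)
```

Converting the start times once, when the data are read, means that later steps can work with start.week, start.day and start.hour directly instead of re-parsing strings.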
divvy <- read.divvy.data()
# Reading station data from ../data/Divvy_Stations_2016_Q4.csv.
# Reading trip data from ../data/Divvy_Trips_2016_Q1.csv.
# Reading trip data from ../data/Divvy_Trips_2016_04.csv.
# Reading trip data from ../data/Divvy_Trips_2016_05.csv.
# Reading trip data from ../data/Divvy_Trips_2016_06.csv.
# Reading trip data from ../data/Divvy_Trips_2016_Q3.csv.
# Reading trip data from ../data/Divvy_Trips_2016_Q4.csv.
# Preparing Divvy data for analysis in R.
# Converting dates and times.
We have information on 581 Divvy stations across the city of Chicago.
nrow(divvy$stations)
# [1] 581
head(divvy$stations)
#                           name latitude longitude dpcapacity online_date
# 456        2112 W Peterson Ave 41.99118 -87.68359         15   5/12/2015
# 101              63rd St Beach 41.78102 -87.57612         23   4/20/2015
# 109          900 W Harrison St 41.87468 -87.65002         19    8/6/2013
# 21  Aberdeen St & Jackson Blvd 41.87773 -87.65479         15   6/21/2013
# 80     Aberdeen St & Monroe St 41.88042 -87.65560         19   6/26/2013
# 346   Ada St & Washington Blvd 41.88283 -87.66121         15  10/10/2013
In 2016, people took nearly 3.6 million trips on Divvy bikes.
nrow(divvy$trips)
# [1] 3595383
head(divvy$trips)
#   trip_id           starttime bikeid tripduration from_station_id
# 1 9080551 2016-03-31 23:53:00    155          841             344
# 2 9080550 2016-03-31 23:46:00   4831          649             128
# 3 9080549 2016-03-31 23:42:00   4232          210             350
# 4 9080548 2016-03-31 23:37:00   3464         1045             303
# 5 9080547 2016-03-31 23:33:00   1750          202             334
# 6 9080546 2016-03-31 23:31:00   4302          638              67
#               from_station_name to_station_id
# 1 Ravenswood Ave & Lawrence Ave           458
# 2       Damen Ave & Chicago Ave           213
# 3     Ashland Ave & Chicago Ave           210
# 4       Broadway & Cornelia Ave           458
# 5   Lake Shore Dr & Belmont Ave           329
# 6 Sheffield Ave & Fullerton Ave           304
#                 to_station_name   usertype gender birthyear start.week
# 1      Broadway & Thorndale Ave Subscriber   Male      1986         13
# 2        Leavitt St & North Ave Subscriber   Male      1980         13
# 3     Ashland Ave & Division St Subscriber   Male      1979         13
# 4      Broadway & Thorndale Ave Subscriber   Male      1980         13
# 5 Lake Shore Dr & Diversey Pkwy Subscriber   Male      1969         13
# 6       Broadway & Waveland Ave Subscriber   Male      1991         13
#   start.day start.hour
# 1  Thursday         23
# 2  Thursday         23
# 3  Thursday         23
# 4  Thursday         23
# 5  Thursday         23
# 6  Thursday         23
Out of all the Divvy stations in Chicago, the one on Navy Pier (at Streeter and Grand) had the most activity, measured by the number of trips that started there.
counts <- table(divvy$trips$from_station_name)
as.matrix(head(sort(counts,decreasing=TRUE)))
#                               [,1]
# Streeter Dr & Grand Ave      90042
# Lake Shore Dr & Monroe St    51090
# Theater on the Lake          47927
# Clinton St & Washington Blvd 47125
# Lake Shore Dr & North Blvd   45754
# Clinton St & Madison St      41744
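The ranking above counts departures only. If "activity" should also include arrivals, one possible approach is to tabulate both the origin and destination columns over a common set of stations and add the counts. The sketch below uses a made-up toy data frame in place of divvy$trips; the station names "A", "B" and "C" are placeholders, not real stations.

```r
# Toy trips standing in for divvy$trips (made-up station names).
trips <- data.frame(
  from_station_name = c("A", "A", "B", "C", "A"),
  to_station_name   = c("B", "C", "A", "A", "B"),
  stringsAsFactors  = FALSE)

# Use the same factor levels for both columns so that the two
# tables line up station by station.
stations   <- sort(unique(c(trips$from_station_name,
                            trips$to_station_name)))
departures <- table(factor(trips$from_station_name, levels = stations))
arrivals   <- table(factor(trips$to_station_name, levels = stations))

# Total activity = departures + arrivals, as a named numeric vector.
activity <- as.vector(departures) + as.vector(arrivals)
names(activity) <- stations

# Rank the stations from most to least active.
ranked <- sort(activity, decreasing = TRUE)
print(as.matrix(ranked))
```

Fixing the factor levels up front avoids misaligned counts when a station appears only as an origin or only as a destination.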
I will also take a close look at trip data for the main Divvy station on the University of Chicago campus, since that is where I work.
sum(divvy$trips$from_station_name == "University Ave & 57th St")
# [1] NA
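An NA here is what sum returns whenever at least one of the comparisons is itself NA, which would happen if some trips have a missing station name. Whether that is the cause in the Divvy data is an assumption; the toy vector below only illustrates the behaviour and one way to get a count anyway.

```r
# Toy station names with one missing value (not the real Divvy data).
station <- c("University Ave & 57th St", NA,
             "Ellis Ave & 60th St", "University Ave & 57th St")
target  <- "University Ave & 57th St"

# NA == target is NA, and a single NA makes the whole sum NA.
n.raw <- sum(station == target)

# Dropping the NA comparisons gives the number of known matches.
n.clean <- sum(station == target, na.rm = TRUE)

# It is worth checking how many names are actually missing.
n.missing <- sum(is.na(station))

print(c(raw = n.raw, clean = n.clean, missing = n.missing))
```

Note that table drops NA values by default, so missing names would not show up in a table of station counts either.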
These are the versions of R and the packages that were used to generate these results.
sessionInfo()
# R version 3.3.2 (2016-10-31)
# Platform: x86_64-apple-darwin13.4.0 (64-bit)
# Running under: macOS Sierra 10.12.5
#
# locale:
# [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#
# attached base packages:
# [1] stats graphics grDevices utils datasets methods base
#
# other attached packages:
# [1] data.table_1.10.4
#
# loaded via a namespace (and not attached):
# [1] backports_1.0.5 magrittr_1.5 rprojroot_1.2 tools_3.3.2
# [5] htmltools_0.3.6 yaml_2.1.14 Rcpp_0.12.11 stringi_1.1.2
# [9] rmarkdown_1.6 knitr_1.16 git2r_0.18.0 stringr_1.2.0
# [13] digest_0.6.12 evaluate_0.10.1
This R Markdown site was created with workflowr