R is an open-source language and environment for data analysis, statistics, and visualization.
Use data set “mpg” (provided with the ggplot2 package): fuel economy data from 1999 and 2008 for 38 popular models of car
# list the structure of mpg
str(mpg)
Classes 'tbl_df', 'tbl' and 'data.frame': 234 obs. of 11 variables:
$ manufacturer: chr "audi" "audi" "audi" "audi" ...
$ model : chr "a4" "a4" "a4" "a4" ...
$ displ : num 1.8 1.8 2 2 2.8 2.8 3.1 1.8 1.8 2 ...
$ year : int 1999 1999 2008 2008 1999 1999 2008 1999 1999 2008 ...
$ cyl : int 4 4 4 4 6 6 6 4 4 4 ...
$ trans : chr "auto(l5)" "manual(m5)" "manual(m6)" "auto(av)" ...
$ drv : chr "f" "f" "f" "f" ...
$ cty : int 18 21 20 21 16 18 18 18 16 20 ...
$ hwy : int 29 29 31 30 26 26 27 26 25 28 ...
$ fl : chr "p" "p" "p" "p" ...
$ class : chr "compact" "compact" "compact" "compact" ...
# print mpg
mpg
# A tibble: 234 x 11
manufacturer model displ year cyl trans drv cty hwy
<chr> <chr> <dbl> <int> <int> <chr> <chr> <int> <int>
1 audi a4 1.8 1999 4 auto(l5) f 18 29
2 audi a4 1.8 1999 4 manual(m5) f 21 29
3 audi a4 2.0 2008 4 manual(m6) f 20 31
4 audi a4 2.0 2008 4 auto(av) f 21 30
5 audi a4 2.8 1999 6 auto(l5) f 16 26
6 audi a4 2.8 1999 6 manual(m5) f 18 26
7 audi a4 3.1 2008 6 auto(av) f 18 27
8 audi a4 quattro 1.8 1999 4 manual(m5) 4 18 26
9 audi a4 quattro 1.8 1999 4 auto(l5) 4 16 25
10 audi a4 quattro 2.0 2008 4 manual(m6) 4 20 28
# ... with 224 more rows, and 2 more variables: fl <chr>, class <chr>
# help on mpg
?mpg
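Before plotting, it can help to look at a few quick summaries of the data. The following is a minimal sketch, assuming the ggplot2 package (which ships the mpg data set) is installed; the columns summarised here are chosen only for illustration.

```r
# mpg is shipped with ggplot2, so the package must be loaded first
library(ggplot2)

# first six rows of the data set
head(mpg)

# minimum, quartiles, mean and maximum of highway fuel economy
summary(mpg$hwy)

# number of cars in each vehicle class
table(mpg$class)

# dimensions: number of rows (observations) and columns (variables)
dim(mpg)
```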
# scatter plot of highway fuel economy against engine displacement
ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy), size = 8) +
  theme_bw(base_size = 40)
# color the points by vehicle class
ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy, color = class), size = 8) +
  theme_bw(base_size = 40)
# one panel per vehicle class, arranged in two rows
ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy), size = 8) +
  facet_wrap(~class, nrow = 2) +
  theme_bw(base_size = 40)
# highlight Subaru models and modify labels, title and legend
ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy, color = manufacturer == "subaru"), size = 8) +
  theme_bw(base_size = 40) +
  facet_wrap(~class, nrow = 2) +
  scale_colour_manual(values = c("#000000", "#FF0000"), name = "Subaru") +
  labs(title = "Modify plot labels, title, legend", x = "engine displacement [l]", y = "highway [mpg]") +
  theme(legend.justification = c(1, 0), legend.position = c(1, 0))
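The color aesthetic in the last plot is mapped to a logical expression rather than to a column of the data. A short sketch of what that expression evaluates to, assuming ggplot2 is installed; `is_subaru` is a name introduced here only for illustration.

```r
library(ggplot2)

# comparing a column with a value yields one TRUE/FALSE per row
is_subaru <- mpg$manufacturer == "subaru"

# count how many rows fall into each of the two groups
table(is_subaru)
```

ggplot2 treats FALSE and TRUE as two discrete groups, which is why scale_colour_manual() is given exactly two colors, in the order FALSE, TRUE.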
Grossman-Clarke S. et al. 2017. International Journal of Climatology 37(2): 905–917, doi: 10.1002/joc.4748.
RStudio is an integrated development environment (IDE) for R.
Three steps for installing RStudio
1. Download the binary setup file for R for your operating system from CRAN: http://cran.r-project.org/
2. Open the downloaded file (.exe on Windows, .pkg on macOS) and follow the installation instructions.
3. Download and install the free RStudio Desktop version from www.rstudio.com:
http://www.rstudio.com/products/rstudio/download/
READY!
Open RStudio by clicking on its icon.
What is an R-package?
Packages are collections of R functions, example data, and compiled code.
Choose CRAN mirror from which to download packages:
Main menu -> RStudio -> Preferences -> Packages
Main menu -> Tools -> Install packages
Main menu -> Tools -> Check for package updates
IMPORTANT!
A package that is not part of the standard R installation must be loaded in every new R session before it can be used, either from the console or from within a script:
# load package (use pound key for comments)
library(ggplot2)
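The menu entries for installing packages have console equivalents. A sketch of installing a package once and then loading it for the current session; the mirror URL is an assumption, and any CRAN mirror should work.

```r
# install ggplot2 only if it is not yet available on this machine
if (!requireNamespace("ggplot2", quietly = TRUE)) {
  install.packages("ggplot2", repos = "https://cloud.r-project.org")
}

# load the package; this line is needed in every new R session
library(ggplot2)

# list the packages currently attached to the session
search()
```

Installation is needed only once per machine (or after an R upgrade), whereas library() has to be called in each session.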
# read the data from a CSV file located in the working directory
tutorial_dat <- read.csv('data1_r_intro.csv')
# print structure of data set "tutorial_dat"
str(tutorial_dat)
'data.frame': 3290 obs. of 41 variables:
$ TIMESTAMP : Factor w/ 3290 levels "2015-07-09 13:00:00",..: 1 2 3 4 5 6 7 8 9 10 ...
$ RECORD : int 0 1 2 3 4 5 6 7 8 9 ...
$ VW_Avg : num 0.412 0.411 0.411 0.413 0.414 0.415 0.415 0.415 0.414 0.413 ...
$ VW_2_Avg : num 0.42 0.421 0.421 0.422 0.422 0.422 0.423 0.423 0.424 0.424 ...
$ VW_3_Avg : num 0.298 0.298 0.298 0.298 0.298 0.298 0.298 0.298 0.298 0.298 ...
$ AirTC_Avg : num -28 35.9 37.2 37.8 38 ...
$ RH_Avg : num 13 30.8 27.7 26 25 ...
$ AirTC_2_Avg : num -11.1 35.6 36.8 37.4 37.7 ...
$ RH_2_Avg : num 16.2 29.3 26.5 24.9 24.2 ...
$ AirTC_3_Avg : num -21.9 35.5 36.3 36.9 37.1 ...
$ RH_3_Avg : num 13.7 27.8 25.6 24.2 23.4 ...
$ PPFin_Avg : num 2218 1611 1969 1601 1773 ...
$ ndvi_Jenkins_Avg : num 0.516 0.501 0.505 0.515 0.525 0.534 0.541 0.551 0.553 0.551 ...
$ ndvi_Huemmrich_Avg : num 0.539 0.531 0.533 0.547 0.551 0.559 0.564 0.575 0.579 0.579 ...
$ ndvi_Wilson_Avg : num 0.492 0.477 0.481 0.492 0.501 0.51 0.518 0.527 0.529 0.527 ...
$ evi2_Avg : num 0.325 0.318 0.333 0.335 0.343 0.352 0.36 0.373 0.378 0.373 ...
$ Rain_mm_Tot : num 0 0 0 0 0 0 0 0 0 0 ...
$ Temp_C_Avg : num 30.4 30.7 30.5 30.9 31.3 ...
$ PTemp_C_Avg : num 38.8 38.9 39 40.7 41.8 ...
$ shf_Avg : num 4.42 7.79 8.95 10.89 11.63 ...
$ BP_mbar_Avg : int 951 962 962 962 961 961 960 960 960 960 ...
$ Batt_Volt_Min : num 12.8 12.8 12.8 12.9 12.9 ...
$ short_up_Avg : num 985 725 893 732 826 ...
$ short_dn_Avg : num 187 136 174 134 151 ...
$ long_up_Avg : num -96.8 -87.8 -106.8 -99.1 -100.4 ...
$ long_dn_Avg : num 27.86 4.62 22.46 21.89 25.06 ...
$ cnr4_T_C_Avg : num 38.2 36.9 38.1 38.6 38.5 ...
$ long_up_corr_Avg : num 436 436 426 436 435 ...
$ long_dn_corr_Avg : num 561 528 555 557 560 ...
$ Rs_net_Avg : num 798 589 719 598 675 ...
$ Rl_net_Avg : num -124.7 -92.5 -129.3 -121 -125.4 ...
$ albedo_Avg : num 0.187 0.175 0.192 0.177 0.18 ...
$ Rn_Avg : num 673 496 590 477 550 ...
$ TT_C : num 36 37.3 38.2 37.4 37.2 ...
$ SBT_C : num 37.2 35.8 37.3 37.5 38.3 ...
$ wnd_dir_compass_Avg: num 0 1.32 237.32 237.27 228.37 ...
$ Rainfall : num 0 0 0 0 0 0 0 0 0 0 ...
$ H : num NA NA 186 249 203 ...
$ LE : num NA NA 171 137 154 ...
$ C : num NA NA 0.0571 0.0822 0.0252 ...
$ G_calc : num 4.42 7.7 8.95 11.05 11.72 ...
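In the structure above, TIMESTAMP was read as a factor (in R 4.0 and later, read.csv() returns character by default), so R does not yet treat it as a date-time. A minimal sketch of the conversion on a small stand-in data frame; the name `demo`, its values, the 30-minute spacing and the time zone "UTC" are assumptions made for illustration, not taken from the logger file.

```r
# small stand-in for tutorial_dat (values are made up for illustration)
demo <- data.frame(
  TIMESTAMP = c("2015-07-09 13:00:00", "2015-07-09 13:30:00", "2015-07-09 14:00:00"),
  AirTC_Avg = c(35.9, 37.2, 37.8),
  stringsAsFactors = TRUE   # mimic the factor column shown by str()
)

# convert factor -> character -> date-time (time zone is an assumption)
demo$TIMESTAMP <- as.POSIXct(as.character(demo$TIMESTAMP),
                             format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# TIMESTAMP is now of class POSIXct
str(demo)

# date-time arithmetic now works, e.g. the spacing between records
diff(demo$TIMESTAMP)
```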
Access a single variable of a data set with the syntax: data_set_name$variable_name
# print the first 100 values of variable AirTC_Avg of data set "tutorial_dat"
tutorial_dat$AirTC_Avg[1:100]
[1] -28.01 35.89 37.16 37.77 38.04 38.50 38.29 38.06 37.70 37.21
[11] 36.64 35.91 35.38 34.60 33.75 33.03 32.25 31.53 29.91 28.28
[21] 27.40 28.57 27.97 24.90 21.75 23.12 23.15 23.04 22.93 23.06
[31] 22.09 20.05 21.37 20.72 21.69 23.42 25.45 25.39 26.60 29.92
[41] 32.55 33.85 34.62 35.39 36.20 36.40 36.60 37.13 37.94 37.10
[51] 38.48 38.94 38.18 38.42 38.93 38.92 38.18 38.20 37.20 36.50
[61] 35.72 35.19 34.56 33.89 33.19 32.60 31.91 31.34 30.59 30.38
[71] 29.59 28.02 26.93 26.83 26.49 26.75 26.93 26.97 26.82 25.78
[81] 25.22 25.07 25.96 27.61 28.96 30.13 31.67 32.58 34.11 35.30
[91] 35.83 36.51 37.14 37.46 38.30 38.76 38.94 38.96 39.77 39.50
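The `$` and `[ ]` operators combine freely with logical conditions and summary functions. A small sketch on a stand-in data frame (`demo` and its values are made up for illustration); it also shows how missing values (NA), such as those at the start of H and LE, affect a mean.

```r
# small stand-in data frame (values are made up for illustration)
demo <- data.frame(
  AirTC_Avg = c(35.89, 37.16, 37.77, 38.04),
  H         = c(NA, NA, 186, 249)
)

# a whole column, and the first two elements of that column
demo$AirTC_Avg
demo$AirTC_Avg[1:2]

# all rows in which the air temperature exceeds 37
demo[demo$AirTC_Avg > 37, ]

# mean() returns NA if any value is missing ...
mean(demo$H)
# ... unless missing values are removed first
mean(demo$H, na.rm = TRUE)
```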
# histogram of air temperature, filled by whether rain was recorded
ggplot(data = tutorial_dat) +
  geom_histogram(mapping = aes(x = AirTC_Avg, fill = Rainfall > 0)) +
  theme_bw(base_size = 40)
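When no bin settings are given, geom_histogram() falls back to 30 bins and prints a message suggesting that a better value be chosen. A sketch with an explicit bin width, run on made-up stand-in data; `demo`, its values, the bin width of 2 and the axis labels are assumptions for illustration.

```r
library(ggplot2)

# stand-in data: made-up temperatures and rainfall, for illustration only
set.seed(1)
demo <- data.frame(
  AirTC_Avg = rnorm(200, mean = 30, sd = 5),
  Rainfall  = rbinom(200, size = 1, prob = 0.1) * 2.5
)

# binwidth = 2 makes every bar 2 temperature units wide
p <- ggplot(data = demo) +
  geom_histogram(mapping = aes(x = AirTC_Avg, fill = Rainfall > 0), binwidth = 2) +
  labs(x = "air temperature", y = "count", fill = "rain")

# draw the plot
p
```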
# read the second data set from a CSV file located in the working directory
field <- read.csv('data2_r_intro.csv')
# print structure of data set "field"
str(field)
'data.frame': 807 obs. of 10 variables:
$ Longitude : num -112 -112 -112 -112 -112 ...
$ Latitude : num 33.1 33.1 33.1 33.1 33.1 ...
$ Fix.Type : int 2 2 2 2 2 2 2 2 2 2 ...
$ UTC.Time : int 182427 182427 182427 182427 182428 182428 182428 182428 182428 182429 ...
$ Logger.Time: int 136200 136400 136600 136800 137000 137200 137400 137600 137800 138000 ...
$ Config. : int 49 49 49 49 49 49 49 49 49 49 ...
$ Count : int 681 682 683 684 685 686 687 688 689 690 ...
$ NDVI : num 0.652 0.782 0.758 0.579 0.996 0.669 0.669 0.769 0.717 0.723 ...
$ NIR : num -0.0026 -0.0022 -0.0025 -0.002 -0.0025 -0.0024 -0.0014 -0.003 -0.0026 -0.0032 ...
$ Red : num 0 0 0 0 0 0 0 0 0 0 ...
# plot NDVI by measurement position, colored from blue (low) to yellow (high)
ggplot(data = field) +
  geom_point(mapping = aes(x = Latitude, y = Longitude, color = NDVI), size = 6) +
  scale_color_gradient(low = "blue", high = "yellow") +
  theme_bw(base_size = 40)
Documentation
Cheat sheets and reference sheets
Mailing lists and help pages
Free online tutorials
Books
A nice overview is given on the webpage of the National Center for Ecological Analysis and Synthesis: https://www.nceas.ucsb.edu/scicomp/software/r