Last updated: 2018-09-05
workflowr checks:
✔ R Markdown file: up-to-date
Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.
✔ Environment: empty
Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproducibility it’s best to always run the code in an empty environment.
✔ Seed: set.seed(20180820)
The command set.seed(20180820) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.
✔ Session information: recorded
Great job! Recording the operating system, R version, and package versions is critical for reproducibility.
✔ Repository version: da82ab8
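Make sure that all relevant files for the analysis have been committed to Git before generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated: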
Ignored files:
Ignored: .DS_Store
Ignored: .Rhistory
Ignored: .Rproj.user/
Ignored: analysis/.DS_Store
Ignored: data/.DS_Store
Ignored: docs/.DS_Store
Untracked files:
Untracked: .Rbuildignore
Untracked: analysis/assets/
Untracked: analysis/mapping.Rmd
Untracked: data-raw/
Untracked: data/csv/
Untracked: data/raster/
Untracked: data/sf/
Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.
File | Version | Author | Date | Message |
---|---|---|---|---|
Rmd | da82ab8 | annakrystalli | 2018-09-05 | workflowr::wflow_publish(“analysis/gis.Rmd”) |
html | c8c9096 | annakrystalli | 2018-09-05 | Build site. |
Rmd | eb17960 | annakrystalli | 2018-09-05 | workflowr::wflow_publish(“analysis/gis.Rmd”) |
html | 4c6e18d | annakrystalli | 2018-09-05 | Build site. |
Rmd | 496feae | annakrystalli | 2018-09-05 | workflowr::wflow_publish(“analysis/gis.Rmd”) |
html | d6d0523 | annakrystalli | 2018-09-05 | Build site. |
Rmd | 06f1792 | annakrystalli | 2018-09-05 | workflowr::wflow_publish(“analysis/gis.Rmd”) |
html | f22d555 | annakrystalli | 2018-09-05 | Build site. |
Rmd | 7403f6b | annakrystalli | 2018-09-05 | workflowr::wflow_publish(“analysis/gis.Rmd”) |
html | 80b8fec | annakrystalli | 2018-09-04 | Build site. |
Rmd | 1a58bc5 | annakrystalli | 2018-09-04 | workflowr::wflow_publish(“analysis/gis.Rmd”) |
Geographic Information Systems (GIS) are computer-based systems used to do the following with spatial or geographic data:
- collect
- store
- manage
- analyse
- display
In science, GIS allow us to examine spatial relationships, patterns and processes.
They are applied to a broad range of phenomena, from physical and biological to cultural and demographic.
The most common spatial data types are vector and raster data.
Vector data tends to dominate the social sciences because human settlements tend to have discrete borders.
Raster often dominates in environmental sciences because of the reliance on remote sensing data.
There is much overlap in some fields and raster and vector datasets can be used side-by-side: ecologists and demographers, for example, commonly use both vector and raster data.
Ultimately, both are just data with geographic metadata attached in a structured way that both locates the data and makes it interoperable. One of the most important aspects of this geographic metadata is the Coordinate Reference System being used.
Coordinate Reference Systems (CRS) provide a framework for defining real-world locations.
The basis for any Coordinate Reference System is a model of the earth (Globe Model) on which locations can be defined.
We can think of the differences between Globe Models as being analogous to differences in globes you’ve actually seen in your life – some are perfect balls, while others are spheres with elevations, while others capture the full weirdness of the earth’s shape.
Image source: http://www.nickeubank.com/wp-content/uploads/2018/02/CoordinateSystems_And_Projections.pdf
There are two types of Coordinate Reference Systems: geographic and projected.
Geographic Coordinate Systems are defined by a Datum.
A Datum essentially defines the Globe Model of the coordinate system.
It has:
A point is referenced by its longitude and latitude values. Longitude and latitude are angles measured from the earth’s center to a point on the earth’s surface.
Image source: https://datacarpentry.org/r-raster-vector-geospatial/images/map_usa_different_projections.jpg
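To make the idea of longitude and latitude as angles concrete, here is a minimal sketch in base R that converts an angular position into 3D Cartesian coordinates measured from the earth's centre. It assumes a perfectly spherical Globe Model with a mean radius of 6371 km, which is a simplification of what a real datum uses. The function name lonlat_to_xyz and the example point are illustrative only.

```r
# Convert longitude/latitude (in degrees) to 3D Cartesian coordinates.
# Assumption: a perfect sphere of radius 6371 km, not a real ellipsoid.
lonlat_to_xyz <- function(lon, lat, radius = 6371) {
  lon_rad <- lon * pi / 180
  lat_rad <- lat * pi / 180
  c(
    x = radius * cos(lat_rad) * cos(lon_rad),
    y = radius * cos(lat_rad) * sin(lon_rad),
    z = radius * sin(lat_rad)
  )
}

# An illustrative point at 0.1 degrees west, 51.5 degrees north
point_xyz <- lonlat_to_xyz(lon = -0.1, lat = 51.5)
point_xyz

# Every point on the surface sits exactly one radius from the centre
sqrt(sum(point_xyz^2))
```

Swapping the sphere for an ellipsoid changes the formulas, which is one reason the same longitude and latitude can refer to slightly different places under different datums.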
A particular datum can be selected for use in a specific geographic area, because it does a good job of modelling the surface of the earth in that part of the world.
Image source: http://desktop.arcgis.com/en/arcmap/10.3/guide-books/map-projections/GUID-E94B3A5F-D997-4E6F-B400-BC15C4B0DCDC-web.gif
Orange Peel Analogy: a datum is your choice of fruit to use in the orange peel analogy. Is the earth an orange, a lemon, a lime, a grapefruit?
Image source: https://farm3.staticflickr.com/2260/2508805118_500f5bba28_n.jpg
To change between geographic coordinate systems, you may need to perform a geographic (datum) transformation. This is a well-defined mathematical method to convert coordinates between two geographic coordinate systems.
Projected coordinate systems include a projection.
A Projection is a mathematical transformation of the angular measurements on a round earth to a flat surface (e.g. paper or a computer screen).
Units of measurement are linear, usually meters.
Orange Peel Analogy: a projection is how you peel your orange and then flatten the peel.
Image source: http://blogs.lincoln.ac.nz/gis/2017/03/29/where-on-earth-are-we/
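As a rough illustration of a projection as a mathematical transformation from angles to linear units, the sketch below implements the spherical form of the Mercator projection in base R. It assumes a spherical earth with a radius of 6378137 m (the WGS84 semi-major axis), so the results are approximate and are not a substitute for a proper projection library. The function name mercator is made up for this example.

```r
# Spherical Mercator: angles in degrees in, metres out.
# Assumption: spherical earth of radius 6378137 m.
mercator <- function(lon, lat, radius = 6378137) {
  lon_rad <- lon * pi / 180
  lat_rad <- lat * pi / 180
  data.frame(
    x = radius * lon_rad,
    y = radius * log(tan(pi / 4 + lat_rad / 2))
  )
}

# Three illustrative points: the origin, a mid-latitude point, a high-latitude point
mercator(lon = c(0, -0.1, 10), lat = c(0, 51.5, 60))
```

Note that x grows linearly with longitude while y grows faster and faster with latitude, which is why areas far from the Equator look enlarged on Mercator maps.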
- Mercator projection: a cylindrical map projection presented by the Flemish geographer and cartographer Gerardus Mercator in 1569. It became the standard map projection for nautical purposes because of its ability to represent lines of constant course as straight segments that conserve the angles with the meridians.
- Universal Transverse Mercator (UTM) system: The UTM system is not a single map projection. The system instead divides the Earth into sixty zones, each being a six-degree band of longitude, and uses a secant transverse Mercator projection in each zone.
- Albers Equal Area: a conic, equal area map projection that uses two standard parallels. Although scale and shape are not preserved, distortion is minimal between the standard parallels.
Image source: https://datacarpentry.org/r-raster-vector-geospatial/images/map_usa_different_projections.jpg
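Because UTM splits the Earth into sixty six-degree bands of longitude, the zone for a point can be worked out with simple arithmetic. The sketch below does this in base R. The function names utm_zone and utm_central_meridian are hypothetical, and the sketch ignores the special-case zones around Norway and Svalbard as well as the polar regions that UTM does not cover.

```r
# UTM zone number from longitude in degrees (zone 1 starts at 180 degrees west).
# Assumption: standard six-degree zones only, no Norway/Svalbard exceptions.
utm_zone <- function(lon) {
  (floor((lon + 180) / 6) %% 60) + 1
}

# Longitude of the central meridian of a zone, in degrees
utm_central_meridian <- function(zone) {
  zone * 6 - 183
}

lons <- c(-180, -0.1, 0, 179.9)
data.frame(
  lon = lons,
  zone = utm_zone(lons),
  central_meridian = utm_central_meridian(utm_zone(lons))
)
```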
Lines of latitude are parallel circles that become smaller towards the poles, shrinking to a single point at each pole.
When a lat/lon CRS is flattened, the implicit flattening function used is the Plate Carrée projection, one of the simplest projections: it turns the globe into a cylinder and then unwraps it, which increasingly distorts features as one approaches the poles.
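The distortion can be quantified with a small base R sketch. Plotting longitude and latitude directly as x and y draws a degree of longitude at the same width everywhere, while on the ground it shrinks with the cosine of the latitude. The sketch assumes a spherical earth of radius 6371000 m, and the function names plate_carree and ew_stretch are illustrative.

```r
# Plate Carree on a sphere: x and y are simply the angles scaled by the radius.
# Assumption: spherical earth of radius 6371000 m.
plate_carree <- function(lon, lat, radius = 6371000) {
  data.frame(
    x = radius * lon * pi / 180,
    y = radius * lat * pi / 180
  )
}

# East-west stretching of the map relative to true ground distance
ew_stretch <- function(lat) {
  1 / cos(lat * pi / 180)
}

lats <- c(0, 30, 60, 80)
data.frame(lat = lats, ew_stretch = round(ew_stretch(lats), 2))
```

The stretch factor is 1 at the Equator and grows without bound towards the poles, where a single point is drawn as a line as wide as the whole map.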
Going back to the orange peel analogy, different projections use a different starting point and method to peel the earth’s surface into a flat surface.
To change between projected CRSs you need to reproject your data. This often involves unprojecting back to lat/lon, transforming the unprojected coordinates and then reprojecting.
You have two options for identifying a CRS in most R commands: an EPSG code or a PROJ.4 string. The documentation for a command that requires projection information will tell you which is required. Often you can choose between the two options.
An EPSG (European Petroleum Survey Group) Code is an ID that has been assigned to most common projections to make reference to a particular projection easy. An EPSG Code is also called an SRID (Spatial Reference Identifier). Technically, EPSG is the authority that assigns SRIDs, but you will hear these terms used interchangeably.
The main advantages to using this method of specifying a projection are that it is standardized and ensures you have the same parameters every time.
The disadvantage is that if you need to know the parameters used by the projection or its name, you have to look them up, but that’s fairly easy to do at spatialreference.org. Also, you can’t customize the parameters if you use an EPSG code.
PROJ.4 is an open source library for defining and converting between coordinate reference systems. It defines a standard way to write projection parameters.
For example, the proj.4 string: +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
encodes:
Note that there are no specified units above. This is because this geographic coordinate reference system is in latitude and longitude, which are most often recorded in decimal degrees.
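To see what the string above contains, the following base R sketch splits it into its individual parameters. This is only a text-parsing illustration: parse_proj4 is a hypothetical helper written for this example, not part of the PROJ.4 library, and it does not validate the parameters in any way.

```r
proj4_string <- "+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0"

# Hypothetical helper: split a proj.4 string into a named list of parameters
parse_proj4 <- function(x) {
  tokens <- strsplit(trimws(x), "\\s+")[[1]]
  tokens <- sub("^\\+", "", tokens)   # drop the leading "+"
  keys <- sub("=.*$", "", tokens)     # text before "="
  # Flags such as no_defs carry no value, so record them as "TRUE"
  values <- ifelse(grepl("=", tokens), sub("^[^=]*=", "", tokens), "TRUE")
  setNames(as.list(values), keys)
}

params <- parse_proj4(proj4_string)
str(params)

# There is no units parameter in this string
is.null(params$units)
```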
Two important advantages to using this option are:
The main disadvantage to this option is that it’s easy to make a mistake when you reproduce the string, accidentally changing parameters.
For more, check out Proj4 FAQs
You may find yourself in a situation where you’ll need to tell your GIS (in our case R) what the projection/CRS of your data should be. In these cases, the GIS needs to know what the projection/CRS currently is, not what you would like it to be. If you need to change a projection, you need to go through a different process, often called Re-project or Transform.
The city of London can be represented by the coordinates c(-0.1, 51.5). This means that its location is -0.1 degrees east and 51.5 degrees north of the origin. The origin in this case is at 0 degrees longitude (the Prime Meridian) and 0 degrees latitude (the Equator) in a geographic (‘lon/lat’) coordinate system (GCS).
The same point could also be approximated in a projected coordinate system (PCS) with ‘Easting/Northing’ values of c(530000, 180000) in the British National Grid (BNG), meaning that London is located 530 km East and 180 km North of the origin of the CRS (right panel).
The location of BNG’s origin, in the sea beyond the South West Peninsula, ensures that most locations in the UK have positive Easting and Northing values.
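The difference between declaring a CRS and re-projecting can be sketched with the London point. This example assumes the sf package is installed (it is not part of the session recorded for this page) and uses EPSG:4326 for lon/lat on WGS84 and EPSG:27700 for the British National Grid. The object names are illustrative.

```r
library(sf)

# Declare what the coordinates already are: lon/lat on WGS84 (EPSG:4326)
london_lonlat <- st_sfc(st_point(c(-0.1, 51.5)), crs = 4326)

# Re-project: compute new coordinates in the British National Grid (EPSG:27700)
london_bng <- st_transform(london_lonlat, crs = 27700)
st_coordinates(london_bng)

# Pitfall: replacing the CRS label does not convert the coordinates.
# sf warns about this; the numbers stay at -0.1 and 51.5.
mislabelled <- st_set_crs(london_lonlat, 27700)
st_coordinates(mislabelled)
```

The re-projected coordinates should land close to the approximate Easting/Northing values quoted above, whereas the mislabelled point would be placed a fraction of a metre from the BNG origin.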
What's the one thing you want people who are learning to handle projections for spatial data in #rstats (or in general) to know? #gistribe
— Dr. Michele M Tobias (@MicheleTobias) January 23, 2018
sessionInfo()
R version 3.4.4 (2018-03-15)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.3
Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib
locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] workflowr_1.0.1 Rcpp_0.12.18 digest_0.6.15
[4] rprojroot_1.3-2 R.methodsS3_1.7.1 backports_1.1.2
[7] git2r_0.21.0 magrittr_1.5 evaluate_0.11
[10] stringi_1.2.4 whisker_0.3-2 R.oo_1.21.0
[13] R.utils_2.6.0 rmarkdown_1.10 tools_3.4.4
[16] stringr_1.3.1 yaml_2.1.19 compiler_3.4.4
[19] htmltools_0.3.6 knitr_1.20
This reproducible R Markdown analysis was created with workflowr 1.0.1