# Coordinate Systems in R

## Sample files for this exercise

Data used in this exercise can be loaded into your current R session by running the following chunk of code.

load(url("http://github.com/mgimond/Spatial/raw/master/Data/Sample1.RData"))

Only two data layers will be used in this exercise: a Maine counties polygon layer (s.sf) and an elevation raster layer (elev.r). The former is in an sf format and the latter is in a raster format. You can remove the other data objects from your environment via:

rm(list=c("inter.sf", "p.sf", "rail.sf"))

## Checking for a coordinate system

To extract coordinate system (CS) information from an sf object use st_crs from the sf package; for a raster object use the crs function from the raster package.

library(sf)
st_crs(s.sf)
Coordinate Reference System:
EPSG: 26919
proj4string: "+proj=utm +zone=19 +datum=NAD83 +units=m +no_defs"
library(raster)
crs(elev.r)
CRS arguments:
+proj=utm +zone=19 +datum=NAD83 +units=m +no_defs +ellps=GRS80
+towgs84=0,0,0 

There are two ways of defining a coordinate system: via the EPSG numeric code or via the PROJ4 formatted string. The EPSG code may not always be available for a particular coordinate system, but if a spatial object has a defined coordinate system, it will always have a PROJ4 projection string. Its multi-parameter syntax is briefly discussed next.

## Understanding the Proj4 coordinate syntax

The PROJ4 syntax consists of a list of parameters, each prefixed with the + character. For example, elev.r’s CS is in a UTM projection (+proj=utm) for zone 19 (+zone=19) and in an NAD 1983 datum (+datum=NAD83). Other bits of information that can be gleaned from the projection string are the units (meters) and the underlying ellipsoid (GRS80).

A list of a few of the PROJ4 parameters used in defining a coordinate system follows. Click here for a full list of parameters.

+a         Semimajor radius of the ellipsoid axis
+b         Semiminor radius of the ellipsoid axis
+datum     Datum name
+ellps     Ellipsoid name
+lat_0     Latitude of origin
+lat_1     Latitude of first standard parallel
+lat_2     Latitude of second standard parallel
+lat_ts    Latitude of true scale
+lon_0     Central meridian
+over      Allow longitude output outside -180 to 180 range, disables wrapping
+proj      Projection name
+south     Denotes southern hemisphere UTM zone
+units     meters, US survey feet, etc.
+x_0       False easting
+y_0       False northing
+zone      UTM zone

You can view the list of available projections +proj= here.

## Assigning a coordinate system

A coordinate system definition can be passed to a spatial object. It can either fill a spatial object’s empty CS definition or it can overwrite and existing definition (the latter should only be executed if there is good reason to believe that the original definition is erroneous). Note that this step does not change an objects underlying coordinate system (this option will be discussed in the next section).

We’ll pretend that a CS definition was not assigned to s.sf and assign one manually using the st_set_crs() function.

s.sf <- st_set_crs(s.sf, "+proj=utm +zone=19 +ellps=GRS80 +datum=NAD83") 

To do the same with a raster object simply assign the PROJ4 string to the crs() function as follows (here too we’ll assume that the spatial object had a missing reference system or an incorrectly defined one).

crs(elev.r) <- "+proj=utm +zone=19 +ellps=GRS80 +datum=NAD83"

Note that we do not need to define all of the parameters so long as we know that the default values for these unused parameters are correct. Also note that we do not need to designate a hemisphere since the NAD83 datum applies only to North America.

To recreate a CS defined in a software such as ArcGIS, it is best to extract the CS’ WKID/EPSG code then use that number to look up the PROJ4 syntax on http://spatialreference.org/ref/. For example, in ArcGIS, the WKID number can be extracted from the coordinate system properties output.

That number can then be entered in the http://spatialreference.org/ref/’s search box to pull the Proj4 parameters (note that you must select Proj4 from the list of syntax options).

Here are examples of a few common projections:

Projection WKID Authority Syntax
UTM NAD 83 Zone 19N 26919 EPSG +proj=utm +zone=19 +ellps=GRS80 +datum=NAD83 +units=m +no_defs
USA Contiguous albers equal area 102003 ESRI +proj=aea +lat_1=29.5 +lat_2=45.5 +lat_0=37.5 +lon_0=-96 +x_0=0 +y_0=0 +ellps=GRS80 +datum=NAD83 +units=m +no_defs
Alaska albers equal area 3338 EPSG +proj=aea +lat_1=55 +lat_2=65 +lat_0=50 +lon_0=-154 +x_0=0 +y_0=0 +ellps=GRS80 +datum=NAD83 +units=m +no_defs
World Robinson 54030 ESRI +proj=robin +lon_0=0 +x_0=0 +y_0=0 +ellps=WGS84 +datum=WGS84 +units=m +no_defs

## Transforming coordinate systems

The last step showed you how to define or modify the coordinate system definition. This section shows you how to transform the coordinate values associated with the spatial object to a different coordinate system. For example, to transform the s.sf vector object to a geographic (lat/long) coordinate system, we’ll use sf’s st_transform function.

s.sf.gcs <- st_transform(s.sf, "+proj=longlat +datum=WGS84")
st_crs(s.sf.gcs)
Coordinate Reference System:
EPSG: 4326
proj4string: "+proj=longlat +datum=WGS84 +no_defs"

The raster object equivalent is:

elev.r.gcs <- projectRaster(elev.r, crs="+proj=longlat +datum=WGS84")
crs(elev.r.gcs)
CRS arguments:
+proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0 

A geographic coordinate system is often desired when overlapping a web based mapping service such as Google, Bing or OpenStreetMap. To check that s.sf.gcs was properly transformed, we’ll overlay it on top of an OpenStreetMap using the leaflet package.

library(leaflet)
leaflet(s.sf.gcs) %>%
addTiles()

Next, we’ll explore other transformations using a tmap dataset of the world

library(tmap)
data(World)  # The data is stored as an sf object

# Let's check its current coordinate system
st_crs(World)
Coordinate Reference System:
No EPSG code
proj4string: "+proj=eck4 +lon_0=0 +x_0=0 +y_0=0 +datum=WGS84 +units=m +no_defs"

The following chunk transforms the world map to an Azimuthal equidistant projection centered on latitude 0 and longitude 0.

World.ae <- st_transform(World, "+proj=aeqd +lat_0=0 +lon_0=0 +x_0=0 +y_0=0 +ellps=WGS84 +datum=WGS84 +units=m +no_defs")

tm_shape(World.ae) + tm_fill() 

The following chunk transforms the world map to an Azimuthal equidistant projection centered on Maine.

World.aemaine <- st_transform(World, "+proj=aeqd +lat_0=44.5 +lon_0=-69.8 +x_0=0 +y_0=0 +ellps=WGS84 +datum=WGS84 +units=m +no_defs")

tm_shape(World.aemaine) + tm_fill()  

The following chunk transforms the world map to a World Robinson projection.

World.robin <- st_transform(World,"+proj=robin +lon_0=0 +x_0=0 +y_0=0 +ellps=WGS84 +datum=WGS84 +units=m +no_defs")
tm_shape(World.robin) + tm_fill()  

The following chunk transforms the world map to a World sinusoidal projection.

World.sin <- st_transform(World,"+proj=sinu +lon_0=0 +x_0=0 +y_0=0 +ellps=WGS84 +datum=WGS84 +units=m +no_defs")
tm_shape(World.sin) + tm_fill()  

The following chunk transforms the world map to a World Mercator projection.

World.mercator <- st_transform(World,"+proj=merc +lon_0=0 +k=1 +x_0=0 +y_0=0 +ellps=WGS84 +datum=WGS84 +units=m +no_defs")
tm_shape(World.mercator) + tm_fill()  

### Example of a failed transformation

An issue that can come up when transforming spatial data is when the location of the tangent line(s) or points in the CS definition forces polygon features to be split across the 180° meridian. For example, re-centering the mercator projection to -69° will create the following map.

World.mercator2 <- st_transform(World, "+proj=merc +lon_0=-69 +k=1 +x_0=0 +y_0=0 +ellps=WGS84 +datum=WGS84 +units=m +no_defs")

tm_shape(World.mercator2) + tm_borders()

The polygons are split and R does not know how to piece them together.

One solution is to make use of maptoolsnowrapSpatialPolygons() function. This function will split the polygon at a given longitude however, it requires that the object be of Spatial* type and that it be in a geographic (lat/long) reference system. The following chunk is a sample workflow.

library(maptools)

# Convert to lat/long reference system
wld.ll <- st_transform(World, "+proj=longlat +datum=WGS84 +no_defs")

# Convert to a spatial object, then split the polygons at a given longitude (111 in this example)
wld.sp <- nowrapSpatialPolygons(as(wld.ll, "Spatial"), offset = 111)

# Now convert back to an sf object, reproject to a new longitude center at -69 degrees
# then plot it
wld.sf <- st_as_sf(wld.sp)
wld.merc2.sf <- st_transform(wld.sf, "+proj=merc +lon_0=-69 +k=1 +x_0=0 +y_0=0 +ellps=WGS84 +datum=WGS84 +units=m +no_defs")
tm_shape(wld.merc2.sf) + tm_borders()

You’ll note that the Antarctica polygon is deformed in this transformation. In such a case, you might need to remove the Antarctica polygon before proceeding with the nowrapSpatialPolygons or you can adopt another projection. In the following example, a Robinson projection is used in lieu of a Mercator.

wld.rob.sf <- st_transform(wld.sf,"+proj=robin +lon_0=-69 +x_0=0 +y_0=0 +ellps=WGS84 +datum=WGS84 +units=m +no_defs")
tm_shape(wld.rob.sf) + tm_borders()

One downside to the the nowwrapSpatialPolygons solution is the loss of the attribute table.

head(data.frame(wld.rob.sf), 4)
                        geometry
1 MULTIPOLYGON (((11566048 38...
2 MULTIPOLYGON (((8046523 -62...
3 MULTIPOLYGON (((7724367 447...
4 MULTIPOLYGON (((11102301 25...

A (not so perfect) solution is to create a centroid from the polygons prior to projecting them, then performing a spatial join of the transformed points to the transformed polygons.

# Create centroid from polygons
pt <- st_centroid(World, of_largest_polygon = TRUE)

# Transform points to the recentered Robinson projection
pt.rob <- st_transform(pt,"+proj=robin +lon_0=-69 +x_0=0 +y_0=0 +ellps=WGS84 +datum=WGS84 +units=m +no_defs")

# Perform the spatial join (joining the point attribute values to the wld.rob.sf polygons)
wld.rob.df.sf <- st_join(wld.rob.sf, pt.rob, join = st_contains)

# Map the output
tm_shape(wld.rob.df.sf) + tm_polygons(col="pop_est_dens", style="quantile") +
tm_legend(outside=TRUE)

While this solution appears to work for most polygons a few, such as Norway, have missing values. To see why, let’s zoom in on Norway using the unprojected layers and overlapping the map with the centroids.

library(dplyr)

# Extract the extent for the Norwar/Sweden region
nor.bb <- World %>% filter(name == "Norway" | name == "Sweden") %>% st_bbox()

# Plot the data zoomed in on the region. Add the point layer for reference
tm_shape(World, bbox=nor.bb) + tm_polygons(col="pop_est_dens", style="quantile") +
tm_shape(pt) + tm_dots() +
tm_text("name", just="left", xmod=0.5, size=0.8) +
tm_legend(outside=TRUE)

You’ll notice that the Norway point falls inside the Sweden polygon. This is a result of the “C” shaped nature of Norway and st_centroid’s use of the geometric center in computing the centroid and not the “center of mass”.

An alternate solution is the use of st_point_on_surface() which will place a point inside the polygons.

pt <- st_point_on_surface(World)

pt.rob <- st_transform(pt,"+proj=robin +lon_0=-69 +x_0=0 +y_0=0 +ellps=WGS84 +datum=WGS84 +units=m +no_defs")
wld.rob.df.sf <- st_join(wld.rob.sf, pt.rob, join = st_contains)
tm_shape(wld.rob.df.sf) + tm_polygons(col="pop_est_dens", style="quantile") +
tm_legend(outside=TRUE)

While in theory, a point completely enclosed by a bounded area should always remain bounded by that area in any projection, this is not always the case in practice. This is because the transformation applies to the vertices that define the line segments and not the lines themselves. So if a point is inside of a polygon and very close to one of its boundaries in its native projection, it may find itself on the other side of that line segment in another projection hence outside of that polygon. In the following example, a polygon layer and point layer are created in a Miller coordinate system where the points are enclosed in the polygons.

# Define a dew projections
miller <- "+proj=mill +lat_0=0 +lon_0=0 +x_0=0 +y_0=0 +ellps=WGS84 +datum=WGS84 +units=m +no_defs"
lambert <- "+proj=lcc +lat_1=20 +lat_2=60 +lat_0=40 +lon_0=-96 +x_0=0 +y_0=0 +ellps=GRS80 +datum=NAD83 +units=m +no_defs"

# Subset the World data layer
wld.mil <-  World %>% filter( iso_a3  == "CAN" |  iso_a3 == "USA") %>% st_transform( miller)

# Create polygon and point layers in the Miller projection
sf1 <- st_sfc( st_polygon(list(cbind(c(-13340256,-13340256,-6661069, -6661069, -13340256),
c(7713751, 5326023, 5326023,7713751, 7713751 )))), crs = miller)

pt1 <- st_sfc( st_multipoint(rbind(c(-11688500,7633570), c(-11688500,5375780),
c(-10018800,7633570), c(-10018800,5375780),
c(-8348960,7633570), c(-8348960,5375780))),  crs = miller)
pt1 <- st_cast(pt1, "POINT") # Create single part points

# Plot the data layers in their native projection
tm_shape(wld.mil) +tm_fill(col="grey") +
tm_graticules(x = c(-60,-80,-100, -120, -140),
y = c(30,45, 60),
labels.col = "white", col="grey90") +
tm_shape(sf1) + tm_polygons("red", alpha = 0.5, border.col = "yellow") +
tm_shape(pt1) + tm_dots(size=0.2) 

The points are close to the boundaries, but are inside of the polygon nonetheless. To confirm, we can run st_contains on the dataset:

st_contains(sf1, pt1)
Sparse geometry binary predicate list of length 1, where the predicate was contains'
1: 1, 2, 3, 4, 5, 6

All six points are selected, as expected.

Now, let’s reproject the data into a Lambert conformal projection.

# Transform the data
wld.lam <- st_transform(wld.mil, lambert)
pt1.lam <- st_transform(pt1, lambert)
sf1.lam <- st_transform(sf1, lambert)

# Plot the data in the Lambert coordinate system
tm_shape(wld.lam) +tm_fill(col="grey") +
tm_graticules( x = c(-60,-80,-100, -120, -140),
y = c(30,45, 60),
labels.col = "white", col="grey90") +
tm_shape(sf1.lam) + tm_polygons("red", alpha = 0.5, border.col = "yellow") +
tm_shape(pt1.lam) + tm_dots(size=0.2)   

Only three of the points are contained. We can confirm this using the st_contains function:

st_contains(sf1.lam, pt1.lam)
Sparse geometry binary predicate list of length 1, where the predicate was contains'
1: 1, 3, 5

To resolve this problem, one should densify the the polygon by adding more vertices along the line segment. The vertices density will be dictated by the resolution needed to preserve the map’s containment properties and is best determined experimentally.

We’ll use the st_segmentize function to create vertices at 1 km (1000 m) intervals.

# Add vertices every 1000 meters along the polygon's line segments
sf2 <- st_segmentize(sf1, 1000)

# Transform the newly densified polygon layer
sf2.lam <- st_transform(sf2, lambert)

# Plot the data
tm_shape(wld.lam) + tm_fill(col="grey") +
tm_graticules( x = c(-60,-80,-100, -120, -140),
y = c(30,45, 60),
labels.col = "white", col="grey90") +
tm_shape(sf2.lam) + tm_polygons("red", alpha = 0.5, border.col = "yellow") +
tm_shape(pt1.lam) + tm_dots(size=0.2) 

Now all points remain contained by the polygon. We can check via:

st_contains(sf2.lam, pt1.lam)
Sparse geometry binary predicate list of length 1, where the predicate was contains'
1: 1, 2, 3, 4, 5, 6

## Creating Tissot indicatrix circles

Most projections will distort some aspect of a spatial property, especially area and shape. A nice way to visualize the distortion afforded by a projection is to create geodesic circles.

First, create a point layer that will define the circle centers in a lat/long coordinate system.

tissot.pt <- st_sfc( st_multipoint(rbind(c(-60,30), c(-60,45), c(-60,60),
c(-80,30), c(-80,45), c(-80,60),
c(-100,30), c(-100,45), c(-100,60),
c(-120,30), c(-120,45), c(-120,60) )),  crs = "+proj=longlat")
tissot.pt <- st_cast(tissot.pt, "POINT") # Create single part points

Next we’ll construct geodesic circles from these points using the geosphere package.

library(geosphere)

cr.pt <- list() # Create an empty list

# Loop through each point in tissot.pt and generate 360 vertices at 300 km
# from each point in all directions at 1 degree increment. These vertices
# will be used to approximate the Tissot circles
for (i in 1:length(tissot.pt)){
cr.pt[[i]] <- list( destPoint( as(tissot.pt[i], "Spatial"), b=seq(0,360,1), d=300000) )
}

# Create a closed polygon from the previously generated vertices
tissot.sfc <- st_cast( st_sfc(st_multipolygon(cr.pt ),crs = "+proj=longlat"), "POLYGON" )

We’ll check that these are indeed geodesic circles by computing the geodesic area of each polygon. We’ll use the st_area function from sf which will revert to geodesic area calculation if a lat/long coordinate system is present.

tissot.sf <- st_sf( geoArea =  st_area(tissot.sfc), tissot.sfc )

The true area of the circles should be $$\pi * r^2$$ or 2.827433410^{11} square meters in our example. Let’s compute the error in the tissot output. The values will be reported as fractions.

( (pi * 300000^2) -  as.vector(tissot.sf$geoArea) ) / (pi * 300000^2)  [1] 0.0002356458 0.0002350269 0.0002344091 0.0002356458 0.0002350269 [6] 0.0002344091 0.0002356458 0.0002350269 0.0002344091 0.0002356458 [11] 0.0002350269 0.0002344091 In all cases, the error is less than 0.1%. The error is primarily due to the discretization of the circle parameter. Let’s now take a look at the distortions associated with a few popular coordinate systems. We’ll start by exploring the Mercator projection. # Transform geodesic circles and compute area error as a percentage tissot.merc <- st_transform(tissot.sf, "+proj=merc +ellps=WGS84") tissot.merc$area_err <- round((st_area(tissot.merc, tissot.merc$geoArea)) / tissot.merc$geoArea * 100 , 2)

# Plot the map
tm_shape(World, bbox = st_bbox(tissot.merc), projection = st_crs(tissot.merc)) +
tm_borders() +
tm_shape(tissot.merc) + tm_polygons(col="grey", border.col = "red", alpha = 0.3) +
tm_graticules(x = c(-60,-80,-100, -120, -140),
y = c(30,45, 60),
labels.col = "white", col="grey80") +
tm_text("area_err", size=.8, alpha=0.8, col="blue")

The mercator projection does a good job at preserving shape, but the area’s distortion increases dramatically poleward.

Next, we’ll explore the Lambert azimuthal equal area projection centered at 45 degrees north and 100 degrees west.

# Transform geodesic circles and compute area error as a percentage
tissot.laea <- st_transform(tissot.sf, "+proj=laea +lat_0=45 +lon_0=-100 +ellps=WGS84")
tissot.laea$area_err <- round( (st_area(tissot.laea ) - tissot.laea$geoArea) /
tissot.laea$geoArea * 100, 2) # Plot the map tm_shape(World, bbox = st_bbox(tissot.laea), projection = st_crs(tissot.laea)) + tm_borders() + tm_shape(tissot.laea) + tm_polygons(col="grey", border.col = "red", alpha = 0.3) + tm_graticules(x=c(-60,-80,-100, -120, -140), y = c(30,45, 60), labels.col = "white", col="grey80") + tm_text("area_err", size=.8, alpha=0.8, col="blue") The area error across the 48 states is near 0. But note that the shape does become distorted as we move away from the center of projection. Next, we’ll explore the Robinson projection. # Transform geodesic circles and compute area error as a percentage tissot.robin <- st_transform(tissot.sf, "+proj=robin +ellps=WGS84") tissot.robin$area_err <- round(  (st_area(tissot.robin ) - tissot.robin$geoArea) / tissot.robin$geoArea * 100, 2)

# Plot the map
tm_shape(World, bbox = st_bbox(tissot.robin), projection = st_crs(tissot.robin)) +
tm_borders() +
tm_shape(tissot.robin) + tm_polygons(col="grey", border.col = "red", alpha = 0.3) +
tm_graticules(x=c(-60,-80,-100, -120, -140),
y = c(30,45, 60),
labels.col = "white", col="grey80") +
tm_text("area_err", size=.8, alpha=0.8, col="blue")`

Both shape and area are measurably distorted for the north american continent.