Nebraska state and counties boundary map with POI and spatial data display
UPDATED EDIT (Nov 2020)
This post originally utilized the packages sp
(for sp spatial object classes) and tigris
for US states and counties shapefiles.
Recently, an updated version of the package tigris
changed its behavior to import shapefiles as sf
objects (instead of its past behavior of treating shapefiles as sp
objects), which caused the code below to break.
Also, the package tigris
has had issues to load in my machine after updating my macOS to Big Sur.
Therefore, the post below has been revamped to use the package USAboundaries
instead of tigris, and while we’re at it, to use the package sf
and its objects instead of sp.
These changes will make the code more up-to-date with recent package developments, and hopefully avoid code breaks in your sessions too.
Initial notes
If you would like to run this script on your own, you may copy and paste the “code blocks” into your R session. Code blocks are blocks containing both functions and comments. You can copy them by clicking on the “Copy Code” button that appears on the right-hand corner of the code box.
1) Script description
This script was written with the intent of teaching University of Nebraska-Lincoln Agronomy and Horticulture R Club participants how to create high-quality maps for their research.
The specific objectives of this script are to:
Create a map of Nebraska geographic borders, including county borders.
Include in the map points and labels representing a point of interest (e.g. experimental station).
Display spatial quantitative data on the map.
2) Setup
Here we load all necessary packages to be used during this exercise.
All the packages below need to be installed the first time if you haven’t yet.
To install a package, use the function install.packages(“packagename”).
knitr::opts_chunk$set(echo = TRUE, fig.path = "static")
# Loading necessary packages
library(tidyverse)
library(USAboundaries)
library(sf)
library(ggthemes)
3) Creating shapefiles
We need to create three shapefiles: one containing the Nebraska state borders polygon, one containing the Nebraska counties borders polygons, and one containing the study location points.
3.1) Nebraska state shapefile
First, we will use the function states() from tigris package, which contains the boundary of all US states, and save it to an object called states.
# Creating an object with Nebraska
nebraska <- us_states(states = "NE")
Let’s check our object nebraska.
# Checking nebraska class
class(nebraska)
## [1] "sf" "data.frame"
# Checking nebraska data
nebraska
## Simple feature collection with 1 feature and 12 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: -104.0532 ymin: 40 xmax: -95.30829 ymax: 43.00077
## Geodetic CRS: WGS 84
## statefp statens affgeoid geoid stusps name lsad aland
## 45 31 01779792 0400000US31 31 NE Nebraska 00 198957373531
## awater state_name state_abbr jurisdiction_type
## 45 1371117957 Nebraska NE state
## geometry
## 45 MULTIPOLYGON (((-104.0531 4...
The object nebraska is of class “sf”, with a default geographic coordinate reference system (crs) of WGS 84.
# Plotting nebraska
nebraska %>%
ggplot()+
geom_sf()
After plotting it, we confirm that it matches what we need.
3.2) Nebraska counties shapefile
Next, let’s create a shapefile containing Nebraska counties boundaries and save it to an object called ne_counties.
ne_counties <- us_counties(states="NE")
Let’s check our object ne_counties.
# Checking ne_counties class
class(ne_counties)
## [1] "sf" "data.frame"
# Plotting ne_counties
ne_counties %>%
ggplot()+
geom_sf()
The object ne_counties is also of class “sf”. After plotting it, we confirm that it also matches what we need.
3.3) Studies location (POI) shapefile
Let’s assume that you have conducted field studies, surveys, or any other type of data collection at different places, and you wish to show these locations in a map.
In order to add points to the map that represent different locations, we will need to find their spatial coordinates, expressed in decimal degrees of longitude and latitude.
You can find their coordinates by locating them on Google maps, right-clicking the location and selecting “What’s here?”. This will prompt a dialog box on the bottom part of the map that includes the coordinates. Copy and past those coordinates into R, and save them to an object. See example below for the locations of Lincoln and North Platte, which are saved to an object called studies.
# Creating a data frame with long and lat coordinates
studies <- data.frame(Location = c("Lincoln", "North Platte"),
long = c(-96.664768,-100.775976),
lat = c(40.835995,41.121288))
#long and lat coordinates were manually obtained via google maps
The object studies should contain three columns: one with the study location names, and the two others containing the site’s long and lat information. Let’s print it to check if it matches what we need.
# Checking our dataset
studies
## Location long lat
## 1 Lincoln -96.66477 40.83599
## 2 North Platte -100.77598 41.12129
It matches, great! Now, let’s check what is the class of the object studies.
# Checking class
class(studies)
## [1] "data.frame"
Note how the object studies is of class “data.frame”. Therefore, at this point, R does not know that columns long and lat actually represent spatial coordinates.
We can fix that by using the function st_as_sf() from package sf, and specifying a coordinate reference system (crs).
# Transforming studies into spatial object
studies <- st_as_sf(studies,
coords = c("long", "lat"),
crs = 4326)
# Checking studies class
class(studies)
## [1] "sf" "data.frame"
Note how now the object studies is of type “sf”, matching the same class type as the shapefiles created in previous steps.
Just to make sure if they make sense, let’s plot it.
studies %>%
ggplot()+
geom_sf()
Although not pretty, the plot makes sense, at least without any other contextual layer.
Now, let’s assume that you have measured some type of data, have summarized it at the county level, and wish to display this data on a map.
I myself don’t have this type of data, so I decided to simulate some fake, random dataset in order to demonstrate how to plot it.
4) Simulating spatial data
Let’s create a fake spatial dataset with a given response variable of interest. I chose to create fake data about cover crop adoption, as a percentage, for each county in Nebraska.
For that, we need to simulate one fake data observation value for each of the counties in Nebraska. Let’s check how many counties there are in Nebraska by finding the number of rows in the object ne_counties.
# Let's check how many counties there are
nrow(ne_counties)
## [1] 93
There are 93 rows in the dataset, which means that we need to simulate 93 data points, one for each county.
In order to do that, I will use the function runif(), and ask it to generate 93 (nrow(ne_counties)) observations varying from 0 to 100, and save this to an object called cc_pct.
# Creating 93 fake percentages of cover crop adoption
cc_pct <- runif(nrow(ne_counties), 0, 100)
# Print first observations of cc_pct
cc_pct %>%
head()
## [1] 55.89592 36.56441 89.73623 59.88501 30.37012 27.02541
We see how the object cc_pct is a vector with 93 observations, meaning that by itself, this object does not contain any other information needed to tell where it is in space. The spatial information at the county level is found in the object ne_counties.
Therefore, we need to save the results of cc_pct as a new column in the data of ne_counties.
# Assigning these values to a colum in the counties df
ne_counties$CC_pct <- cc_pct
# Printing ne_counties first observations
ne_counties %>%
head()
## Simple feature collection with 6 features and 16 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: -102.0678 ymin: 40.00157 xmax: -97.36812 ymax: 42.09623
## Geodetic CRS: WGS 84
## statefp countyfp countyns affgeoid geoid name namelsad
## 19 31 079 00835861 0500000US31079 31079 Hall Hall County
## 30 31 075 00835859 0500000US31075 31075 Grant Grant County
## 47 31 143 00835893 0500000US31143 31143 Polk Polk County
## 49 31 035 00835839 0500000US31035 31035 Clay Clay County
## 100 31 145 00835894 0500000US31145 31145 Red Willow Red Willow County
## 111 31 101 00835872 0500000US31101 31101 Keith Keith County
## stusps state_name lsad aland awater state_name.1 state_abbr
## 19 NE Nebraska 06 1415143418 14903733 Nebraska NE
## 30 NE Nebraska 06 2012381803 16500663 Nebraska NE
## 47 NE Nebraska 06 1136152511 5104692 Nebraska NE
## 49 NE Nebraska 06 1482222881 3234877 Nebraska NE
## 100 NE Nebraska 06 1857036304 2591016 Nebraska NE
## 111 NE Nebraska 06 2749903240 124629167 Nebraska NE
## jurisdiction_type geometry CC_pct
## 19 state MULTIPOLYGON (((-98.72198 4... 55.89592
## 30 state MULTIPOLYGON (((-102.0666 4... 36.56441
## 47 state MULTIPOLYGON (((-97.82826 4... 89.73623
## 49 state MULTIPOLYGON (((-98.2781 40... 59.88501
## 100 state MULTIPOLYGON (((-100.7584 4... 30.37012
## 111 state MULTIPOLYGON (((-102.0556 4... 27.02541
The last column of the object ne_counties dataset now contains information on the cover crop adoption (CC_pct). Perfect!
We are ready now to put it all together in a plot.
5) Plotting it all
As we’ve seen on previous plots, sf objects are also data.frame objects and because of that they fit right into ggplot calls, which is not the case for sp objects. This is perhaps one of the main reasons to switch to sf!
ggplot()+
# Adding nebraska state outline
geom_sf(data=nebraska,
fill=NA,
color="black",
size=.8)+
# Adding nebraska counties outline
geom_sf(data=ne_counties,
color="black",
size=.4,
aes(fill=CC_pct))+
# Adding studies locations
geom_sf(data=studies,
size=2, shape=21,
color="black",
fill="blue",
show.legend = F)+
# Changing counties fill colors
scale_fill_gradient(low="red", high = "green")+
# Adding studies name
geom_sf_label(data=studies,
aes(label=Location),
hjust=-0.15,
vjust=0,
size=3,
label.padding = unit(0.1, "lines"))+
# Specifying map coordinate style and theme
coord_sf()+
theme_map()+
# Changing fill and plot title labels
labs(fill="Cover Crop \nAdoption (%)",
title="Cover Crop Adoption for each County in Nebraska \n (MADE-UP DATA!)")+
# Final adjustments on theme
theme(legend.position = c(.01,-.16),
legend.direction = "horizontal",
plot.title = element_text(size=15, hjust = 0.5))
That’s it! Now you are ready to apply what you have learned to your own datasets and wow your audience at your next conference talk!
6) Summary
This script demonstrated how to:
- Import state shapefile for a specific US state
- Import a state counties shapefile
- Find locations’ spatial coordinates and save them to an object
- Simulate fake data and append it to a spatial object
- Create publication-ready, stunning maps with all of these data
I hope you have enjoyed this post!
Please let me know in the comments below if you have any questions about this script, any suggestions to improve it, and any suggestions for future posts.
Happy coding!