Nebraska state and counties boundary map with POI and spatial data display

UPDATED EDIT (Nov 2020)

This post originally utilized the packages sp (for sp spatial object classes) and tigris for US states and counties shapefiles.

Recently, an updated version of the package tigris changed its behavior to import shapefiles as sf objects (instead of its past behavior of treating shapefiles as sp objects), which caused the code below to break.

Also, the package tigris has had issues to load in my machine after updating my macOS to Big Sur.

Therefore, the post below has been revamped to use the package USAboundaries instead of tigris, and while we’re at it, to use the package sf and its objects instead of sp.

These changes will make the code more up-to-date with recent package developments, and hopefully avoid code breaks in your sessions too.

Initial notes

If you would like to run this script on your own, you may copy and paste the “code blocks” into your R session. Code blocks are blocks containing both functions and comments. You can copy them by clicking on the “Copy Code” button that appears on the right-hand corner of the code box.

1) Script description

This script was written with the intent of teaching University of Nebraska-Lincoln Agronomy and Horticulture R Club participants how to create high-quality maps for their research.

The specific objectives of this script are to:

  1. Create a map of Nebraska geographic borders, including county borders.

  2. Include in the map points and labels representing a point of interest (e.g. experimental station).

  3. Display spatial quantitative data on the map.

2) Setup

Here we load all necessary packages to be used during this exercise.

All the packages below need to be installed the first time if you haven’t yet.

To install a package, use the function install.packages(“packagename”).

knitr::opts_chunk$set(echo = TRUE, fig.path = "static")

# Loading necessary packages
library(tidyverse)
library(USAboundaries)
library(sf)
library(ggthemes)

3) Creating shapefiles

We need to create three shapefiles: one containing the Nebraska state borders polygon, one containing the Nebraska counties borders polygons, and one containing the study location points.

3.1) Nebraska state shapefile

First, we will use the function states() from tigris package, which contains the boundary of all US states, and save it to an object called states.

# Creating an object with Nebraska
nebraska <- us_states(states = "NE")

Let’s check our object nebraska.

# Checking nebraska class
class(nebraska)
## [1] "sf"         "data.frame"
# Checking nebraska data 
nebraska
## Simple feature collection with 1 feature and 12 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: -104.0532 ymin: 40 xmax: -95.30829 ymax: 43.00077
## Geodetic CRS:  WGS 84
##    statefp  statens    affgeoid geoid stusps     name lsad        aland
## 45      31 01779792 0400000US31    31     NE Nebraska   00 198957373531
##        awater state_name state_abbr jurisdiction_type
## 45 1371117957   Nebraska         NE             state
##                          geometry
## 45 MULTIPOLYGON (((-104.0531 4...

The object nebraska is of class “sf”, with a default geographic coordinate reference system (crs) of WGS 84.

# Plotting nebraska
nebraska %>%
  ggplot()+
  geom_sf()

After plotting it, we confirm that it matches what we need.

3.2) Nebraska counties shapefile

Next, let’s create a shapefile containing Nebraska counties boundaries and save it to an object called ne_counties.

ne_counties <- us_counties(states="NE")

Let’s check our object ne_counties.

# Checking ne_counties class
class(ne_counties)
## [1] "sf"         "data.frame"
# Plotting ne_counties
ne_counties %>%
  ggplot()+
  geom_sf()

The object ne_counties is also of class “sf”. After plotting it, we confirm that it also matches what we need.

3.3) Studies location (POI) shapefile

Let’s assume that you have conducted field studies, surveys, or any other type of data collection at different places, and you wish to show these locations in a map.

In order to add points to the map that represent different locations, we will need to find their spatial coordinates, expressed in decimal degrees of longitude and latitude.

You can find their coordinates by locating them on Google maps, right-clicking the location and selecting “What’s here?”. This will prompt a dialog box on the bottom part of the map that includes the coordinates. Copy and past those coordinates into R, and save them to an object. See example below for the locations of Lincoln and North Platte, which are saved to an object called studies.

# Creating a data frame with long and lat coordinates
studies <- data.frame(Location = c("Lincoln", "North Platte"),
                    long = c(-96.664768,-100.775976), 
                    lat = c(40.835995,41.121288)) 
#long and lat coordinates were manually obtained via google maps

The object studies should contain three columns: one with the study location names, and the two others containing the site’s long and lat information. Let’s print it to check if it matches what we need.

# Checking our dataset
studies
##       Location       long      lat
## 1      Lincoln  -96.66477 40.83599
## 2 North Platte -100.77598 41.12129

It matches, great! Now, let’s check what is the class of the object studies.

# Checking class
class(studies)
## [1] "data.frame"

Note how the object studies is of class “data.frame”. Therefore, at this point, R does not know that columns long and lat actually represent spatial coordinates.

We can fix that by using the function st_as_sf() from package sf, and specifying a coordinate reference system (crs).

# Transforming studies into spatial object
studies <- st_as_sf(studies, 
                    coords = c("long", "lat"), 
                    crs = 4326)

# Checking studies class
class(studies)
## [1] "sf"         "data.frame"

Note how now the object studies is of type “sf”, matching the same class type as the shapefiles created in previous steps.

Just to make sure if they make sense, let’s plot it.

studies %>%
  ggplot()+
  geom_sf()

Although not pretty, the plot makes sense, at least without any other contextual layer.

Now, let’s assume that you have measured some type of data, have summarized it at the county level, and wish to display this data on a map.

I myself don’t have this type of data, so I decided to simulate some fake, random dataset in order to demonstrate how to plot it.

4) Simulating spatial data

Let’s create a fake spatial dataset with a given response variable of interest. I chose to create fake data about cover crop adoption, as a percentage, for each county in Nebraska.

For that, we need to simulate one fake data observation value for each of the counties in Nebraska. Let’s check how many counties there are in Nebraska by finding the number of rows in the object ne_counties.

# Let's check how many counties there are
nrow(ne_counties)
## [1] 93

There are 93 rows in the dataset, which means that we need to simulate 93 data points, one for each county.

In order to do that, I will use the function runif(), and ask it to generate 93 (nrow(ne_counties)) observations varying from 0 to 100, and save this to an object called cc_pct.

# Creating 93 fake percentages of cover crop adoption
cc_pct <- runif(nrow(ne_counties), 0, 100)

# Print first observations of cc_pct
cc_pct %>% 
  head()
## [1] 55.89592 36.56441 89.73623 59.88501 30.37012 27.02541

We see how the object cc_pct is a vector with 93 observations, meaning that by itself, this object does not contain any other information needed to tell where it is in space. The spatial information at the county level is found in the object ne_counties.

Therefore, we need to save the results of cc_pct as a new column in the data of ne_counties.

# Assigning these values to a colum in the counties df
ne_counties$CC_pct <- cc_pct

# Printing ne_counties first observations
ne_counties %>% 
  head()
## Simple feature collection with 6 features and 16 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: -102.0678 ymin: 40.00157 xmax: -97.36812 ymax: 42.09623
## Geodetic CRS:  WGS 84
##     statefp countyfp countyns       affgeoid geoid       name          namelsad
## 19       31      079 00835861 0500000US31079 31079       Hall       Hall County
## 30       31      075 00835859 0500000US31075 31075      Grant      Grant County
## 47       31      143 00835893 0500000US31143 31143       Polk       Polk County
## 49       31      035 00835839 0500000US31035 31035       Clay       Clay County
## 100      31      145 00835894 0500000US31145 31145 Red Willow Red Willow County
## 111      31      101 00835872 0500000US31101 31101      Keith      Keith County
##     stusps state_name lsad      aland    awater state_name.1 state_abbr
## 19      NE   Nebraska   06 1415143418  14903733     Nebraska         NE
## 30      NE   Nebraska   06 2012381803  16500663     Nebraska         NE
## 47      NE   Nebraska   06 1136152511   5104692     Nebraska         NE
## 49      NE   Nebraska   06 1482222881   3234877     Nebraska         NE
## 100     NE   Nebraska   06 1857036304   2591016     Nebraska         NE
## 111     NE   Nebraska   06 2749903240 124629167     Nebraska         NE
##     jurisdiction_type                       geometry   CC_pct
## 19              state MULTIPOLYGON (((-98.72198 4... 55.89592
## 30              state MULTIPOLYGON (((-102.0666 4... 36.56441
## 47              state MULTIPOLYGON (((-97.82826 4... 89.73623
## 49              state MULTIPOLYGON (((-98.2781 40... 59.88501
## 100             state MULTIPOLYGON (((-100.7584 4... 30.37012
## 111             state MULTIPOLYGON (((-102.0556 4... 27.02541

The last column of the object ne_counties dataset now contains information on the cover crop adoption (CC_pct). Perfect!

We are ready now to put it all together in a plot.

5) Plotting it all

As we’ve seen on previous plots, sf objects are also data.frame objects and because of that they fit right into ggplot calls, which is not the case for sp objects. This is perhaps one of the main reasons to switch to sf!

ggplot()+
  # Adding nebraska state outline
  geom_sf(data=nebraska, 
          fill=NA, 
          color="black", 
          size=.8)+
  # Adding nebraska counties outline
  geom_sf(data=ne_counties, 
          color="black", 
          size=.4, 
          aes(fill=CC_pct))+
  # Adding studies locations
  geom_sf(data=studies, 
          size=2, shape=21, 
          color="black", 
          fill="blue",
          show.legend = F)+
  # Changing counties fill colors
  scale_fill_gradient(low="red", high = "green")+
  # Adding studies name
  geom_sf_label(data=studies, 
                aes(label=Location),
                hjust=-0.15, 
                vjust=0, 
                size=3, 
                label.padding = unit(0.1, "lines"))+
  # Specifying map coordinate style and theme
  coord_sf()+
  theme_map()+
  # Changing fill and plot title labels
  labs(fill="Cover Crop \nAdoption (%)", 
       title="Cover Crop Adoption for each County in Nebraska \n (MADE-UP DATA!)")+
  # Final adjustments on theme
  theme(legend.position = c(.01,-.16),
        legend.direction = "horizontal",
        plot.title = element_text(size=15, hjust = 0.5))

That’s it! Now you are ready to apply what you have learned to your own datasets and wow your audience at your next conference talk!

6) Summary

This script demonstrated how to:

  • Import state shapefile for a specific US state
  • Import a state counties shapefile
  • Find locations’ spatial coordinates and save them to an object
  • Simulate fake data and append it to a spatial object
  • Create publication-ready, stunning maps with all of these data

I hope you have enjoyed this post!
Please let me know in the comments below if you have any questions about this script, any suggestions to improve it, and any suggestions for future posts.

Happy coding!

Leonardo M. Bastos
Leonardo M. Bastos
Assistant Professor, Integrative Precision Agriculture
comments powered by Disqus