Blogging with RStudio - an example with the 2020 KSUCrops team data
1) Script description
Hey there.
The exercise below is intended as a demonstration of how to use RStudio to create blogs and blog posts.
In fact, though we will be using RStudio as our platform, behind the scenes we are also using the following software:
- The blogdown package, which is an R interface to the Hugo static site generator
- Netlify, which is the hosting service I chose to use (free!)
I want to thank Rachel Veenstra for gathering the data we will be using. Thanks Rachel!
The intended audience is the 2020 KSUCrops team members. Thus, I decided to use and explore our team’s demographics data.
The specific objectives of this script are to:
- Import a dataset from GitHub.
- Do some quick data wrangling before each plot.
- Create plots (using the ggplot2 package) showing how our team can be categorized by gender, title, and country of origin.
- Upload the new post and build the blog.
2) Setup
Here we load all necessary packages to be used during this exercise.
If you have not installed the packages below yet, you need to install them before loading them for the first time.
To install a package, use the function install.packages("packagename").
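As a minimal sketch of that installation step, the helper below installs only the packages that are not on the machine yet. The name `install_if_missing` is made up for this illustration and is not part of any package; it relies only on base R's `installed.packages()` and `install.packages()`.

```r
# Hypothetical helper: install only the packages that are not installed yet
install_if_missing <- function(pkgs) {
  # Names of all packages currently installed in the library paths
  installed <- rownames(installed.packages())
  # Requested packages that are not in that list
  missing <- setdiff(pkgs, installed)
  if (length(missing) > 0) {
    install.packages(missing)
  }
  # Return (invisibly) the names of the packages that had to be installed
  invisible(missing)
}
```

For this post, the call would be `install_if_missing(c("RCurl", "tidyverse", "ggthemes"))`, run once in the console before the `library()` calls.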
2.1) Packages description
Package RCurl is needed to import the dataset that we will work with from my personal GitHub account.
Package tidyverse actually contains a group of packages that follow the same syntax standard. It includes the package ggplot2, which is what we will be using the most for this exercise.
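# Setting global chunk options
knitr::opts_chunk$set(echo = TRUE, fig.path = "static")
# tibble.width is an R option rather than a knitr chunk option, so it is set with options()
options(tibble.width = Inf)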
# Loading necessary packages
library(RCurl)
library(tidyverse)
library(ggthemes)
2.2) mytheme
I will be using the same plot style for all plots. Thus, it is easier to create this style once, and just reuse it.
Below, I am creating the object mytheme, which contains my plotting style choices.
mytheme <- theme_bw() +
  theme(legend.position = "none",
        axis.title.y = element_blank(),
        axis.text.y = element_blank(),
        axis.ticks.y = element_blank(),
        text = element_text(size = 20))
2.3) Data import
Let’s import the dataset.
This dataset is saved on my personal GitHub account, so we are going to use the getURL() function to download its contents as text into the df_path object.
After that point, we just pass that text to the read_csv() function, which parses it as it would any CSV file.
Here, we are saving the data into an object called team.
# Downloading the file contents as text and saving them into the object df_path
df_path <- getURL("https://raw.githubusercontent.com/leombastos/datasets/master/KSUCropsTeam.csv")
# Reading the data and saving it to an object called team
team <- read_csv(df_path)
## Rows: 24 Columns: 4
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (4): Name, Title, Country, Gender
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
The dataset has been imported.
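To see why read_csv() can work on the result of getURL(), here is a small sketch: getURL() returns the file contents as one character string, and read_csv() can parse CSV text directly. The string below is typed by hand from two rows of the team data so the example runs without an internet connection. Wrapping it in I() marks it as literal data, which assumes readr 2.0 or later.

```r
library(readr)

# CSV text typed by hand (two rows of the team data), standing in for
# the character string that getURL() would return
csv_text <- paste(
  "Name,Title,Country,Gender",
  "Adhemar,Visiting Scholar,Brazil,Male",
  "Valentina,Research Scholar,Argentina,Female",
  sep = "\n"
)

# I() tells read_csv() that this is literal data, not a file path
team_sample <- read_csv(I(csv_text), show_col_types = FALSE)
team_sample
```

read_csv() also accepts an http(s) URL as its first argument, so passing the GitHub raw URL to it directly is another option if you prefer not to depend on RCurl.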
3) Data exploration
Now, let’s explore it a bit to understand its current structure.
# Printing team
team
## # A tibble: 24 × 4
##    Name        Title            Country   Gender
##    <chr>       <chr>            <chr>     <chr>
##  1 Adhemar     Visiting Scholar Brazil    Male
##  2 Mario N.    Visiting Scholar Brazil    Male
##  3 Rafael      Visiting Scholar Brazil    Male
##  4 Lucas       Visiting Scholar Brazil    Male
##  5 Luiz Felipe Visiting Scholar Brazil    Male
##  6 Juan        Visiting Scholar Argentina Male
##  7 Axel        Visiting Scholar Argentina Male
##  8 Valentina   Research Scholar Argentina Female
##  9 Leticia     Research Scholar Brazil    Female
## 10 Constanza   Research Scholar Argentina Female
## # … with 14 more rows
This dataset has the columns Name, Title, Country, and Gender for each team member.
Now, let’s create some plots!
4) Plotting demographics
4.1) Gender
Let’s first do a plot on how our team can be categorized by Gender.
# Wrangling
team %>%
  group_by(Gender) %>%
  summarise(N = length(Name)) %>%
  # Plotting
  ggplot(aes(x = Gender, y = N, fill = Gender)) +
  geom_bar(stat = "identity", color = "black") +
  geom_text(aes(y = N - .5, label = N), size = 6) +
  mytheme
Of the 24 team members, 8 are female and 16 are male.
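The wrangling step above counts rows per group. As a small sketch of the same idea, the block below compares group_by() plus summarise() with dplyr's count() shortcut. It uses a hand-typed subset of five team members rather than the full dataset, so its counts differ from those in the plot. The object names `team_sample`, `by_summarise`, and `by_count` are made up for this illustration.

```r
library(dplyr)

# Five rows typed by hand from the printed tibble, not the full dataset
team_sample <- tibble(
  Name   = c("Adhemar", "Juan", "Valentina", "Leticia", "Constanza"),
  Gender = c("Male", "Male", "Female", "Female", "Female")
)

# Group, then count the rows in each group with n()
by_summarise <- team_sample %>%
  group_by(Gender) %>%
  summarise(N = n())

# count() does the grouping and counting in one step
by_count <- team_sample %>%
  count(Gender, name = "N")

by_summarise
by_count
```

Both objects hold one row per gender with the number of members in column N. Here n() plays the role that length(Name) plays in the post.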
4.2) Title
Now let’s do a plot on how our team can be categorized by Title.
# Wrangling
team %>%
  group_by(Title) %>%
  summarise(N = length(Name)) %>%
  # Plotting
  ggplot(aes(x = reorder(Title, desc(N)), y = N, fill = Title)) +
  geom_bar(stat = "identity", color = "black") +
  geom_text(aes(y = N - .5, label = N), size = 6) +
  labs(x = "Title") +
  mytheme +
  theme(axis.text.x = element_text(angle = 45, hjust = 1),
        plot.margin = unit(c(.2, .2, .2, 1), "cm"))
The largest Title categories are M.S. Student and Visiting Scholar, with 7 members each.
Cool!
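The bars in this plot are sorted from largest to smallest thanks to reorder(Title, desc(N)). Here is a small sketch of what that call does, using toy labels and counts that are made up for illustration (they are not the team data).

```r
library(dplyr)

# Toy labels and counts, made up for illustration
counts <- tibble(
  Title = c("Title A", "Title B", "Title C"),
  N     = c(2, 7, 5)
)

# reorder() sorts the levels of its first argument by the values of the
# second, in ascending order. desc(N) negates N, so the label with the
# largest count becomes the first level.
ordered_titles <- with(counts, reorder(Title, desc(N)))
levels(ordered_titles)
```

ggplot2 draws discrete x-axis categories in level order, which is why the reordered factor puts the tallest bar on the left.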
4.3) Country
Finally, let’s do a plot on how our team can be categorized by Country.
# Wrangling
team %>%
  group_by(Country) %>%
  summarise(N = length(Name)) %>%
  mutate(Code = c("ar", "br", "us")) %>%
  # Plotting
  ggplot(aes(x = reorder(Country, desc(N)), y = N, fill = Country)) +
  geom_bar(stat = "identity", color = "black") +
  #geom_flag(y = -1.5, aes(country = Code), size = 14)+
  geom_text(aes(y = N - .5, label = N), size = 6) +
  labs(x = "") +
  scale_fill_manual(values = c("dodgerblue1", "green4", "red2")) +
  mytheme +
  theme(plot.margin = unit(c(0.2, .2, 1, .2), "cm"))
That looks cool! Of the 24 team members, 11 are from Argentina, 10 from Brazil, and 3 from the U.S.
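In the code above, the colors passed to scale_fill_manual() are matched to countries by position, which follows the alphabetical order of the Country values. A slightly more robust option, sketched below, is to name each color after its country. The counts come from the sentence above. The label "USA" is an assumed spelling, since the exact value stored in the dataset is not shown in this post. geom_col() is used as a shorthand for geom_bar(stat = "identity").

```r
library(ggplot2)

# Counts taken from the text; "USA" is an assumed label, not checked
# against the actual dataset
country_counts <- data.frame(
  Country = c("Argentina", "Brazil", "USA"),
  N       = c(11, 10, 3)
)

# Naming each color ties it to a country explicitly,
# no matter how the countries end up being ordered
country_colors <- c(Argentina = "dodgerblue1",
                    Brazil    = "green4",
                    USA       = "red2")

p <- ggplot(country_counts,
            aes(x = reorder(Country, -N), y = N, fill = Country)) +
  geom_col(color = "black") +
  geom_text(aes(y = N - .5, label = N), size = 6) +
  scale_fill_manual(values = country_colors) +
  labs(x = "") +
  theme_bw()
p
```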
Now, let’s upload this post!
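For readers who have not set up a blog yet, here is a rough sketch of the blogdown side of the workflow. It is a generic outline under default settings, not necessarily the exact steps used for this blog: it uses blogdown's default theme rather than the Academic theme, and `post_title` is just a placeholder. The calls are wrapped in interactive() because they create files and open a browser preview, so they are meant to be run one at a time from the RStudio console.

```r
# Placeholder title for the new post (an assumption for this sketch)
post_title <- "Blogging with RStudio"

if (interactive()) {
  # One-time setup: install blogdown and the Hugo site generator
  install.packages("blogdown")
  blogdown::install_hugo()

  # Create a new site in the current (empty) project directory
  blogdown::new_site()

  # Create a new R Markdown post
  blogdown::new_post(title = post_title, ext = ".Rmd")

  # Preview the site locally while editing
  blogdown::serve_site()

  # Build the site files before publishing
  blogdown::build_site()
}
```

A common way to publish is to push the project to a GitHub repository and connect that repository to Netlify, which then rebuilds the site whenever new commits arrive.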
5) Resources
A quick Google search will give you a lot of resources for using blogdown, RStudio, Hugo, and its different themes, including the Academic theme.
Here are a couple of websites that can help you get started:
- blogdown: Creating Websites with R Markdown: This is the official guide/documentation/tutorial/etc. of the blogdown package.
- Setting up your blog with RStudio and blogdown: This is a series of three blog posts that will help you create your blog from scratch, create a post, and modify the themes.
- How to start a data blog with R, Hugo, and Blogdown in 10 short steps: Another great tutorial on how to get started.
6) Summary
This script demonstrated how to:
- Import a dataset from GitHub.
- Use the combo group_by()/summarise() to extract the number of observations within different groups.
- Use the ggplot2 package to create bar plots.
- Use RStudio to create the post and, in my case, Netlify to upload it and build the blog.
I hope you have enjoyed this post!
Please let me know in the comments below if you have any questions about this script, any suggestions to improve it, or any ideas for future posts.
Happy blogging!