Creating Custom Dotplots with ggplot2: A Step-by-Step Guide to Displaying Quartiles by Gender

Creating a Dotplot with ggplot2 to Display Quartiles for Each Person Broken Down by Gender

In this article, we’ll explore how to create a dotplot using ggplot2 in R that displays quartiles for each person broken down by gender. We’ll break down the steps required to achieve this and provide examples along the way.

Background: Understanding ggplot2 and Dotplots

ggplot2 is a popular data visualization library in R that provides a grammar of graphics. It allows you to create complex visualizations using a modular approach, where each layer is added separately. A dotplot is a type of plot used to display the distribution of a variable.
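
To make the layered approach concrete, here is a minimal sketch that builds a plot from two layers. It uses the built-in mtcars dataset purely as a stand-in (it is not the data used in this article), and the choice of a linear trend line is only for illustration.

```r
library(ggplot2)

# mtcars ships with R; it is a stand-in dataset, not the article's data
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +     # data and aesthetic mappings
  geom_point() +                                # layer 1: raw observations
  geom_smooth(method = "lm", formula = y ~ x)   # layer 2: fitted trend line

print(p)
```

Each `+` adds one component to the plot object, which is why the dotplot later in this article can be assembled piece by piece.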

The Problem: Creating a Custom Dotplot

We have a dataset that gives, for each quartile, the percentage of people who are male. We want to create a dotplot with one panel per quartile (lower, lower middle, upper middle, and top), in which each dot stands for one person and is colored by gender.

Here’s an example of what our data might look like:

Quartile             Gender
LowerQuartile        Male
LowerMiddleQuartile  Male
UpperMiddleQuartile  Male
TopQuartile          Male
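
The plotting code in this article works on a long table named a, with one row per quartile and gender and a count column named value. As a rough sketch of what such a table could look like, the snippet below builds one in base R. The percentages in men are made up for illustration and do not come from any real dataset.

```r
quartiles <- c("LowerQuartile", "LowerMiddleQuartile",
               "UpperMiddleQuartile", "TopQuartile")

# Hypothetical percentage of men in each quartile (illustrative values only)
men <- c(45, 52, 58, 61)

a <- data.frame(
  # Factor levels keep the quartiles in their natural order when faceting
  quartile = factor(rep(quartiles, each = 2), levels = quartiles),
  gender   = rep(c("men", "women"), times = 4),
  # Interleave the counts: men, women, men, women, ...
  value    = as.vector(rbind(men, 100 - men))
)

print(a)
```

Because the two counts in each quartile sum to 100, every quartile will later fill exactly a 10 by 10 grid of dots.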

The Solution: Using uncount and Custom Geometry

We can solve this problem by using the uncount function from tidyr (it is not part of ggplot2) to create a separate row for each dot. Then, we can set shape = 15 in geom_point to draw a filled square instead of the default circle.
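
To see what uncount does on its own, consider this small sketch. The data frame counts and its values are hypothetical and chosen only to keep the output short.

```r
library(tidyr)

# Hypothetical counts: 3 men and 2 women
counts <- data.frame(gender = c("men", "women"), value = c(3, 2))

# uncount() repeats each row `value` times and, by default, drops the
# weight column afterwards
dots <- uncount(counts, value)

print(dots)
```

The result has one row per person, which is the shape geom_point needs in order to draw one dot per person.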

Here’s an example code snippet that achieves this:

# Assumes `a` has columns quartile, gender and value (a whole-number count)
library(dplyr)
library(tidyr)
library(ggplot2)

a %>% uncount(value) %>%                        # one row per dot
  group_by(quartile) %>%
  mutate(row = (row_number() - 1) %/% 10 + 1,   # fill a grid 10 dots wide
         col = (row_number() - 1) %% 10 + 1) %>%
  ggplot() +
  aes(col, row, color = gender) +
  geom_point(shape = 15) + # square shape
  facet_grid(~quartile) + # add facet for quartiles
  coord_equal() +
  theme(axis.ticks.x = element_blank(), axis.ticks.y = element_blank(),
        axis.text.x = element_blank(), axis.text.y = element_blank(),
        axis.title.x = element_blank(), axis.title.y = element_blank())
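
The row and col expressions in the snippet above turn a running index into grid coordinates, ten dots per row. A small base R sketch of that arithmetic, using an arbitrary index range of 1 to 25:

```r
i <- 1:25  # running index of each dot within a quartile

grid <- data.frame(
  i   = i,
  row = (i - 1) %/% 10 + 1,  # integer division: moves up one row every 10 dots
  col = (i - 1) %% 10 + 1    # remainder: position within the row, 1 to 10
)

# Inspect the first dot, the end of row 1, the start of row 2, and the last dot
print(grid[c(1, 10, 11, 25), ])
```

Subtracting 1 before dividing and adding 1 afterwards keeps both coordinates 1-based, so the tenth dot stays in the first row instead of spilling into the second.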

Custom Geometry with geom_polygon

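Another approach is to draw each square with the geom_polygon layer. geom_polygon connects the vertices it is given, so every square needs its four corner points plus a group aesthetic that tells ggplot2 which points belong to the same shape.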

Here’s an example code snippet that uses geom_polygon:

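# Assumes `a` has columns quartile, gender and value (a whole-number count)
library(dplyr)
library(tidyr)
library(ggplot2)

a %>% uncount(value) %>%
  group_by(quartile) %>%
  mutate(row = (row_number() - 1) %/% 10 + 1,
         col = (row_number() - 1) %% 10 + 1) %>%
  ungroup() %>%
  mutate(id = row_number()) %>%                 # one polygon id per dot
  uncount(4, .id = "corner") %>%                # four vertices per square
  mutate(x = col + c(-0.4, 0.4, 0.4, -0.4)[corner],
         y = row + c(-0.4, -0.4, 0.4, 0.4)[corner]) %>%
  ggplot() +
  aes(x, y, fill = gender, group = id) + # use fill aesthetic for color
  geom_polygon() +
  facet_grid(~quartile) + # add facet for quartiles
  coord_equal() +
  theme(axis.ticks.x = element_blank(), axis.ticks.y = element_blank(),
        axis.text.x = element_blank(), axis.text.y = element_blank(),
        axis.title.x = element_blank(), axis.title.y = element_blank())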
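
Since geom_polygon draws a shape by connecting vertices in order, a square centered on a grid cell needs four corner points. The sketch below shows one way to compute them in base R. The helper name square_corners and the half-width of 0.4 are assumptions made for illustration, not part of ggplot2.

```r
# Hypothetical helper: corner coordinates of a square centered at (col, row)
square_corners <- function(col, row, half = 0.4) {
  data.frame(
    # Corners listed counter-clockwise, starting at the bottom left
    x = col + c(-half,  half, half, -half),
    y = row + c(-half, -half, half,  half)
  )
}

# Corners of the square for the dot in column 3, row 2
print(square_corners(3, 2))
```

A half-width below 0.5 leaves a small gap between neighboring squares, which keeps individual dots distinguishable.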

Example Use Case: Visualizing Quartiles for Each Person

Let’s use our example data to create a dotplot that displays quartiles for each person broken down by gender.

# Load required libraries
library(dplyr)
library(tidyr)
library(stringr)
library(ggplot2)

# Create sample dataset: one row per employer, one column per quartile
# holding the percentage of men (random values, for illustration only)
set.seed(123)
dat <- data.frame(
  EmployerName            = c("ZELLIS", "OTHER"),
  MaleLowerQuartile       = runif(2, min = 40, max = 60),
  MaleLowerMiddleQuartile = runif(2, min = 40, max = 60),
  MaleUpperMiddleQuartile = runif(2, min = 40, max = 60),
  MaleTopQuartile         = runif(2, min = 40, max = 60)
)

# Reshape to one row per quartile and gender, with a count out of 100
a <- dat %>%
  filter(str_detect(EmployerName, "ZELLIS")) %>%
  select(matches("^Male\\w+Quartile$")) %>%
  pivot_longer(everything()) %>%
  extract(name, c("gender", "quartile"), "(Male)(\\w+)") %>%
  mutate(men = round(value), women = 100 - men) %>%
  select(-c(gender, value)) %>%
  pivot_longer(-quartile, names_to = "gender") %>%
  mutate(quartile = factor(quartile, levels = unique(quartile)))

# Create dotplot: expand counts into dots and lay them out on a 10 x 10 grid
a %>%
  uncount(value) %>%
  group_by(quartile) %>%
  mutate(row = (row_number() - 1) %/% 10 + 1,
         col = (row_number() - 1) %% 10 + 1) %>%
  ggplot(aes(col, row, color = gender)) +
  geom_point(shape = 15) +
  facet_grid(~quartile) +
  coord_equal() +
  theme(axis.ticks = element_blank(), axis.text = element_blank(),
        axis.title = element_blank()) +
  labs(title = "Dotplot of Quartiles by Gender")

Conclusion

In this article, we’ve demonstrated how to create a custom dotplot that displays quartiles for each person broken down by gender using ggplot2. We used tidyr’s uncount function to create one row per dot and a square point shape to draw the dots. Additionally, we explored an alternative approach that draws the squares with geom_polygon.


Last modified on 2023-08-27