Wrapping Text for Prettier Plots

Author’s Note: I wrote this in hysterics after a long week. Excuse the writing style.

Data are dirty. Data are like those kids in middle school who never changed their clothes after PE. Data are like those guys who wear the same cargo shorts every day because they’re comfortable. Data are like those people who refuse to wash their car until their windows are so obscured it becomes a driving hazard.

A dirty dataset will ruin your day. Have you ever exported a ton of visualizations, only to find they’re completely useless?

Yeah. Like that. I have no idea what I’m looking at, and renaming an arbitrary number of factors is tedious and time-consuming.

Fortunately, my computer specializes in tasks of the tedious and time-consuming variety. I took some time to teach it some new tricks and got this:

This is much easier to look at. It’s not quite publication material, but it is legible!

This didn’t take very long to write, and it saves me hours of work I wasn’t otherwise doing, but told myself I should. Now instead of blindly running tables and summaries, I can blindly run effective visualizations too!

The two functions that tackle this problem are super easy too. They save time when examining factors with long labels. Before, I replaced the labels one by one, and that was 50 shades of awful.

Anyway, this is a great example of how writing functions will aid your workflow. This is an excerpt from my utils library, nevR.

#' Wrap text with newline.
#' @param x character vector or list of strings.
#' @param width target line width, in characters, passed to strwrap().
#' @return character vector with individual elements wrapped.
#' @details Return atomic vector if input is atomic. Else return list.
wrap_text <- function(x, width = 19) {
  if (is.factor(x))
    stop(paste("You haven't implemented this functionality yet.",
               "MAYBE YOU SHOULD DO THAT NOW."))
  # strwrap() splits each element into lines shorter than `width`
  wrapped <- lapply(x, function(char) strwrap(char, width))
  # Rejoin the lines of each element with newlines
  wrapped <- lapply(wrapped, paste, collapse = '\n')
  if (is.atomic(x))
    wrapped <- unlist(wrapped)
  wrapped
}
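
If you want to see what the wrapping step does on its own, here is a minimal sketch using only base R. It is not part of nevR, just the same strwrap-then-paste idea applied to a single label; the variable names are mine.

```r
# Base R only. The sentence is one of the labels used later in this post.
label <- "Man, I really love writing long factor levels."

# strwrap() breaks the string into pieces shorter than `width` characters,
# splitting only at whitespace.
pieces <- strwrap(label, width = 19)

# paste(collapse = "\n") glues the pieces back into one string,
# with a newline wherever strwrap() decided to break.
wrapped <- paste(pieces, collapse = "\n")

# cat() prints the newlines as real line breaks instead of "\n".
cat(wrapped, "\n")
```

Plotting functions draw the embedded newlines as line breaks, which is why a wrapped label stacks up instead of running into its neighbors.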

#' Wrap text with newline throughout data.frame.
#' @description Wrap all character text in data.frame
#' @param df data.frame
#' @param width target line width, in characters, passed to strwrap().
#' @return data.frame with character columns, factor levels, and column names wrapped.
wrap_df_text <- function(df, width = 19) {
  for (nm in names(df)) {
    if (is.character(df[[nm]])) {
      df[[nm]] <- wrap_text(df[[nm]], width)
    } else if (is.factor(df[[nm]])) {
      # Relabel the levels in place; this keeps their order and any unused levels
      levels(df[[nm]]) <- wrap_text(levels(df[[nm]]), width)
    }
  }
  names(df) <- wrap_text(names(df), width)
  df
}
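
The factor branch is the part worth a second look, so here is a small standalone sketch of relabeling factor levels with base R. The factor and its labels are made up for illustration, and this is one way to do it rather than the only way.

```r
# Hypothetical factor with one long level and one short level.
f <- factor(c("A long label that needs wrapping", "Short",
              "A long label that needs wrapping"))

# Wrap each level. vapply() guarantees one string back per level.
new_levels <- vapply(
  levels(f),
  function(lv) paste(strwrap(lv, width = 19), collapse = "\n"),
  character(1),
  USE.NAMES = FALSE
)

# Assigning to levels() swaps the labels but leaves the integer codes alone,
# so every observation still points at the same (now wrapped) level.
levels(f) <- new_levels
```

Only the levels get wrapped, not every observation, so the work scales with the number of distinct labels rather than the number of rows.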

And here’s the use case which generated the above plot:

library(dplyr)    # %>%, group_by(), summarize(), n()
library(ggplot2)  # ggplot(), geom_bar(), labs()

long_words <- data.frame(Sentence = c(
  rep('Man, I really love writing long factor levels.', 2),
  rep('Yeah, whoever thinks data should be easy to read is stupid.', 1)))
plot_data <- long_words %>%
  group_by(Sentence) %>%
  summarize(SentenceN = n())
wrap_df_text(plot_data) %>%
  ggplot() +
  geom_bar(aes(x = Sentence, y = SentenceN, fill = Sentence), stat = 'identity') +
  labs(title = 'A Very Frustrating Plot', x = 'Sentence', y = 'Total Occurrences')
