1 Introduction

Hadley Wickham’s ggvis package is intended for making interactive plots. It renders plots more quickly than ggplot2, which makes it a good choice for using in Shiny apps, which Mathieu will cover in a future workshop. It also uses the same piping operators %>% as dplyr, so you can chain dplyr and ggvis commands together.

All ggvis graphics are rendered in a web browser, and interactive plots only work if they are connected to an active R session. ggivs is newer than ggplot2, and while it is great at making web-ready interactive plots, it doesn’t yet support a few features, such as side-by-side bar plots and faceting.

Instead of aesthetics (aes), ggvis has properties (props). In most ggvis plots, you do not need to explicitly call prop. This is similar to using qplot from ggplot2.

The syntax is also slightly different than ggplot2. Instead of just using the = operator, ggvis uses the following:

Suppose that variable is a column in myData. If you don’t use ~ in front of variable, ggvis won’t look for variable in myData.

There are other differences, too. For more information, please see the ggvis homepage and the pages to which it links.

2 Examples

We’ll use the 2009 residential energy consumption survey data again. As before, we’ll ignore the parsing warnings for now.

library(readr)
data <- read_csv("http://www.eia.gov/consumption/residential/data/2009/csv/recs2009_public.csv",
  progress=FALSE)
## Warning: 30 parsing failures.
##  row         col               expected actual
## 1538 GALLONFOOTH no trailing characters   .724
## 1538 BTUFOOTH    no trailing characters   .732
## 1538 DOLFOOTH    no trailing characters   .373
## 3628 NUMCORDS    no trailing characters   .5  
## 3981 GALLONFOOTH no trailing characters   .681
## .... ........... ...................... ......
## .See problems(...) for more details.
data$REGIONC <- factor(data$REGIONC, labels=c("Northeast","Midwest","South","West"))

Let’s remake some of the plots from the ggplot2 tutorial with ggvis. As in the ggplot2 tutorial, I’ll explain the new parts of each plot in the workshop as we go.

library(ggvis)
library(dplyr)

vis <- data %>%
  group_by(REGIONC) %>%
  summarize(unweighted = mean(KWH)) %>%
  ggvis(~REGIONC, ~unweighted) %>%
  layer_bars()

vis

Notice that you can resize the plot by dragging the bottom right corner. You can also change the rendering format and download the plot by clicking on the settings button in the top right corner.

Now let’s adjust the axis labels. As far as I know, there is not a way to set the font size globally like in ggplot2 (e.g. theme_bw(18)). The ggvis syntax is flexible, but verbose. To keep the code manageable and readable, you can first assign all the non-formatting aspects to a variable, e.g. vis for visual, and then pipe that variable to the formatting commands. We already created vis, so now we can add the formatting commands.

vis %>%
  add_axis("x", title = "Region", title_offset = 40,
    properties = axis_props(
      title = list(fontSize = 18),
      labels = list(fontSize = 16)
    )
  ) %>%
  add_axis("y", title = "Unweighted mean kWh", title_offset = 70,
    properties = axis_props(
      title = list(fontSize = 18),
      labels = list(fontSize = 16)
    )
  )

We can even chain in the melt function from reshape2. This code is getting a little unruly, but it works.

library(reshape2)

vis <- data %>%
  group_by(REGIONC) %>%
  summarize(
    weighted = weighted.mean(x = KWH, w = NWEIGHT),
    unweighted = mean(KWH)
  ) %>%
  melt(id.vars = "REGIONC") %>% 
  ggvis(~REGIONC, ~value, fill=~variable) %>%
  layer_bars()
  
vis %>%
  add_legend("fill", title = "",
    properties = legend_props(
      labels = list(fontSize = 16)
    )
  ) %>%
  add_axis("x", title = "Region", title_offset = 40,
    properties = axis_props(
      title = list(fontSize = 18),
      labels = list(fontSize = 16)
    )
  ) %>%
  add_axis("y", title = "Unweighted mean kWh", title_offset = 70,
    properties = axis_props(
      title = list(fontSize = 18),
      labels = list(fontSize = 16)
    )
  )

It looks like ggivs does not yet support grouped (side-by-side) bar plots, so for now we’ll have to stick with two bar plots, one for the weighted and one for the unweighted means.

Now let’s remake the smoothed scatter plots. ggvis does not currently support faceting, but we can still make a plot that uses color to group by region. In this case, instead of calling color, we’ll use both fill and stroke. We’ll also call gam from the mgcv package. Note that we don’t need to load the mgcv package first if we use mgcv::gam. This is R’s scope resolution operator. It is part of the R language, and not specific to any package.

vis <- data %>% ggvis(~CDD65, ~KWH, fill=~REGIONC) %>%
  mutate(KWH = KWH^(1/4)) %>%
  layer_points(opacity := 0.075) %>%
  group_by(REGIONC) %>%
  layer_model_predictions(stroke = ~REGIONC, model="mgcv::gam")
  
vis %>%
  add_legend(c("fill", "stroke"), title = "Region",
    properties = legend_props(
      title = list(fontSize = 18),
      labels = list(fontSize = 16)
    )
  ) %>%
  add_axis("x", title = "Cooling days", title_offset = 40,
    values = rep(0:5)*1000,
    properties = axis_props(
      title = list(fontSize = 18),
      labels = list(fontSize = 16)
    )
  ) %>%
  add_axis("y", title = "kWh^(1/4)", title_offset = 40,
    properties = axis_props(
      title = list(fontSize = 18),
      labels = list(fontSize = 16)
    )
  )

We can also use loess smoothing by calling layer_smooths. We’ll make the bandwith (span) interactive. To use the interactive component, you’ll need to run this in your own R session.

vis <- data %>% ggvis(~CDD65, ~KWH, fill=~REGIONC) %>%
  mutate(KWH = KWH^(1/4)) %>%
  layer_points(opacity := 0.075) %>%
  group_by(REGIONC) %>%
  layer_smooths(span = input_slider(0.1, 5, 1, label = "Span"),
    stroke = ~REGIONC,
    se = TRUE)
  
vis %>%
  add_legend(c("fill", "stroke"), title = "Region",
    properties = legend_props(
      title = list(fontSize = 18),
      labels = list(fontSize = 16)
    )
  ) %>%
  add_axis("x", title = "Cooling days", title_offset = 40,
    values = rep(0:5)*1000,
    properties = axis_props(
      title = list(fontSize = 18),
      labels = list(fontSize = 16)
    )
  ) %>%
  add_axis("y", title = "kWh^(1/4)", title_offset = 40,
    properties = axis_props(
      title = list(fontSize = 18),
      labels = list(fontSize = 16)
    )
  )

3 Exercises

  1. Make a histogram of KWH with interactive bin width. You can consult the ggvis cookbook to find histogram examples, and the interactivity page to find interactive controls.
  2. Re-write the code for the bar plots above, splitting the code for vis into two steps: 1) creating a dataset, called dat, and 2) passing dat to ggvis to create vis. This is similar to the steps we took in the previous tutorial.
  3. Add interactive controls to the smoothing plots to control the size of the dots.
  4. Add any other interactive components you want, and experiment with other types of plots.

Computing workshop homepage