r-statistics.co by Selva Prabhakaran


ggplot2 Quickref

Basics tasks

Legend

Plot text and annotation

Multiple plots

Geom layers

Not finding what you were looking for? Let me know!

Basic tasks

Basic plot setup

gg <- ggplot(df, aes(x=xcol, y=ycol)) 

df must be a dataframe that contains all information to make the ggplot. Plot will show up only after adding the geom layers.

Scatterplot

library(ggplot2)
gg <- ggplot(diamonds, aes(x=carat, y=price)) 
gg + geom_point()

ggplot2 cheatsheet example plot 1

Static - point size, shape, color and boundary thickness

gg + geom_point(size=1, shape=1, color="steelblue", stroke=2)  # 'stroke' controls the thickness of point boundary

ggplot2 cheatsheet example plot 2

Dynamic - point size, shape, color and boundary thickness

Make the aesthetics vary based on a variable in df.

gg + geom_point(aes(size=carat, shape=cut, color=color, stroke=carat))  # carat, cut and color are variables in `diamonds`       

ggplot2 cheatsheet example plot 3

Add Title, X and Y axis labels

gg1 <- gg + geom_point(aes(color=color))
gg2 <- gg1 + labs(title="Diamonds", x="Carat", y="Price")  # ggtitle("title") also changes the title.
print(gg2)

ggplot2 cheatsheet example plot 4

Change color of all text

gg2 + theme(text=element_text(color="blue"))  # all text turns blue.

ggplot2 cheatsheet example plot 5

Change title, X and Y axis label and text size

plot.title: Controls plot title. axis.title.x: Controls X axis title axis.title.y: Controls Y axis title axis.text.x: Controls X axis text axis.text.y: Controls y axis text

gg3 <- gg2 + theme(plot.title=element_text(size=25), axis.title.x=element_text(size=20), axis.title.y=element_text(size=20), axis.text.x=element_text(size=15), axis.text.y=element_text(size=15))
print(gg3)

ggplot2 cheatsheet example plot 6

Change title face, color, line height

gg3 + labs(title="Plot Title\nSecond Line of Plot Title") + theme(plot.title=element_text(face="bold", color="steelblue", lineheight=1.2))

ggplot2 cheatsheet example plot 7

Change point color

gg3 + scale_colour_manual(name='Legend', values=c('D'='grey', 'E'='red', 'F'='blue', 'G'='yellow', 'H'='black', 'I'='green', 'J'='firebrick'))

ggplot2 cheatsheet example plot 8

Adjust X and Y axis limits

Method 1: Zoom in

gg3 + coord_cartesian(xlim=c(0,3), ylim=c(0, 5000)) + geom_smooth()  # zoom in

ggplot2 cheatsheet example plot 9

Method 2: Deletes the points outside limits

gg3 + xlim(c(0,3)) + ylim(c(0, 5000)) + geom_smooth()  # deletes the points 
#> Warning messages:
#> 1: Removed 14714 rows containing non-finite values (stat_smooth). 
#> 2: Removed 14714 rows containing missing values (geom_point). 

Method 3: Deletes the points outside limits

gg3 + scale_x_continuous(limits=c(0,3)) + scale_y_continuous(limits=c(0, 5000)) + geom_smooth()  # deletes the points outside limits
#> Warning message:
#> Removed 14714 rows containing missing values (geom_point). 

ggplot2 cheatsheet example plot 11

Notice the change in smoothing line because of deleted points. This could sometimes be misleading in your analysis.

Change X and Y axis labels

gg3 + scale_x_continuous(labels=c("zero", "one", "two", "three", "four", "five")) + scale_y_continuous(breaks=seq(0, 20000, 4000))  # if Y is continuous, if X is a factor

Use scale_x_discrete instead, if X variable is a factor.

ggplot2 cheatsheet example plot 12

Rotate axis text

gg3 + theme(axis.text.x=element_text(angle=45), axis.text.y=element_text(angle=45))

ggplot2 cheatsheet example plot 13

Flip X and Y Axis

gg3 + coord_flip()  # flips X and Y axis.

ggplot2 cheatsheet example plot 14

Grid lines and panel background

gg3 + theme(panel.background = element_rect(fill = 'springgreen'),
  panel.grid.major = element_line(colour = "firebrick", size=3),
  panel.grid.minor = element_line(colour = "blue", size=1))

ggplot2 cheatsheet example plot 36

Plot margin and background

gg3 + theme(plot.background=element_rect(fill="yellowgreen"), plot.margin = unit(c(2, 4, 1, 3), "cm")) # top, right, bottom, left

ggplot2 cheatsheet example plot 37

Colors

The whole list of colors are displayed at your R console in the color() function. Here are few of my suggestions for nice looking colors and backgrounds:

  • steelblue (points and lines)
  • firebrick (point and lines)
  • springgreen (fills)
  • violetred (fills)
  • tomato (fills)
  • skyblue (bg)
  • sienna (points, lines)
  • slateblue (fills)
  • seagreen (points, lines, fills)
  • sandybrown (fills)
  • salmon (fills)
  • saddlebrown (lines)
  • royalblue (fills)
  • orangered (point, lines, fills)
  • olivedrab (points, lines, fills)
  • midnightblue (lines)
  • mediumvioletred (points, lines, fills)
  • maroon (points, lines, fills)
  • limegreen (fills)
  • lawngreen (fills)
  • forestgreen (lines, fills)
  • dodgerblue (fills, bg)
  • dimgray (grids, secondary bg)
  • deeppink (fills)
  • darkred (lines, points)

If you are looking for consistent colors, the RColorBrewer package has predefined color palettes

Gg Cs Sp37

Legend

Hide legend

gg3 + theme(legend.position="none")  # hides the legend

Change legend title

gg3 + scale_color_discrete(name="")  # Remove legend title (method1)
p1 <- gg3 + theme(legend.title=element_blank())  # Remove legend title (method)
p2 <- gg3 + scale_color_discrete(name="Diamonds")  # Change legend title
library(gridExtra)
grid.arrange(p1, p2, ncol=2)  # arrange

ggplot2 cheatsheet example plot 15

Change legend and point color

gg3 + scale_colour_manual(name='Legend', values=c('D'='grey', 'E'='red', 'F'='blue', 'G'='yellow', 'H'='black', 'I'='green', 'J'='firebrick'))

ggplot2 cheatsheet example plot 16

Change legend position

Outside plot

p1 <- gg3 + theme(legend.position="top")  # top / bottom / left / right

Inside plot

p2 <- gg3 + theme(legend.justification=c(1,0), legend.position=c(1,0))  # legend justification is the anchor point on the legend, considering the bottom left of legend as (0,0)
gridExtra::grid.arrange(p1, p2, ncol=2)

ggplot2 cheatsheet example plot 17

Change order of legend items

df$newLegendColumn <- factor(df$legendcolumn, levels=c(new_order_of_legend_items), ordered = TRUE) 

Create a new factor variable used in the legend, ordered as you need. Then use this variable instead in the plot.

Legend title, text, box, symbol

  • legend.title - Change legend title
  • legend.text - Change legend text
  • legend.key - Change legend box
  • guides - Change legend symbols
gg3 + theme(legend.title = element_text(size=20, color = "firebrick"), legend.text = element_text(size=15), legend.key=element_rect(fill='steelblue')) + guides(colour = guide_legend(override.aes = list(size=2, shape=4, stroke=2)))  # legend title color and size, box color, symbol color, size and shape.

ggplot2 cheatsheet example plot 18

Symbols

Plot text and annotation

Add text in chart

#> Not Run: gg + geom_text(aes(xcol, ycol, label=round(labelCol), size=3))  # general format
gg + geom_text(aes(label=color, color=color), size=4) 

ggplot2 cheatsheet example plot 19

Annotation

#> gg3 + annotate("mytext", x=xpos, y=ypos, label="My text")  # Not run: General Format
library(grid)
my_grob = grobTree(textGrob("My Custom Text", x=0.8, y=0.2, gp=gpar(col="firebrick", fontsize=25, fontface="bold")))
gg3 + annotation_custom(my_grob)

ggplot2 cheatsheet example plot 20

Multiple plots

Multiple chart panels

p1 <- gg1 + facet_grid(color ~ cut)  # arrange in a grid. More space for plots.

Free X and Y axis scales

By setting scales='free', the scales of both X and Y axis is freed. Use scales='free_x' to free only X-axis and scales='free_y' to free only Y-axis.

p2 <- gg1 + facet_wrap(color ~ cut, scales="free")  # free the x and y axis scales.

Arrange multiple plots

library(gridExtra)
grid.arrange(p1, p2, ncol=2)

ggplot2 cheatsheet example plot 21

Geom layers

Add smoothing line

gg3 + geom_smooth(aes(color=color))  # method could be - 'lm', 'loess', 'gam'

ggplot2 cheatsheet example plot 22

Add horizontal / vertical line

p1 <- gg3 + geom_hline(yintercept=5000, size=2, linetype="dotted", color="blue") # linetypes: solid, dashed, dotted, dotdash, longdash and twodash
p2 <- gg3 + geom_vline(xintercept=4, size=2, color="firebrick")
p3 <- gg3 + geom_segment(aes(x=4, y=5000, xend=4, yend=10000, size=2, lineend="round"))
p4 <- gg3 + geom_segment(aes(x=carat, y=price, xend=carat, yend=price-500, color=color), size=2) + coord_cartesian(xlim=c(3, 5))  # x, y: start points. xend, yend: end points
gridExtra::grid.arrange(p1,p2,p3,p4, ncol=2)

ggplot2 cheatsheet example plot 23

Add bar chart

# Frequency bar chart: Specify only X axis.
gg <- ggplot(mtcars, aes(x=cyl))
gg + geom_bar()  # frequency table

ggplot2 cheatsheet example plot 24

gg <- ggplot(mtcars, aes(x=cyl))
p1 <- gg + geom_bar(position="dodge", aes(fill=factor(vs)))  # side-by-side
p2 <- gg + geom_bar(aes(fill=factor(vs)))  # stacked
gridExtra::grid.arrange(p1, p2, ncol=2)

ggplot2 cheatsheet example plot 25

# Absolute bar chart: Specify both X adn Y axis. Set stat="identity"
df <- aggregate(mtcars$mpg, by=list(mtcars$cyl), FUN=mean)  # mean of mpg for every 'cyl'
names(df) <- c("cyl", "mpg")
head(df)
#>   cyl    mpg
#> 1   4  26.66
#> 2   6  19.74
#> 3   8  15.10

gg_bar <- ggplot(df, aes(x=cyl, y=mpg)) + geom_bar(stat = "identity")  # Y axis is explicit. 'stat=identity'
print(gg_bar)

ggplot2 cheatsheet example plot 26

Distinct color for bars

gg_bar <- ggplot(df, aes(x=cyl, y=mpg)) + geom_bar(stat = "identity", aes(fill=cyl))
print(gg_bar)

ggplot2 cheatsheet example plot 27

Change color and width of bars

df$cyl <- as.factor(df$cyl)
gg_bar <- ggplot(df, aes(x=cyl, y=mpg)) + geom_bar(stat = "identity", aes(fill=cyl), width = 0.25)
gg_bar + scale_fill_manual(values=c("4"="steelblue", "6"="firebrick", "8"="darkgreen"))

ggplot2 cheatsheet example plot 28

Change color palette

library(RColorBrewer)
display.brewer.all(n=20, exact.n=FALSE)  # display available color palettes
ggplot(mtcars, aes(x=cyl, y=carb, fill=factor(cyl))) + geom_bar(stat="identity") + scale_fill_brewer(palette="Reds")  # "Reds" is palette name

ggplot2 cheatsheet example plot 29

Line chart

# Method 1:
gg <- ggplot(economics, aes(x=date))  # setup
gg + geom_line(aes(y=psavert), size=2, color="firebrick") + geom_line(aes(y=uempmed), size=1, color="steelblue", linetype="twodash")  # No legend
# available linetypes: solid, dashed, dotted, dotdash, longdash and twodash

ggplot2 cheatsheet example plot 30

# Method 2:
library(reshape2)
df_melt <- melt(economics[, c("date", "psavert", "uempmed")], id="date")  # melt by date. 
gg <- ggplot(df_melt, aes(x=date))  # setup
gg + geom_line(aes(y=value, color=variable), size=1) + scale_color_discrete(name="Legend")  # gets legend.

ggplot2 cheatsheet example plot 31

Line chart from timeseries

# One step method.
library(ggfortify)
autoplot(AirPassengers, size=2) + labs(title="AirPassengers")

ggplot2 cheatsheet example plot 32

Ribbons

Filled time series can be plotted using geom_ribbon(). It takes two compulsory arguments ymin and ymax.

# Prepare the dataframe
st_year <- start(AirPassengers)[1]
st_month <- "01"
st_date <- as.Date(paste(st_year, st_month, "01", sep="-"))
dates <- seq.Date(st_date, length=length(AirPassengers), by="month")
df <- data.frame(dates, AirPassengers, AirPassengers/2)
head(df)
#>        dates AirPassengers AirPassengers.2
#> 1 1949-01-01           112            56.0
#> 2 1949-02-01           118            59.0
#> 3 1949-03-01           132            66.0
#> 4 1949-04-01           129            64.5
#> 5 1949-05-01           121            60.5
#> 6 1949-06-01           135            67.5
# Plot ribbon with ymin=0
gg <- ggplot(df, aes(x=dates)) + labs(title="AirPassengers") + theme(plot.title=element_text(size=30), axis.title.x=element_text(size=20), axis.text.x=element_text(size=15))
gg + geom_ribbon(aes(ymin=0, ymax=AirPassengers)) + geom_ribbon(aes(ymin=0, ymax=AirPassengers.2), fill="green")

ggplot2 cheatsheet example plot 33

gg + geom_ribbon(aes(ymin=AirPassengers-20, ymax=AirPassengers+20)) + geom_ribbon(aes(ymin=AirPassengers.2-20, ymax=AirPassengers.2+20), fill="green")

ggplot2 cheatsheet example plot 34

Area

geom_area is similar to geom_ribbon, except that the ymin is set to 0. If you want to make overlapping area plot, use the alpha aesthetic to make the top layer translucent.

# Method1: Non-Overlapping Area
df <- reshape2::melt(economics[, c("date", "psavert", "uempmed")], id="date")
head(df, 3)
#>         date variable value
#> 1 1967-07-01  psavert  12.5
#> 2 1967-08-01  psavert  12.5
#> 3 1967-09-01  psavert  11.7
p1 <- ggplot(df, aes(x=date)) + geom_area(aes(y=value, fill=variable)) + labs(title="Non-Overlapping - psavert and uempmed")

# Method2: Overlapping Area
p2 <- ggplot(economics, aes(x=date)) + geom_area(aes(y=psavert), fill="yellowgreen", color="yellowgreen") + geom_area(aes(y=uempmed), fill="dodgerblue", alpha=0.7, linetype="dotted") + labs(title="Overlapping - psavert and uempmed")
gridExtra::grid.arrange(p1, p2, ncol=2)

ggplot2 cheatsheet example plot 35

Boxplot and Violin

The oulier points are controlled by the following aesthetics: * outlier.shape * outlier.stroke * outlier.size * outlier.colour

If the notch is turned on (by setting it TRUE), the below boxplot is produced. Else, you would get the standard rectangular boxplots.

p1 <- ggplot(mtcars, aes(factor(cyl), mpg)) + geom_boxplot(aes(fill = factor(cyl)), width=0.5, outlier.colour = "dodgerblue", outlier.size = 4, outlier.shape = 16, outlier.stroke = 2, notch=T) + labs(title="Box plot")  # boxplot
p2 <- ggplot(mtcars, aes(factor(cyl), mpg)) + geom_violin(aes(fill = factor(cyl)), width=0.5, trim=F) + labs(title="Violin plot (untrimmed)")  # violin plot
gridExtra::grid.arrange(p1, p2, ncol=2)

ggplot2 cheatsheet example plot 38

Density

ggplot(mtcars, aes(mpg)) + geom_density(aes(fill = factor(cyl)), size=2) + labs(title="Density plot")  # Density plot

ggplot2 cheatsheet example plot 39

Tiles

corr <- round(cor(mtcars), 2)
df <- reshape2::melt(corr)
gg <- ggplot(df, aes(x=Var1, y=Var2, fill=value, label=value)) + geom_tile() + theme_bw() + geom_text(aes(label=value, size=value), color="white") + labs(title="mtcars - Correlation plot") + theme(text=element_text(size=20), legend.position="none")

library(RColorBrewer)
p2 <- gg + scale_fill_distiller(palette="Reds")
p3 <- gg + scale_fill_gradient2()
gridExtra::grid.arrange(gg, p2, p3, ncol=3)

ggplot2 cheatsheet example plot 40