ggplot2 geom_point() in R: Scatter Plots With Examples
The geom_point() function in ggplot2 draws a scatter plot, mapping x and y values to point positions. You can map additional variables to color, size, shape, or transparency to encode more information per point.
ggplot(df, aes(x, y)) + geom_point()                          # basic scatter
ggplot(df, aes(x, y, color = group)) + geom_point()           # color by group
ggplot(df, aes(x, y, size = z)) + geom_point()                # size by variable
ggplot(df, aes(x, y, shape = group)) + geom_point()           # shape by group
ggplot(df, aes(x, y)) + geom_point(alpha = 0.5)               # transparency
ggplot(df, aes(x, y)) + geom_jitter(width = 0.2)              # jittered points
ggplot(df, aes(x, y)) + geom_point() + facet_wrap(~ group)    # one panel per group
Need explanation? Read on for examples and pitfalls.
What geom_point() does in one sentence
geom_point() maps each row of data to one point on a 2D plane, with x and y positions controlled by aesthetics. Additional aesthetics like color, size, shape, and alpha let you encode extra dimensions: a third variable can become point color, a fourth point size, and so on.
It is the default tool for showing relationships between two continuous variables, the workhorse of exploratory data analysis, and the foundation for many compound plots (scatter + regression line, scatter + facets, bubble charts).
Syntax
geom_point() is a layer added to a ggplot() call. The minimum is an aes(x, y) mapping. Additional aesthetics go either inside aes() (mapped to a variable) or outside it (constant for all points).
The full signature:
geom_point(mapping = NULL, data = NULL, stat = "identity", position = "identity",
..., na.rm = FALSE, show.legend = NA, inherit.aes = TRUE)
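In most usage the layer simply inherits its aesthetics from the parent ggplot() call. To control one layer only, pass mapping = aes(...) or data = ... to geom_point() itself; these override or extend what the layer would otherwise inherit.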
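As a minimal sketch of layer-level control, the example below uses the built-in mtcars data set. The data set, the column choices, and the `efficient` subset are illustrative assumptions, not part of the original text. Both layers inherit x and y from ggplot(); the first adds its own color mapping and the second swaps in its own data.

```r
library(ggplot2)

# Hypothetical subset, used only for illustration: the most fuel-efficient cars.
efficient <- subset(mtcars, mpg > 30)

p_layers <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  # Layer 1 inherits x and y and adds a colour mapping of its own.
  geom_point(mapping = aes(color = factor(cyl))) +
  # Layer 2 inherits x and y but uses its own data and constant styling
  # (shape 21 is a circle outline, drawn here as a highlight ring).
  geom_point(data = efficient, shape = 21, size = 5, color = "black")

p_layers
```

Because the color mapping lives in the first layer only, the highlight rings in the second layer are not affected by it.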
Aesthetics inside aes() map data; aesthetics outside aes() apply uniformly. geom_point(aes(color = cyl)) colors each point by cyl. geom_point(color = "blue") paints every point blue. Mixing them up is one of the most common ggplot2 mistakes.
Seven common patterns
1. Basic scatter plot
The minimum: ggplot() declares the data and aesthetics; geom_point() adds the layer.
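The code for this pattern is not shown in this copy of the page. A minimal sketch, assuming the built-in mtcars data set with wt on x and mpg on y, might look like this:

```r
library(ggplot2)

# ggplot() declares the data and the x/y mapping;
# geom_point() draws one point per row.
p_basic <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point()

p_basic
```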
2. Color by group
color = factor(cyl) maps cyl to discrete colors. Wrapping in factor() ensures discrete (categorical) coloring. Without factor(), ggplot treats cyl as continuous and uses a gradient.
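A sketch of the discrete and continuous variants side by side, again assuming mtcars with wt and mpg on the axes (the point size of 3 is an arbitrary choice for visibility):

```r
library(ggplot2)

# factor(cyl) makes the colour scale discrete: one colour per cylinder count.
p_discrete <- ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point(size = 3)

# Unwrapped, cyl is numeric, so ggplot2 falls back to a continuous gradient.
p_gradient <- ggplot(mtcars, aes(x = wt, y = mpg, color = cyl)) +
  geom_point(size = 3)

p_discrete
p_gradient
```

The two plots differ mainly in their legends: separate keys for the discrete version, a color bar for the gradient.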
3. Size by variable (bubble chart)
Mapping size = hp scales each point by horsepower; ggplot2's default size scale maps the value to point area rather than diameter. Combined with alpha = 0.6, overlapping points stay visible.
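One way to write the bubble chart described here, assuming mtcars with wt and mpg as the position variables (only size = hp and alpha = 0.6 come from the text):

```r
library(ggplot2)

# size is mapped inside aes(), so it varies with hp;
# alpha is a constant outside aes(), so it applies to every point.
p_bubble <- ggplot(mtcars, aes(x = wt, y = mpg, size = hp)) +
  geom_point(alpha = 0.6)

p_bubble
```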
4. Shape by group
Use shape for accessible categorical encoding when colors might not print well or for colorblind-friendly plots.
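A minimal sketch of shape mapping, assuming mtcars and cyl as the grouping variable. The shape aesthetic needs a discrete variable, hence the factor() wrapper:

```r
library(ggplot2)

# Each level of factor(cyl) gets its own point shape.
p_shape <- ggplot(mtcars, aes(x = wt, y = mpg, shape = factor(cyl))) +
  geom_point(size = 3)

p_shape
```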
5. Transparency for overplotting
In dense data (53,940 diamonds), full-opacity points overlap and obscure density. alpha = 0.1 makes individual points faint; clusters appear darker because many overlap.
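A sketch using the diamonds data set that ships with ggplot2. Plotting carat against price is an assumption; the text only specifies the data set and alpha = 0.1:

```r
library(ggplot2)

# With alpha = 0.1, a region needs roughly ten overlapping points
# to reach full opacity, so dense clusters stand out as darker areas.
p_alpha <- ggplot(diamonds, aes(x = carat, y = price)) +
  geom_point(alpha = 0.1)

p_alpha
```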
6. Jittered points for discrete x
geom_jitter() adds a small random offset to each point. Without jitter, points at identical x values would stack on top of each other.
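A minimal sketch with a discrete x axis, assuming mtcars with factor(cyl) on x. The width of 0.2 matches the quick reference at the top of the page; height = 0 is an added assumption that keeps the y values exact, since geom_jitter() otherwise jitters y as well.

```r
library(ggplot2)

# Points are spread up to 0.2 units left or right of each category;
# height = 0 leaves mpg untouched.
p_jitter <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  geom_jitter(width = 0.2, height = 0)

p_jitter
```

The offsets are random, so the exact horizontal positions change each time the plot is drawn.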
7. Faceted scatter
facet_wrap() creates one panel per unique value of cyl. By default all panels share the same x and y scales, which makes them directly comparable; set scales = "free" to give each panel its own axes. Useful for comparing relationships across groups.
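A sketch of the faceted version, assuming mtcars with wt and mpg on the axes:

```r
library(ggplot2)

# One panel per value of cyl (4, 6, 8), with shared axes by default.
p_facet <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  facet_wrap(~ cyl)

p_facet
```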
Mappings live inside aes(); constants live outside. geom_point(aes(color = cyl), size = 3) takes color from the data and uses a constant size of 3. Constants (size = 3) do not need aes(); they are just R values. Mappings (color = cyl) need aes() so ggplot2 looks them up in the data.
geom_point() vs base R plot()
Base R plot() is one-shot; ggplot2 geom_point() is composable. Base R is faster for quick interactive plots; ggplot2 is more flexible for publication graphics.
| Task | ggplot2 | Base R |
|---|---|---|
| Basic scatter | ggplot(df, aes(x, y)) + geom_point() | plot(df$x, df$y) |
| Color by group | aes(color = grp) | plot(..., col = factor(grp)) |
| Size by variable | aes(size = z) | symbols(x, y, circles = z) |
| Add regression line | + geom_smooth(method = "lm") | abline(lm(y ~ x)) |
| Faceting | + facet_wrap(~ grp) | (manual: par(mfrow=...)) |
| Save to file | ggsave("out.png") | png(); plot(); dev.off() |
When to use which:
- Use ggplot2 for layered, publication-quality, faceted plots.
- Use base R plot() for one-line interactive exploration.
Common pitfalls
Pitfall 1: continuous variable in color. aes(color = cyl) (without factor()) makes ggplot treat cyl as continuous and use a gradient palette, even though cyl has only 3 values. Wrap in factor(cyl) for discrete colors.
Pitfall 2: aesthetic vs constant confusion. geom_point(color = "red") paints all points red. geom_point(aes(color = "red")) creates a fake column "red" and colors by that (resulting in one color, with a confusing legend). The correct constant form has color OUTSIDE aes().
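The difference is easy to check by inspecting the colors ggplot2 actually computes. The sketch below assumes mtcars and uses layer_data(), which returns the data a layer is drawn from:

```r
library(ggplot2)

# Constant: every point is literally red.
p_constant <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(color = "red")

# Mapping: "red" is treated as a one-level category, so the points get
# the first colour of the default discrete palette, plus a legend.
p_mapped <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(aes(color = "red"))

unique(layer_data(p_constant)$colour)
unique(layer_data(p_mapped)$colour)
```

The second call prints a palette color code rather than "red", which shows that the string was treated as data and not as a color name.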
Overplotting hides structure in large data sets: use alpha, geom_hex(), geom_density_2d(), or sampling. Without these, you may miss interesting structure.
Pitfall 3: jitter changes x values. geom_jitter(width = 0.5) moves each point up to 0.5 units left or right of its true x, and by default it also jitters y unless you set height = 0. For categorical x this is fine; for continuous x where exact values matter, use geom_point() with alpha instead.
Try it yourself
Try it: Build a scatter plot of mtcars with wt on x, mpg on y, points colored by factor(cyl), and a linear regression line. Save the plot to ex_plot.
Explanation: geom_point(size = 3) draws the colored points; geom_smooth(method = "lm", se = FALSE) fits one linear regression line per cyl group, because a color mapping declared in ggplot() is inherited by the smooth layer and splits the data into groups. se = FALSE hides the confidence band for clarity.
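The solution code is not shown in this copy of the page. One version consistent with the task and the explanation above, offered as a sketch rather than the original answer, is:

```r
library(ggplot2)

# Colour is mapped in ggplot(), so both layers inherit it:
# the points are coloured by cylinder count and the smooth layer
# fits a separate straight line for each cylinder group.
ex_plot <- ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point(size = 3) +
  geom_smooth(method = "lm", se = FALSE)

ex_plot
```

Moving the color mapping into geom_point(aes(color = factor(cyl))) instead would keep the colored points but produce a single regression line for all cars.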
Related ggplot2 functions
After mastering geom_point(), look at:
- geom_jitter(): scatter with random offset (for overlapping discrete x)
- geom_smooth(): trend line on top of scatter
- geom_text(), geom_label(): add text annotations to points
- geom_hex(), geom_density_2d(): density alternatives for big data
- scale_color_manual(), scale_color_brewer(): customize point colors
- aes(color, size, shape, alpha): the core aesthetics for scatter plots
For maps with geographic points, geom_sf() works similarly with sf objects. For interactive scatter plots, wrap with plotly::ggplotly().
FAQ
How do I change the color of all points in geom_point?
Use geom_point(color = "blue") (color OUTSIDE aes()). To map color from a column: aes(color = column_name) INSIDE the aes() call.
How do I make points larger or smaller in ggplot2?
For uniform size: geom_point(size = 3). To map size to a variable: geom_point(aes(size = column_name)). Default size is 1.5; values 2 to 4 are typical for clear plots.
How do I deal with overlapping points in geom_point?
Three common fixes. Use alpha = 0.3 for transparency. Use geom_jitter() for small random offsets (best for categorical x). Switch to geom_hex() or geom_density_2d() for very dense data.
Can I use geom_point with a categorical x axis?
Yes, but points stack at each x value. Use geom_jitter() instead, or set position = position_jitter(width = 0.2) inside geom_point() to spread points out at each category.
How do I add a regression line to a scatter plot in ggplot2?
Add geom_smooth(method = "lm") after geom_point(). For LOESS smoothing instead, use geom_smooth(method = "loess"). Plain geom_smooth() picks the method automatically: LOESS for fewer than 1,000 observations and a GAM otherwise. Add se = FALSE to hide the confidence band.
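A short sketch of both options on a single ungrouped scatter, assuming mtcars with wt and mpg (the data set is an illustrative choice):

```r
library(ggplot2)

base_plot <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point()

# Straight-line fit with the default confidence band.
p_lm <- base_plot + geom_smooth(method = "lm")

# LOESS curve with the confidence band hidden.
p_loess <- base_plot + geom_smooth(method = "loess", se = FALSE)

p_lm
p_loess
```

Storing the shared scatter in base_plot and adding one smoother at a time keeps the two variants easy to compare.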