ggplot2 geom_point() in R: Scatter Plots With Examples

The geom_point() function in ggplot2 draws a scatter plot, mapping x and y values to point positions. You can map additional variables to color, size, shape, or transparency to encode more information per point.

⚡ Quick Answer
ggplot(df, aes(x, y)) + geom_point()                       # basic scatter
ggplot(df, aes(x, y, color = group)) + geom_point()        # color by group
ggplot(df, aes(x, y, size = z)) + geom_point()             # size by variable
ggplot(df, aes(x, y, shape = group)) + geom_point()        # shape by group
ggplot(df, aes(x, y)) + geom_point(alpha = 0.5)            # transparency
ggplot(df, aes(x, y)) + geom_jitter(width = 0.2)           # jittered points
ggplot(df, aes(x, y)) + geom_point() + facet_wrap(~ group) # one panel per group

Need explanation? Read on for examples and pitfalls.

📊 Is geom_point() the right tool?
  • Two continuous variables: geom_point()
  • One continuous variable with many overlapping points: geom_jitter() or alpha
  • Discrete x with overlap: geom_jitter()
  • Show density of points: geom_hex() or geom_density_2d()
  • Connect points with lines: geom_line() (with geom_point() on top)
  • Show mean +/- error bars: geom_point() + geom_errorbar()
  • Add a regression line: geom_point() + geom_smooth()

What geom_point() does in one sentence

geom_point() maps each row of data to one point on a 2D plane, with x and y positions controlled by aesthetics. Additional aesthetics like color, size, shape, and alpha let you encode extra dimensions: a third variable can become point color, a fourth point size, and so on.

It is the default tool for showing relationships between two continuous variables, the workhorse of exploratory data analysis, and the foundation for many compound plots (scatter + regression line, scatter + facets, bubble charts).

Syntax

geom_point() is a layer added to a ggplot() call. The minimum is an aes(x, y) mapping. Additional aesthetics go either inside aes() (mapped to a variable) or outside it (constant for all points).

R: Load ggplot2 and inspect mtcars
library(ggplot2)
head(mtcars, 3)[, c("mpg", "wt", "cyl")]
#>                mpg    wt cyl
#> Mazda RX4     21.0 2.620   6
#> Mazda RX4 Wag 21.0 2.875   6
#> Datsun 710    22.8 2.320   4

  

The full signature:

geom_point(mapping = NULL, data = NULL, stat = "identity", position = "identity",
           ..., na.rm = FALSE, show.legend = NA, inherit.aes = TRUE)

Most common usage just passes inherited aesthetics from the parent ggplot() call. Override specific aesthetics for layer-specific control.
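As a hedged illustration of layer-specific control, the sketch below stacks a second geom_point() layer on top of the inherited mapping, giving it its own data and constant aesthetics. The mpg > 30 cutoff and the name p_highlight are assumptions chosen for this example, not part of the original walkthrough.

```r
library(ggplot2)

# The first layer inherits aes(wt, mpg) from ggplot().
# The second layer keeps that mapping but overrides the data
# and sets constant color and size for the subset only.
p_highlight <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(color = "grey50") +
  geom_point(
    data = subset(mtcars, mpg > 30),  # illustrative cutoff
    color = "red",
    size = 4
  )

p_highlight
```

Because the second layer only replaces data, it still reads wt and mpg from the parent aes(), so the highlighted points land exactly on top of their grey counterparts.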

Tip
Aesthetics inside aes() map data; aesthetics outside aes() apply uniformly. geom_point(aes(color = cyl)) colors each point by cyl. geom_point(color = "blue") paints every point blue. Mixing them up is one of the most common ggplot2 mistakes.

Seven common patterns

1. Basic scatter plot

R: Weight vs mpg
ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point()

  

The minimum: ggplot() declares the data and aesthetics; geom_point() adds the layer.

2. Color by group

R: Color points by cylinder count
ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) + geom_point(size = 3)

  

color = factor(cyl) maps cyl to discrete colors. Wrapping in factor() ensures discrete (categorical) coloring. Without factor(), ggplot treats cyl as continuous and uses a gradient.
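The continuous treatment comes from how cyl is stored, not from how many values it has. The sketch below checks the column type and shows one alternative to calling factor() inside aes(): converting once in a copy of the data. The names mt, cyl_f, and p_discrete are hypothetical and used only for this example.

```r
library(ggplot2)

# cyl is numeric, which is why ggplot2 picks a continuous color scale for it.
is.numeric(mtcars$cyl)

# Convert once in a copy of the data so aes() stays short.
mt <- transform(mtcars, cyl_f = factor(cyl))
levels(mt$cyl_f)

p_discrete <- ggplot(mt, aes(x = wt, y = mpg, color = cyl_f)) +
  geom_point(size = 3) +
  labs(color = "Cylinders")

p_discrete
```

Converting in the data also gives you a clean column name to reuse in other layers and in facet_wrap().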

3. Size by variable (bubble chart)

R: Bubble chart with hp as size
ggplot(mtcars, aes(x = wt, y = mpg, size = hp)) + geom_point(alpha = 0.6)

  

Mapping size = hp makes points larger as horsepower increases. By default ggplot2 scales point area, not diameter, to the value. Combined with alpha = 0.6, overlapping points stay visible.
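If the bubbles should be read by area with a true zero, one option is scale_size_area(), which maps a value of 0 to a point of size 0. This is a minimal sketch; max_size = 10 and the name p_bubble are arbitrary choices for illustration.

```r
library(ggplot2)

# scale_size_area() ties point area to the value, so a car with twice
# the horsepower gets a bubble with roughly twice the area.
p_bubble <- ggplot(mtcars, aes(x = wt, y = mpg, size = hp)) +
  geom_point(alpha = 0.6) +
  scale_size_area(max_size = 10)  # illustrative upper bound

p_bubble
```

The default size scale instead spreads values across a fixed size range, which exaggerates differences when the smallest value is far from zero.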

4. Shape by group

R: Different shapes per cyl group
ggplot(mtcars, aes(x = wt, y = mpg, shape = factor(cyl))) + geom_point(size = 3)

  

Use shape for accessible categorical encoding when colors might not print well or for colorblind-friendly plots.
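One way to make the encoding more robust is to map the same variable to both color and shape, so groups remain distinguishable in greyscale. The sketch below assumes the shape codes 16, 17, and 15 (filled circle, triangle, square); they are an illustrative choice, as is the name p_shape.

```r
library(ggplot2)

# Map cyl to color AND shape; giving both legends the same title
# lets ggplot2 merge them into a single legend.
p_shape <- ggplot(mtcars, aes(x = wt, y = mpg,
                              color = factor(cyl),
                              shape = factor(cyl))) +
  geom_point(size = 3) +
  scale_shape_manual(values = c(16, 17, 15)) +  # illustrative shape codes
  labs(color = "cyl", shape = "cyl")

p_shape
```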

5. Transparency for overplotting

R: alpha = 0.1 reveals point density
ggplot(diamonds, aes(x = carat, y = price)) + geom_point(alpha = 0.1)

  

In dense data (53,940 diamonds), full-opacity points overlap and obscure density. alpha = 0.1 makes individual points faint; clusters appear darker because many overlap.

6. Jittered points for discrete x

R: Jitter to reveal overlapping points
ggplot(mpg, aes(x = class, y = hwy)) + geom_jitter(width = 0.2, alpha = 0.5)

  

geom_jitter() adds a small random offset to each point. Without jitter, points at identical x values would stack on top of each other.

7. Faceted scatter

R: One panel per cyl
ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  facet_wrap(~ cyl) +
  theme_minimal()

  

facet_wrap() creates one panel per unique value of cyl. By default all panels share the same x and y scales, which makes comparison across groups straightforward; set scales = "free" to give each panel its own axes.
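The scales argument of facet_wrap() controls whether panels share axes. As a rough sketch, the variant below frees only the y axis, which can help when groups occupy very different mpg ranges. The name p_facets is hypothetical.

```r
library(ggplot2)

p_facets <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  facet_wrap(~ cyl, scales = "free_y") +  # y range varies per panel, x stays shared
  theme_minimal()

p_facets
```

Free scales make each panel easier to read on its own but harder to compare against its neighbors, so the shared default is usually the safer starting point.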

Key Insight
Aesthetic mappings live in aes(); constants live outside. geom_point(aes(color = cyl), size = 3) has color from data and constant size 3. Constants (size = 3) do not need aes(); they are just R values. Mappings (color = cyl) need aes() so ggplot looks them up in the data.

geom_point() vs base R plot()

Base R plot() is one-shot; ggplot2 geom_point() is composable. Base R is faster for quick interactive plots; ggplot2 is more flexible for publication graphics.

Task                 | ggplot2                              | Base R
Basic scatter        | ggplot(df, aes(x, y)) + geom_point() | plot(df$x, df$y)
Color by group       | aes(color = grp)                     | plot(..., col = factor(grp))
Size by variable     | aes(size = z)                        | symbols(x, y, circles = z)
Add regression line  | + geom_smooth(method = "lm")         | abline(lm(y ~ x))
Faceting             | + facet_wrap(~ grp)                  | (manual: par(mfrow=...))
Save to file         | ggsave("out.png")                    | png(); plot(); dev.off()

When to use which:

  • Use ggplot2 for layered, publication-quality, faceted plots.
  • Use base R plot() for one-line interactive exploration.

Common pitfalls

Pitfall 1: continuous variable in color. aes(color = cyl) (without factor()) makes ggplot treat cyl as continuous and use a gradient palette, even though cyl has only 3 values. Wrap in factor(cyl) for discrete colors.

Pitfall 2: aesthetic vs constant confusion. geom_point(color = "red") paints all points red. geom_point(aes(color = "red")) treats "red" as a one-level data value, so every point gets the first color of the default palette (not necessarily red) and a confusing legend labelled "red" appears. The correct constant form has color OUTSIDE aes().
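To see the difference concretely, you can inspect the colors ggplot2 actually computes with layer_data(). This is a small sketch; p_const and p_mapped are hypothetical names.

```r
library(ggplot2)

# Constant: every point is drawn in the literal color "red".
p_const <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(color = "red")

# Mapped: the string "red" is treated as data with a single level,
# so the color comes from the default discrete palette instead.
p_mapped <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(aes(color = "red"))

unique(layer_data(p_const)$colour)
unique(layer_data(p_mapped)$colour)
```

The first call returns the literal "red"; the second returns whatever hex code the default palette assigns to its only level.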

Warning
Excessive overplotting hides patterns. With dense data (10K+ points), full-opacity geom_point can look like a solid blob. Always consider alpha, geom_hex(), geom_density_2d(), or sampling. Without these, you may miss interesting structure.
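Two of the options above, sampling and a density layer, can be sketched as follows. The seed, the sample size of 5000, and the object names are assumptions made for this example.

```r
library(ggplot2)

set.seed(42)  # arbitrary seed so the sample is reproducible

# Option 1: plot a random subset with some transparency.
diamonds_small <- diamonds[sample(nrow(diamonds), 5000), ]  # illustrative size
p_sample <- ggplot(diamonds_small, aes(x = carat, y = price)) +
  geom_point(alpha = 0.2)

# Option 2: replace the points with 2D density contours of the full data.
p_contour <- ggplot(diamonds, aes(x = carat, y = price)) +
  geom_density_2d()
```

Sampling keeps the familiar scatter look but can drop rare outliers; the contour version uses every row but no longer shows individual observations.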

Pitfall 3: jitter changes plotted positions. geom_jitter(width = 0.5) moves each point up to 0.5 units left or right of its true x, and by default it also adds a small vertical offset unless you set height = 0. For categorical x the horizontal shift is harmless; for continuous x where exact values matter, use geom_point() with alpha instead.
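When the y values must stay exact, a common approach is to jitter horizontally only. The sketch below uses position_jitter() with height = 0 and a fixed seed so the plot is reproducible; the seed value and the name p_jitter are arbitrary.

```r
library(ggplot2)

# Jitter along x only: height = 0 leaves hwy untouched,
# and seed makes the random offsets repeatable.
p_jitter <- ggplot(mpg, aes(x = class, y = hwy)) +
  geom_point(
    alpha = 0.5,
    position = position_jitter(width = 0.2, height = 0, seed = 1)
  )

p_jitter
```

You can confirm the y values are unchanged by comparing layer_data(p_jitter)$y with mpg$hwy.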

Try it yourself

Try it: Build a scatter plot of mtcars with wt on x, mpg on y, points colored by factor(cyl), and a linear regression line. Save the plot to ex_plot.

R: Your turn: scatter + regression by group
# Try it: geom_point + geom_smooth
ex_plot <- ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  # your code here

print(ex_plot)
#> Expected: scatter plot with 3 colors and a fitted line per group

  
R: Solution
ex_plot <- ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point(size = 3) +
  geom_smooth(method = "lm", se = FALSE)

print(ex_plot)

  

Explanation: geom_point(size = 3) draws the colored points; geom_smooth(method = "lm", se = FALSE) fits a linear regression LINE per group (because color is mapped). se = FALSE hides the confidence band for clarity.

After mastering geom_point(), look at:

  • geom_jitter(): scatter with random offset (for overlapping discrete x)
  • geom_smooth(): trend line on top of scatter
  • geom_text(), geom_label(): add text annotations to points
  • geom_hex(), geom_density_2d(): density alternatives for big data
  • scale_color_manual(), scale_color_brewer(): customize point colors
  • aes(color, size, shape, alpha): the core aesthetics for scatter plots

For maps with geographic points, geom_sf() works similarly with sf objects. For interactive scatter plots, wrap with plotly::ggplotly().

FAQ

How do I change the color of all points in geom_point?

Use geom_point(color = "blue") (color OUTSIDE aes()). To map color from a column: aes(color = column_name) INSIDE the aes() call.

How do I make points larger or smaller in ggplot2?

For uniform size: geom_point(size = 3). To map size to a variable: geom_point(aes(size = column_name)). Default size is 1.5; values 2 to 4 are typical for clear plots.

How do I deal with overlapping points in geom_point?

Three common fixes. Use alpha = 0.3 for transparency. Use geom_jitter() for small random offsets (best for categorical x). Switch to geom_hex() or geom_density_2d() for very dense data.

Can I use geom_point with a categorical x axis?

Yes, but points stack at each x value. Use geom_jitter() instead, or set position = position_jitter(width = 0.2) inside geom_point() to spread points out at each category.

How do I add a regression line to a scatter plot in ggplot2?

Add geom_smooth(method = "lm") after geom_point(). For LOESS smoothing instead, use geom_smooth(method = "loess"); plain geom_smooth() also picks LOESS for fewer than 1,000 observations and switches to a GAM for larger data. Add se = FALSE to hide the confidence band.
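If you also need the numbers behind the line, you can fit the same model directly with lm(). This is a minimal sketch, assuming a simple y ~ x fit on mtcars; fit and p_lm are hypothetical names.

```r
library(ggplot2)

# Fit the regression yourself to get the intercept and slope.
fit <- lm(mpg ~ wt, data = mtcars)
coef(fit)

# Draw the same straight-line fit on top of the scatter plot.
p_lm <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)

p_lm
```

Passing formula = y ~ x explicitly avoids the message geom_smooth() prints when it has to guess the formula.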