Expand on docs/example for cases with non-equal-width bins in `stat_bin()` (#6151)

* Docs/example for non-equal-width bins

* Update R/geom-histogram.R

Co-authored-by: Teun van den Brand <[email protected]>

* Update R/geom-histogram.R

Co-authored-by: Teun van den Brand <[email protected]>

* move (count / width) note to details

---------

Co-authored-by: Teun van den Brand <[email protected]>
mattansb and teunbrand authored Oct 25, 2024
1 parent 5e62f0c commit f12b73d
Showing 2 changed files with 36 additions and 0 deletions.
18 changes: 18 additions & 0 deletions R/geom-histogram.R
@@ -17,6 +17,12 @@
#' one change at a time. You may need to look at a few options to uncover
#' the full story behind your data.
#'
#' By default, the _height_ of the bars represents the counts within each bin.
#' However, this can produce misleading plots in some situations (e.g., when
#' non-equal-width bins are used). In such cases it may be preferable to have
#' the _area_ of the bars represent the counts, by setting
#' `aes(y = after_stat(count / width))`. See the example below.
#'
#' In addition to `geom_histogram()`, you can create a histogram plot by using
#' `scale_x_binned()` with [geom_bar()]. This method by default plots tick marks
#' in between each bar.
@@ -63,6 +69,18 @@
#' ggplot(diamonds, aes(price, after_stat(density), colour = cut)) +
#'   geom_freqpoly(binwidth = 500)
#'
#'
#' # With non-equal-width bins, the area of the bars (not the height) should
#' # represent the counts.
#' # Here we use 10 equiprobable bins:
#' price_bins <- quantile(diamonds$price, probs = seq(0, 1, length.out = 11))
#'
#' ggplot(diamonds, aes(price)) +
#'   geom_histogram(breaks = price_bins, color = "black") # misleading (height = count)
#'
#' ggplot(diamonds, aes(price, after_stat(count / width))) +
#'   geom_histogram(breaks = price_bins, color = "black") # area = count
#'
#' if (require("ggplot2movies")) {
#' # Often we don't want the height of the bar to represent the
#' # count of observations, but the sum of some other variable.
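The `count / width` note in the first hunk can be checked numerically without ggplot2. The following is a minimal base-R sketch and is not part of the commit: it uses simulated data (`x`, an assumption standing in for `diamonds$price`) and recomputes by hand what `after_stat(count / width)` is described as doing. The names `x`, `bins`, `height`, and `area` are illustrative only.

```r
# Simulated right-skewed data; an assumption standing in for diamonds$price.
set.seed(1)
x <- rexp(1000, rate = 1 / 4000)

# 10 equiprobable bins, built the same way as price_bins in the example.
breaks <- quantile(x, probs = seq(0, 1, length.out = 11))

# Count observations per bin; include.lowest keeps the minimum in the first bin.
bins  <- cut(x, breaks = breaks, include.lowest = TRUE)
count <- as.vector(table(bins))
width <- diff(unname(breaks))

# Bar height under y = count / width, and the resulting bar area.
height <- count / width
area   <- height * width

# Widths differ a lot across bins, so equal counts give very different heights;
# the areas, however, recover the counts and sum to the sample size.
print(data.frame(count = count, width = round(width, 1), height = signif(height, 3)))
print(sum(area))
```

With equiprobable bins the counts are (nearly) equal, so a height = count plot looks flat regardless of the data's shape, while the height = count / width version shows where the observations are concentrated.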
18 changes: 18 additions & 0 deletions man/geom_histogram.Rd

Some generated files are not rendered by default.