Histograms and Bar plots
Syntax
The relevant commands here are hist and hist!, bar and bar!.
The general syntax is:
bar(data_to_plot...; options...)
Data formats
For bars, the accepted data formats are essentially the same as for line and scatter plots (see the corresponding section), except that an implicit function is not allowed for bar or hist.
For instance:
data = [1 2; 1 2; 5 7; 2 3]
bar(data)
For histograms, one difference is that only one can be drawn at a time, so the syntax is always hist(x; opts...) where x is a vector. For instance:
data = exp.(randn(200)/5)
hist(data; nbins=20)
Styling options
General histogram options
- horizontal [horiz or horizontal]: takes a boolean indicating the orientation of the histogram.
data = randn(100)
hist(data; horiz=true)
- number of bins [bins or nbins]: takes a positive integer indicating the number of bins that should be used (the default uses Sturges' formula).
- scaling [scaling]: takes a string describing how the bins should be scaled.
Value | Comment |
---|---|
"none" or "count" | number of entries in a range |
"pdf" | area covered by the bins equals one |
"prob" or "probability" | count divided by the overall number of entries |
If you want to overlay a probability density function on a histogram, "pdf" is usually the scaling you will want.
x = randn(500)
hist(x; nbins=50, scaling="pdf")
plot!(x -> exp(-x^2/2)/sqrt(2π), -3, 3)
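To make the three scalings in the table concrete, here is a minimal sketch in plain Julia of the arithmetic they describe. The helpers sturges_bins and bin_counts are hypothetical and written only for illustration: they are not part of the plotting package, and the package's own binning (bin edges, rounding in Sturges' formula) may differ in detail.

```julia
# Textbook form of Sturges' rule: ceil(log2(n)) + 1 bins for n samples.
# Assumption: the library's exact rounding may differ.
sturges_bins(n::Integer) = ceil(Int, log2(n)) + 1

# Hypothetical helper: count the entries of x falling in each of `nbins`
# equal-width bins spanning [minimum(x), maximum(x)].
# Assumes x is not constant (otherwise the bin width would be zero).
function bin_counts(x::AbstractVector{<:Real}, nbins::Integer)
    lo, hi = extrema(x)
    width = (hi - lo) / nbins
    counts = zeros(Int, nbins)
    for v in x
        # clamp so that the maximum value lands in the last bin
        i = clamp(floor(Int, (v - lo) / width) + 1, 1, nbins)
        counts[i] += 1
    end
    return counts, width
end

x = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 4.0]
counts, width = bin_counts(x, 4)

scaled_count = counts                          # "none" / "count"
scaled_prob  = counts ./ length(x)             # "prob" / "probability"
scaled_pdf   = counts ./ (length(x) * width)   # "pdf"

total_prob = sum(scaled_prob)            # bar heights sum to one
total_area = sum(scaled_pdf .* width)    # bar areas sum to one
```

With "prob" the bar heights sum to one, whereas with "pdf" it is the total area of the bars that equals one, which is why "pdf" lines up with a density curve drawn on the same axes.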
General bar options
- horizontal [horiz or horizontal]: same as for histograms.
- stacked [stacked]: takes a boolean indicating whether to stack the bars (true) or put them side by side (false) when drawing multiple bars. Note that when stacking bars, it is expected that subsequent bars are increasing (so for instance 7,8,10 and not 7,5,10); see the example below:
# percentages
data = [30 40 30; 50 25 25; 30 30 40; 10 50 40]
# cumulative sum so that columns increase
data_cs = cumsum(data; dims=2)
bar(data_cs; stacked=true, fills=["midnightblue", "lightseagreen", "lightsalmon"])
- bar width [width, bwidth or barwidth]: takes a positive number indicating the width of the bars.
data = [10, 50, 30]
bar(data; width=1, fill="hotpink")
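The requirement that stacked bars be increasing can be checked before plotting. The sketch below uses only base Julia; is_stackable is a hypothetical helper named for this illustration and is not provided by the package.

```julia
# Percentages, one row per stacked bar, one column per segment
data = [30 40 30; 50 25 25; 30 30 40; 10 50 40]

# Cumulative sum along each row so that successive values increase
data_cs = cumsum(data; dims=2)

# Hypothetical helper: true if every row is non-decreasing from left to right
is_stackable(m::AbstractMatrix) = all(diff(m; dims=2) .>= 0)

ok_cs  = is_stackable(data_cs)   # rows such as 30, 70, 100 are increasing
ok_raw = is_stackable(data)      # rows such as 30, 40, 30 are not

# The individual segment heights can be recovered from the cumulative values
segments = hcat(data_cs[:, 1], diff(data_cs; dims=2))
```

Here segments reproduces the original percentages, so storing the cumulative form loses no information.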
Bar style options (Bar and Histogram)
Both histograms and bars share styling options for the style of the bars (essentially: their edge and fill colour). Note that since bars can be drawn in groups, each option can take a vector of values corresponding to the number of bars drawn. If a single value is passed, all bars will share that option value.
- edge colour [ecol, edgecol, edgecolor, ecols, edgecols or edgecolors]: takes a colour for the edge of the bars. If the edge colour is specified but not the fill colour, then the fill colour is set to white.
hist(randn(100); col="powderblue")
- fill colour [col, color, cols, colors, fill or fills]: takes a colour for the filling of the bars. If the fill colour is specified but not the edge colour, then the edge colour is set to white.
hist(randn(100); ecol="red", fill="wheat")
Notes
Missing, Inf or NaN values
- For histograms, only missing values are allowed; attempting to plot a histogram with Inf or NaN will throw an error. If you still want to plot such data, pre-filter your vector of values before trying to display it.
- For bars, the same rule as for plot applies: these values will be ignored (meaning that some bars will not show).
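One way to pre-filter a vector before drawing a histogram is sketched below in base Julia. The helper keep_plottable is a hypothetical name used for this illustration only; it drops Inf, -Inf and NaN while keeping missing entries, which histograms accept.

```julia
# Hypothetical helper: keep missing entries and finite numbers, drop Inf and NaN.
# ismissing is tested first because isfinite(missing) returns missing, not a Bool.
keep_plottable(x) = filter(v -> ismissing(v) || isfinite(v), x)

raw = [1.0, NaN, 2.5, missing, Inf, -Inf, 3.0]
clean = keep_plottable(raw)
```

The filtered vector clean can then be passed to hist in the usual way.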
Modifying the underlying data
The remark made in the line and scatter plots section about in-place modification of the data applies here as well.