plot(iris)
Applied Statistics – A Practical Course
2024-12-16
The examples were tested with R 4.3.x and R 4.4.2
Data sets of this presentation:
The `radiation.csv` data set contains derived data from the German Weather Service (http://www.dwd.de), station Dresden-Klotzsche. Missing data were interpolated.
If a data set is missing, please let me know.
iris is a built-in data set in R (see next slide).
plot is a so-called generic function that automatically decides how to plot.
The famous (Fisher’s or Anderson’s) iris data set contains measurements (in centimeters) of the variables sepal length, sepal width, petal length and petal width of 50 flowers from each of 3 species of iris: Iris setosa, I. versicolor, and I. virginica.
See ?iris in R’s online help.
A column of a data.frame is accessed with $.
with() saves dollars
R allows changing the style and color of plotting symbols:
col: color, can be one of 8 default colors or a user-defined color
pch: plotting character, can be one of the built-in symbols (numbers 0 to 25) or a quoted character
cex: character expansion: size of a plotting character
lwd: border width of the symbol
lty: line type ("blank", "solid", "dashed", "dotted", "dotdash", "longdash", "twodash") or a number from 0…6, or a string with up to 8 numbers for drawing and skipping (e.g. "4224").
lwd: line width (a number, defaults to 1)
col=iris$Species: works because Species is a factor
las=1: numbers on y-axis upright (try: 0, 1, 2 or 3)
log: may be used to transform axes (e.g. log="x", log="y", log="xy")
mycolors <- c("blue", "red", "cyan")
plot(iris$Sepal.Length, iris$Petal.Length, xlim=c(0, 8), ylim=c(2,8),
col=mycolors[iris$Species], pch = 16,
xlab="Sepal Length (cm)", ylab="Petal Length (cm)", main="Iris Data",
las = 1)
legend("topleft", legend=c("Iris setosa", "Iris versicolor", "Iris virginica"),
col=mycolors, pch=16)
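The parameters listed above can be combined freely. The following is a minimal sketch, not part of the original slides: the names species_col and species_pch and the chosen symbol numbers are illustrative assumptions. It also shows how with() avoids repeating iris$.

```r
# One color and one plotting symbol per species (illustrative choices)
species_col <- c("blue", "red", "cyan")
species_pch <- c(1, 2, 16)

# with() evaluates the plot call inside the data frame,
# so the columns can be used without the iris$ prefix.
# Indexing with the factor Species picks the entry for each species.
with(iris, plot(Sepal.Length, Petal.Length,
                col = species_col[Species], pch = species_pch[Species],
                cex = 1.2, las = 1, log = "xy",
                xlab = "Sepal Length (cm)", ylab = "Petal Length (cm)"))

legend("topleft", legend = levels(iris$Species),
       col = species_col, pch = species_pch)
```

Indexing a vector with a factor uses the internal integer codes of the factor, which is why the order of colors and symbols follows levels(iris$Species).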
See ?legend for more options (e.g. line styles, position of the legend).
par()
par(lwd=2): all lines have double width
par(mfrow=c(2,2)): subdivides the graphics area into 2 x 2 fields
par(las=1): numbers at the y-axis upright
par(mar=c(5, 5, 0.5, 0.5)): changes figure margins (bottom, left, top, right)
par(cex=2): increases font size
Read the ?par help page!
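A short sketch of how global parameters might be set and reset again. The specific values are illustrative assumptions. It relies on the documented behavior that par() returns the previous values of the parameters it changes.

```r
# Change font size, margins and axis label orientation;
# par() returns the old values of exactly these parameters
opar <- par(cex = 1.5, mar = c(5, 5, 0.5, 0.5), las = 1)

plot(iris$Sepal.Length, iris$Petal.Length,
     xlab = "Sepal Length (cm)", ylab = "Petal Length (cm)")

# Restore the previous settings
par(opar)
```

Resetting at the end keeps the changed settings from leaking into later plots on the same device.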
Font size (cex), margins (mar) and axis label orientation (las)
opar stores the previous parameters and allows resetting them
Format | Type | Usage | Notes |
---|---|---|---|
PNG | bitmap | general purpose | fixed size, use at least 300 pixels per inch |
JPEG | bitmap | photographs | not good for R images |
TIFF | bitmap | PNG is easier | outdated, required by some journals |
BMP | bitmap | not recommended | outdated, needs huge memory |
Metafile | vector | Windows standard format | easy to use, quality varies |
SVG | vector | can be edited | allows editing with Inkscape |
EPS | vector | PDF is easier | required by some journals |
PDF | vector | best quality | perfect for LaTeX, R Markdown and Quarto, MS Office requires conversion |
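As an illustration of the table, the following sketch writes the same plot once as a bitmap (PNG) and once as a vector graphic (PDF). The file names and the figure size are assumptions chosen for the example, and png() must be supported by the local R installation.

```r
# Hypothetical output files in a temporary directory
png_file <- file.path(tempdir(), "iris.png")
pdf_file <- file.path(tempdir(), "iris.pdf")

# Bitmap: size given in inches, nominal resolution 300 pixels per inch
png(png_file, width = 6, height = 4, units = "in", res = 300)
plot(iris$Sepal.Length, iris$Petal.Length, las = 1)
dev.off()

# Vector format: size in inches, no resolution required
pdf(pdf_file, width = 6, height = 4)
plot(iris$Sepal.Length, iris$Petal.Length, las = 1)
dev.off()
```

Each graphics device has to be closed with dev.off(), otherwise the file may remain incomplete.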
Bitmap formats
Vector formats
\(\rightarrow\) Inkscape, SumatraPDF, ImageMagick
Use res to change nominal resolution and font size
library(ggplot2)
data(iris)
# define a theme with user-specified font sizes
figure_theme <- theme(
  axis.text    = element_text(size = 12),
  axis.title   = element_text(size = 12, face = "bold"),
  legend.title = element_text(size = 12, face = "bold"),
  legend.text  = element_text(size = 12))
# ggplots can be stored in a variable
p <- iris |>
  ggplot(aes(Petal.Length, Petal.Width, colour = Species)) +
  geom_point() + figure_theme
Print to a file:
Print to the screen:
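The two cases can be sketched as follows. This is an assumption about a typical workflow, not necessarily the code of the original slides. The file name is hypothetical, and the plot is rebuilt here without the theme so that the example is self-contained.

```r
library(ggplot2)

# A stored ggplot object, as in the example above
p <- ggplot(iris, aes(Petal.Length, Petal.Width, colour = Species)) +
  geom_point()

# Print to a file: open a device, print the plot, close the device
out_file <- file.path(tempdir(), "iris_ggplot.pdf")  # hypothetical file name
pdf(out_file, width = 8, height = 4)
print(p)
dev.off()

# Print to the screen: at the console, typing p is sufficient;
# inside scripts, loops or functions an explicit print() is needed
print(p)
```

As an alternative, ggplot2 offers ggsave(), which selects the device from the file extension.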
More about themes can be found in the books of Chang (2024) and Wickham et al. (in press).
Note: The data set contains derived data from the German Weather Service (http://www.dwd.de), station Dresden. Missing data were interpolated.
as.Date (dates only)
as.POSIXct (date and time)
format and strptime
tseries and zoo
%Y | year with century |
%m | month as decimal number |
%d | day of the month |
%H | hours as decimal number (00-23) |
%M | minute as decimal number (00-59) |
%S | second as decimal number (00-59) |
%j | day of year (001-366) |
%u | weekday, Monday is 1 |
radiation$year <- format(radiation$Date, "%Y")
radiation$month <- format(radiation$Date, "%m")
radiation$doy <- format(radiation$Date, "%j")
radiation$weekday <- format(radiation$Date, "%u")
head(radiation)
        date rad interpolated       Date year month doy weekday
1 1981-01-01 197            0 1981-01-01 1981    01 001       4
2 1981-01-02  89            0 1981-01-02 1981    01 002       5
3 1981-01-03  49            0 1981-01-03 1981    01 003       6
4 1981-01-04 111            0 1981-01-04 1981    01 004       7
5 1981-01-05 161            0 1981-01-05 1981    01 005       1
6 1981-01-06  55            0 1981-01-06 1981    01 006       2
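The conversion step can be tried without the radiation file. The following sketch uses a small data frame named radiation_demo with made-up values (both the name and the values are assumptions for illustration) and applies the format codes from the table above.

```r
# Small synthetic data set, values are invented for illustration
radiation_demo <- data.frame(
  date = c("1981-01-01", "1981-01-02", "1981-12-31"),
  rad  = c(200, 90, 120)
)

# Character strings in ISO format (year-month-day) convert directly
radiation_demo$Date <- as.Date(radiation_demo$date)

# format() returns character strings; convert to numbers where needed
radiation_demo$year    <- format(radiation_demo$Date, "%Y")
radiation_demo$doy     <- as.numeric(format(radiation_demo$Date, "%j"))
radiation_demo$weekday <- format(radiation_demo$Date, "%u")

# Date and time: as.POSIXct with an explicit time zone
t1 <- as.POSIXct("1981-01-01 13:30:00", tz = "UTC")
hour_minute <- format(t1, "%H:%M")

radiation_demo
```

Note that format() always returns character values, so a numeric day of year requires as.numeric().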
aggregate
Syntax
Example
apply
Most functions that support a formula argument (containing ~) allow specifying the data frame with a data argument.
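A minimal sketch of the formula interface with a data argument, here using the built-in iris data instead of the radiation data. The variable name species_means is an assumption for illustration.

```r
# Mean petal length per species;
# the formula reads "Petal.Length grouped by Species"
species_means <- aggregate(Petal.Length ~ Species, data = iris, FUN = mean)
species_means

# The same data argument works for other formula interfaces,
# for example boxplot()
boxplot(Petal.Length ~ Species, data = iris, las = 1)
```

Because the data frame is passed separately, the column names in the formula need no iris$ prefix.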
ggplot-Example
More presentations
Books
Manuals
More details in the official R manuals, especially in An Introduction to R
Videos
Many videos can be found on YouTube, on the Posit webpage and elsewhere.
This tutorial was made with Quarto
Contact
Author: tpetzoldt +++ Homepage +++ GitHub page