xyplot {lattice} R Documentation

Common Bivariate Trellis Plots

Description

These are the most commonly used Trellis functions to look at pairs of variables. By far the most common is xyplot, designed mainly for two continuous variates, which produces Conditional Scatterplots. The others are useful when one of the variates is a factor. See details below.

Most of the arguments documented here also apply to many of the other Trellis functions. They are not described in detail elsewhere, so this page should be considered the canonical documentation for such arguments.

Note that any arguments passed to these functions and not recognized by them will be passed to the panel function. Most predefined panel functions have arguments that customize their output. These arguments are described only in the help pages for these panel functions, but can usually be supplied as arguments to the high level plot.

Usage

xyplot(formula,
       data = parent.frame(),
       panel = "panel.xyplot",
       aspect = "fill",
       as.table = FALSE,
       between,
       groups,
       key,
       layout,
       main,
       page,
       par.strip.text,
       prepanel,
       scales,
       skip,
       strip = "strip.default",
       sub,
       xlab,
       xlim,
       ylab,
       ylim,
       ...,
       subscripts,
       subset)
barchart(formula, data, panel = "panel.barchart", box.ratio = 2,
         horizontal, ...)
bwplot(formula, data, panel = "panel.bwplot", box.ratio = 1,
       horizontal, ...)
dotplot(formula, data, panel = "panel.dotplot", ...)
stripplot(formula, data, panel = "panel.stripplot",
          jitter = FALSE, factor = .5, box.ratio, ...)

Arguments

formula a formula describing the form of the conditioning plot. The formula is generally of the form y ~ x | g1 * g2 * ..., indicating that plots of y (on the y axis) versus x (on the x axis) should be produced conditional on the given variables g1,g2,.... However, the given variables g1,g2,... may be omitted. For S-Plus compatibility, the formula can also be written as y ~ x | g1 + g2 + ....
The given variables g1,g2,... must be either factors or shingles. (Shingles are a way of processing numeric variables for use in conditioning; see the documentation of shingle for details. Like factors, they have a "level" attribute, which is used in producing the conditioning plots.) For each unique combination of the levels of the conditioning variables g1, g2, ..., a separate panel is produced using the points (x,y) for the subset of the data defined by that combination.
Numeric conditioning variables are converted to shingles by the function shingle (however, using equal.count might be more appropriate in many cases) and character vectors are coerced to factors. The formula can involve expressions, e.g. sqrt(),log().
The x and y variables both need to be numeric in xyplot (they are coerced to numeric if not). In the other four functions documented here, exactly one of x and y needs to be numeric, and the other a factor or shingle. Which of the two is treated as the factor or shingle is determined by the horizontal argument: if horizontal=TRUE, then y will be coerced to be a factor or shingle, otherwise x. The default value of horizontal is FALSE if x is a factor or shingle, TRUE otherwise. (The functionality provided by horizontal=FALSE is not S-compatible.)
Any point with a missing value (NA) in any of the variates involved is omitted from the plot.
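The formula forms described above can be sketched as follows. This is a minimal illustration, assuming a current lattice installation; the mtcars data set (from base R) and the object names wt3, p1 and p2 are choices made for this sketch and do not appear elsewhere on this page.

```r
library(lattice)

## A numeric variable turned into a shingle with three slightly
## overlapping intervals, using equal.count as suggested above.
wt3 <- equal.count(mtcars$wt, number = 3, overlap = 0.1)

## Expressions are allowed in the formula; 'cyl' is numeric, so it is
## wrapped in factor() to give one panel per distinct value.
p1 <- xyplot(log(mpg) ~ sqrt(hp) | factor(cyl), data = mtcars)

## Two conditioning variables: one panel for each combination of a
## factor level and a shingle interval.
p2 <- xyplot(mpg ~ hp | factor(am) * wt3, data = mtcars)
```

Nothing is drawn until the returned objects are printed, for example with print(p2).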
data a data frame containing values for any variables in the formula, as well as groups and subset if applicable. By default the environment where the function was called from is used.
box.ratio gives the ratio of the width of the rectangles to the inter-rectangle space.
horizontal logical, applicable to bwplot, dotplot, barchart and stripplot. Determines which of x and y is to be a factor or shingle (y if TRUE, x otherwise). Defaults to FALSE if x is a factor or shingle, TRUE otherwise. This argument is used to process the arguments to these high level functions, but more importantly, it is passed as an argument to the panel function, which is supposed to use it as appropriate.
A potentially useful component of scales in this case might be abbreviate=TRUE, in which case long labels which would usually overlap will be abbreviated. scales could also contain a minlength argument in this case, which would be passed to the abbreviate function.
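The two orientations controlled by horizontal can be sketched as below. This is an illustration only, assuming a current lattice installation; the object names p.h and p.v are hypothetical, and horizontal is spelled out explicitly although the default rule described above would pick the same values.

```r
library(lattice)

## y (voice.part) is the factor: boxes are laid out horizontally.
p.h <- bwplot(voice.part ~ height, data = singer,
              horizontal = TRUE, xlab = "Height (inches)")

## x (voice.part) is the factor: boxes are laid out vertically.
## Abbreviated labels reduce overlap along the x-axis.
p.v <- bwplot(height ~ voice.part, data = singer,
              horizontal = FALSE, ylab = "Height (inches)",
              scales = list(x = list(abbreviate = TRUE, minlength = 5)))
```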
jitter logical specifying whether the values should be jittered by adding random noise in stripplot.
factor numeric controlling the amount of jitter. Possibly has the inverse effect compared to S.
panel Once the subset of rows defined by each unique combination of the levels of the conditioning variables is obtained (see above), the corresponding x and y variables (or some other variables, as appropriate, in the case of other functions) are passed on to be plotted in each panel. The actual plotting is done by the function specified by the panel argument. Each high level function has its own default panel function.
The panel function can be a function object or a character string giving the name of a predefined function. (The latter is preferred when possible, especially when the trellis object returned by the high level function is to be stored and plotted later.)
Much of the power of Trellis Graphics comes from the ability to define customized panel functions. A panel function appropriate for the functions described here would usually expect arguments named x and y, which would be provided by the conditioning process. It can also have other arguments. (It might be useful to know in this context that all arguments passed to a high level Trellis function such as xyplot that are not recognized by it are passed through to the panel function. It is thus generally good practice when defining panel functions to allow a ... argument.) Such extra arguments typically control graphical parameters, but other uses are also common. See documentation for individual panel functions for specifics.
Technically speaking, panel functions must be written using Grid graphics functions. However, knowledge of Grid is usually not necessary to construct new custom panel functions, because there are several predefined panel functions which can help, for example panel.grid and panel.loess. There are also some grid-compatible replacements of base R graphics functions useful for this purpose, such as llines. (Note that the corresponding base R graphics functions like lines would not work.) These are usually sufficient to convert existing custom panel functions written for S-Plus.
One case where a bit more is required of the panel function is when the groups argument is not null. In that case, the panel function should also accept arguments named groups and subscripts (see below for details). A very useful panel function predefined for use in such cases is panel.superpose, which can be combined with different panel.groups functions. See the examples section for an interaction plot constructed this way.
panel.xyplot has an argument called type which is often useful (see separate documentation). Panel functions for bwplot and friends should have an argument called horizontal to account for the cases when x is the factor or shingle.
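A custom panel function built from predefined pieces, as described above, might look like the following sketch. It assumes a current lattice installation; the function name panel.with.median, its median.col argument and the use of the mtcars data set are hypothetical and serve only as illustration.

```r
library(lattice)

## Hypothetical panel function: reference grid, the points, and a
## horizontal line at the median of y in each panel.
## '...' collects arguments that the high level function does not
## recognize (here pch) so they can be handed on to panel.xyplot.
panel.with.median <- function(x, y, ..., median.col = "red") {
    panel.grid(h = -1, v = -1)
    panel.xyplot(x, y, ...)
    ## llines is the grid-compatible replacement for lines
    llines(x = range(x), y = rep(median(y), 2), col = median.col)
}

## median.col and pch are not arguments of xyplot, so they are
## passed through to the panel function.
p <- xyplot(mpg ~ wt | factor(gear), data = mtcars,
            panel = panel.with.median,
            median.col = "blue", pch = 16)
```

Keeping the ... argument in the panel function is what allows graphical parameters such as pch to be supplied directly in the xyplot call.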
aspect controls physical aspect ratio of the panels (same for all the panels). It can be specified as a ratio (vertical size/horizontal size) or as a character string. Legitimate values are "fill" (the default) which tries to make the panels as big as possible to fill the available space, and "xy", which tries to compute the aspect based on the 45 degree banking rule (see Visualizing Data by William S. Cleveland for details).
Of the available functions, banking is sensible only for xyplot. If a prepanel function is specified, the results are used to compute the aspect, otherwise some internal computations are done inside each function. While this is allowed for all functions, its behaviour is not defined for any function other than xyplot (usually an aspect ratio of 1 results in such cases).
The current implementation of banking is not very sophisticated, but is not totally vague either. See banking for details.
as.table logical that controls the order in which panels should be plotted: if FALSE, panels are drawn left to right, bottom to top (graph), if TRUE, left to right, top to bottom (matrix).
between a list with components x and y (both usually 0 by default), numeric vectors specifying the space between the panels (units are character heights). x and y are repeated to account for all panels in a page and any extra components are ignored. The result is used for all pages in a multipage display. (In other words, it is not possible to use different between values for different pages).
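The effect of as.table and between can be tried with a sketch like the one below, assuming a current lattice installation. The mtcars data set and the object name p are choices made for this illustration.

```r
library(lattice)

## Three levels of cyl by two levels of am give a 3 x 2 array of panels.
## as.table = TRUE draws them left to right, top to bottom;
## between adds space between columns (x) and rows (y),
## in units of character heights.
p <- xyplot(mpg ~ wt | factor(cyl) * factor(am), data = mtcars,
            as.table = TRUE,
            between = list(x = 0.5, y = 1))
```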
groups used typically with panel=panel.superpose to allow display controls (color, lty etc) to vary according to a grouping variable. Formally, if groups is specified, then groups along with subscripts is passed to the panel function, which is expected to handle these arguments.
key A list of arguments that define a legend to be drawn on the plot.
The position of the legend can be controlled in either of two possible ways. If a component called space is present, the key is positioned outside the plot region, on one of the four sides, determined by the value of space, which can be one of "top", "bottom", "left" and "right". Alternatively, the key can be positioned inside the plot region by specifying components x, y and corner. x and y determine the location of the corner of the key given by corner, which can be one of c(0,0), c(1,0), c(1,1), c(0,1), which denote the corners of the unit square. x and y must be numbers between 0 and 1, giving coordinates with respect to the whole display area.
The key essentially consists of a number of columns, possibly divided into blocks, each containing some rows. The contents of the key are determined by (possibly repeated) components named "rectangles", "lines", "points" or "text". Each of these must be a list with relevant graphical parameters (see later) controlling its appearance. The key list itself can contain graphical parameters; these are used if the relevant graphical parameters are omitted from the other components.
The length (number of rows) of each such column is taken to be the largest of the lengths of the graphical components, including the ones specified outside. The "text" component has to have a character vector as its first component, and the length of this vector determines the number of rows.
The graphical components that can be included in key (and also in the components named "text", "lines", "points" and "rectangles" when appropriate) are cex=1, col="black", lty=1, lwd=1, font=1, pch=8, adj=0, type="l", size=5, angle=0, density=-1. adj, angle, density are unimplemented. size determines the width of columns of rectangles and lines in character widths. type is relevant for lines; "l" denotes a line, "p" denotes a point, and "b" and "o" each denote both together.
Other possible components of key are:
between: numeric vector giving the amount of space (character widths) surrounding each column (split equally on both sides),
title: character, title of the key,
cex.title: character size multiplier for the title,
background: background color of the key, defaults to the default background,
border: color of border, black if TRUE, defaults to FALSE (no border drawn),
transparent=FALSE: logical, whether key area should be cleared,
columns: the number of column-blocks the key is to be divided into, which are drawn side by side,
between.columns: space between column blocks, in addition to between,
divide: number of point symbols to divide each line when type is "b" or "o" in lines.
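The two ways of positioning a key described above can be sketched as follows. This is a minimal example assuming a current lattice installation; the mtcars data set, the symbol and colour choices, and all object names are hypothetical and used only for illustration.

```r
library(lattice)

## Hypothetical symbols and colours, one per level of 'cyl' (4, 6, 8).
cyl.pch <- c(1, 2, 3)
cyl.col <- c("blue", "red", "darkgreen")

## Key outside the plot region, on the right-hand side.
key.outside <- list(space = "right", title = "Cylinders",
                    points = list(pch = cyl.pch, col = cyl.col),
                    text = list(c("4", "6", "8")))

p.out <- xyplot(mpg ~ wt, data = mtcars, groups = cyl,
                panel = panel.superpose,
                pch = cyl.pch, col = cyl.col,
                key = key.outside)

## Same key inside the plot region: its top-right corner
## (corner = c(1, 1)) is placed at (x, y) = (0.95, 0.95).
key.inside <- key.outside
key.inside$space <- NULL
key.inside$x <- 0.95
key.inside$y <- 0.95
key.inside$corner <- c(1, 1)

p.in <- xyplot(mpg ~ wt, data = mtcars, groups = cyl,
               panel = panel.superpose,
               pch = cyl.pch, col = cyl.col,
               key = key.inside)
```

The key does not read anything from the plot itself, so the same pch and col values are given both to the key and to the panel function to keep the two consistent.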
layout In general, a Trellis conditioning plot consists of several panels arranged in a rectangular array, possibly spanning multiple pages. layout determines this arrangement.
layout is a numeric vector giving the number of columns, rows and pages in a multipanel display. By default, the number of columns is determined by the number of levels in the first given variable; the number of rows is the number of levels of the second given variable. If there is one given variable, the default layout vector is c(0,n), where n is the number of levels of the given variable. Whenever the first value in the layout vector is 0, the second value is used as the desired number of panels per page and the actual layout is computed from this, taking into account the aspect ratio of the panels and the device dimensions (via par("din")). The number of pages is by default set to as many as are required to plot all the panels. In general, giving a high value of layout[3] is not wasteful because blank pages are never created.
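Some typical layout specifications are sketched below, assuming a current lattice installation. The object names are hypothetical; the barley data set has six sites, so each plot has six panels.

```r
library(lattice)

## Explicit grid: 2 columns by 3 rows on a single page.
p.grid <- dotplot(variety ~ yield | site, data = barley,
                  layout = c(2, 3))

## First value 0: ask for 2 panels per page and let the actual
## arrangement be computed when the plot is drawn.
p.auto <- dotplot(variety ~ yield | site, data = barley,
                  layout = c(0, 2))

## One column, two rows per page, spread over three pages.
p.pages <- dotplot(variety ~ yield | site, data = barley,
                   layout = c(1, 2, 3))
```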
main character string for main title to be placed on top of each page. Defaults to NULL. Can be a character string, or a list with components label, cex, col, font. The label tag can be omitted if it is the first element of the list.
page a function of one argument (page number) to be called after drawing each page. The function must be "grid-compliant", and is called with the whole display area as the default viewport.
par.strip.text list of graphical parameters to control the strip text, possible components are col, cex, font, lines. The first three control graphical parameters while the last is a means of altering the height of the strips. This can be useful, for example, if the strip labels (derived from factor levels, say) are double height (i.e., contain "\n" characters) or if the default height seems too small or too large.
prepanel function that takes arguments x,y (usually) and returns a list containing four components xlim, ylim, dx, dy. If xlim and ylim are not explicitly specified (possibly as components in scales), then the actual limits of the panels are guaranteed to include the limits returned by the prepanel function. This happens globally if the relation component of scales is "same", and on a panel by panel basis otherwise.
The dx and dy components are used for banking computations in case aspect is specified as "xy". See documentation for the function banking for details regarding how this is done.
The return value of the prepanel function need not have all the components named above; in case some are missing, they are replaced by the usual componentwise defaults.
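A prepanel function that returns only some of the components can be sketched as follows. The example assumes a current lattice installation; the function name prepanel.with.zero and the use of the mtcars data set are hypothetical.

```r
library(lattice)

## Hypothetical prepanel function: only ylim is returned, so xlim,
## dx and dy fall back to their usual defaults. The returned range
## always contains 0, so the y-axis limits are guaranteed to include it.
prepanel.with.zero <- function(x, y, ...) {
    list(ylim = range(0, y, finite = TRUE))
}

p <- xyplot(mpg ~ wt | factor(cyl), data = mtcars,
            prepanel = prepanel.with.zero)
```

Because the default relation is "same", the limits returned for the individual panels are combined, and all panels share a y-axis that starts at or below 0.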
scales list determining how the x- and y-axes (tick marks and labels) are drawn. The list contains parameters in name=value form, and may also contain two other lists called x and y of the same form (described below). Components of x and y affect the respective axes only, while those in scales affect both. (When parameters are specified in both lists, the values in x or y are used.) The components are:
relation: determines limits of the axis. Possible values are "same" (default), "free" and "sliced". For relation="same", the same limits (determined by xlim, ylim, scales$limits etc) are used for all the panels. For relation="free", limits for each panel are determined by the points in that panel. Behaviour for relation = "sliced" is similar, except that the length (max - min) of the scales is constrained to remain the same across panels. The values of xlim etc, even if specified explicitly, are ignored if relation is different from "same".
tick.number: Suggested number of ticks.
draw = TRUE: logical, whether to draw the axis at all.
alternating = TRUE/c(1,2): logical specifying whether axes alternate from one side of the group of panels to the other. For more accurate control, alternating can be a vector (replicated to be as long as the number of rows or columns per page) consisting of the possible numbers 0=do not draw, 1=bottom/left and 2=top/right. alternating applies only when relation="same".
limits: same as xlim and ylim.
at: location of tick marks along the axis (in native coordinates).
labels: Labels to go along with at
cex: factor to control character sizes for axis labels.
font: font face for axis labels (integer 1-5).
tck: factor to control length of tick marks.
col: color of ticks and labels.
rot: Angle by which the axis labels are to be rotated.
abbreviate: logical, whether to abbreviate the labels using abbreviate. Can be useful for long labels (e.g., in factors), especially on the x-axis.
minlength: argument to abbreviate if abbreviate=TRUE.
log: Use a log scale. Defaults to FALSE. Other possible values are any number that works as a base for taking logarithms, TRUE (equivalent to 10), and "e" (for natural logarithm).
Note: the "axs" component is ignored. Much of the function of scales is accomplished by pscales in splom.
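The interplay between the top-level components of scales and the x and y sublists can be sketched as follows, assuming a current lattice installation. The mtcars data set and the object name p are choices made for this illustration.

```r
library(lattice)

## cex and tick.number are given at the top level and apply to both
## axes; the x and y sublists override or extend them per axis.
p <- xyplot(mpg ~ hp | factor(cyl), data = mtcars,
            scales = list(cex = 0.8, tick.number = 4,
                          x = list(log = 10, rot = 45),   # log10 x-axis, rotated labels
                          y = list(relation = "free")))   # separate y limits per panel
```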
skip logical vector (default FALSE), replicated to be as long as the number of panels in each page. If TRUE, nothing is plotted in the corresponding panel. Useful for arranging plots in an informative manner.
strip logical flag or function. If FALSE, strips are not drawn. Otherwise, strips are drawn using the strip function, which defaults to strip.default. See documentation of strip.default to see the form of a strip function.
sub character string for a subtitle to be placed at the bottom of each page. See entry for main for finer control options.
subscripts logical specifying whether or not a vector named subscripts should be passed to the panel function. Defaults to FALSE, unless groups is specified, or if the panel function accepts an argument named subscripts. (One should be careful when defining the panel function on-the-fly.)
subset logical vector (can be specified in terms of variables in data). Everything will be done on the data points for which subset=TRUE. In case subscripts is TRUE, the subscripts will correspond to the original observations.
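The relationship between subset and subscripts can be sketched as follows, assuming a current lattice installation. The mtcars data set and the object name p are hypothetical choices for this illustration.

```r
library(lattice)

## Only the rows with cyl == 4 are plotted. Because the panel function
## has an argument named subscripts, it receives the row numbers of
## those points in the original data frame, which are used here to
## label each point with its row name.
p <- xyplot(mpg ~ wt, data = mtcars,
            subset = cyl == 4,
            panel = function(x, y, subscripts, ...) {
                panel.xyplot(x, y, ...)
                ltext(x, y, labels = rownames(mtcars)[subscripts],
                      cex = 0.6, pos = 3)
            })
```

Since the subscripts refer to the original observations, rownames(mtcars)[subscripts] picks the correct labels even though most rows have been dropped by subset.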
xlab character string giving label for the x-axis. Defaults to the expression for x in formula. Specify as NULL to omit the label altogether. Fine control is possible, see entry for main.
xlim numeric vector of length 2 giving minimum and maximum for x-axis.
ylab character string giving label for the y-axis. Defaults to the expression for y in formula. Fine control is possible, see entry for main.
ylim numeric vector of length 2 giving minimum and maximum for y-axis.
... other arguments, passed to the panel function

Details

These are for the most part descriptions generally applicable to all high level Lattice functions, with special emphasis on xyplot, bwplot etc. For other functions, their individual documentation should be studied in addition to this.

Author(s)

Deepayan Sarkar deepayan@stat.wisc.edu

See Also

shingle, banking, panel.xyplot, panel.bwplot, panel.barchart, panel.dotplot, panel.stripplot, panel.superpose, panel.loess, panel.linejoin, strip.default, Lattice

Examples

## Tonga Trench Earthquakes
data(quakes)
Depth <- equal.count(quakes$depth, number=8, overlap=.1)
xyplot(lat ~ long | Depth, data = quakes)

## Examples with data from `Visualizing Data' (Cleveland)
## (obtained from Bill Cleveland's Homepage :
## http://cm.bell-labs.com/cm/ms/departments/sia/wsc/, also
## available at statlib)
data(ethanol)
EE <- equal.count(ethanol$E, number=9, overlap=1/4)
## Constructing panel functions on the fly; prepanel
xyplot(NOx ~ C | EE, data = ethanol,
       prepanel = function(x, y) prepanel.loess(x, y, span = 1),
       xlab = "Compression Ratio", ylab = "NOx (micrograms/J)",
       panel = function(x, y) {
           panel.grid(h=-1, v= 2)
           panel.xyplot(x, y)
           panel.loess(x,y, span=1)
       },
       aspect = "xy")
## banking
data(sunspot)
xyplot(sunspot ~ 1:37, type = "l", aspect = "xy",
       scales = list(y = list(log = TRUE)),
       sub = "log scales")

data(state)
## user defined panel functions
states <- data.frame(state.x77,
                     state.name = dimnames(state.x77)[[1]], 
                     state.region = state.region) 
xyplot(Murder ~ Population | state.region, data = states, 
       groups = as.character(state.name), 
       panel = function(x, y, subscripts, groups)  
       ltext(x=x, y=y, label=groups[subscripts], cex=.7, font=3))
data(barley)
barchart(yield ~ variety | year * site, data = barley, #aspect = 2.5,
         ylab = "Barley Yield (bushels/acre)",
         ## abbreviate long variety names on the x-axis
         scales = list(x = list(abbreviate = TRUE,
                       minlength = 5)))
data(singer)
bwplot(voice.part ~ height, data=singer, xlab="Height (inches)")
dotplot(variety ~ yield | year * site, data=barley)
dotplot(variety ~ yield | site, data = barley, groups = year,
        panel = function(x, y, subscripts, ...) {
            dot.line <- trellis.par.get("dot.line")
            panel.abline(h = y, col = dot.line$col,
                         lty = dot.line$lty)
            panel.superpose(x, y, subscripts, ...)
        },
        key = list(space="right", transparent = TRUE,
                   points=list(pch=trellis.par.get("superpose.symbol")$pch[1:2],
                               col=trellis.par.get("superpose.symbol")$col[1:2]),
                   text=list(c("1932", "1931"))),
        xlab = "Barley Yield (bushels/acre) ",
        aspect=0.5, layout = c(1,6), ylab=NULL)
stripplot(voice.part ~ jitter(height), data = singer, aspect = 1,
          jitter = TRUE, xlab = "Height (inches)")
## Interaction Plot
data(OrchardSprays)
bwplot(decrease ~ treatment, OrchardSprays, groups = rowpos,
       panel = "panel.superpose",
       panel.groups = "panel.linejoin",
       xlab = "treatment",
       key = list(lines = Rows(trellis.par.get("superpose.line"),
                  c(1:7, 1)), 
                  text = list(lab = as.character(unique(OrchardSprays$rowpos))),
                  columns = 4, title = "Row position"))
