rapport ships with a number of (hopefully) handy functions that help you generate reports and write templates.
Detailed documentation based on the up-to-date Rd files is available below. You can also check out the manual of the stable package in PDF format on CRAN.
This dataset contains data gathered in a survey of Internet usage in the Serbian population in the period from April to May 2008. During a 90-day period, 709 valid responses were gathered via an on-line distributed questionnaire.
However, this dataset does not contain the original data, as some random noise was added afterwards in order to demonstrate the functionality of rapport helpers.
Dataset variables can be divided into 3 sets: demographic data, Internet usage aspects and application usage/content preference.
Demographic variables
gender - respondent's gender (factor with 2 levels: "male" and "female")
age - respondent's age
dwell - dwelling (factor with 3 levels: "village", "small town" and "city")
student - is respondent a student? (factor with 2 levels: "no" and "yes")
partner - partnership status (factor with 3 levels: "single", "in a relationship" and "married")
Internet usage aspects
The following variables depict various aspects of Internet usage:
edu - time spent on-line for educational purposes (expressed in hours)
leisure - time spent on-line in leisure time (expressed in hours)
net.required - is Internet access required for your profession? (factor with 5 levels: "never", "rarely", "sometimes", "often" and "always")
net.pay - who pays for Internet access? (factor with 5 levels: "parents", "school/faculty", "employer", "self-funded" and "other")
net.use - how long has the respondent been using the Internet? (ordered factor with 7 levels, ranging from "less than 6 months" to "more than 5 years")
Application usage and on-line content preference
These variables include data on the use of Internet applications and content available on the Internet. Practically, they contain responses from a set of 8 questions on a five-point Likert scale.
chatim - usage of chat and/or instant messaging applications
game - usage of on-line games
surf - frequency of web-surfing
email - usage of e-mail applications
download - frequency of file downloading
forum - attendance at web-forums
socnet - usage of social networking services
xxx - traffic to pornographic websites
rapport("example", ius2008, var = "leisure")
This function is a wrapper around barchart
which operates only on factors with optional facet.
rp.barplot(x, facet = NULL, data = NULL, groups = FALSE, percent = FALSE, horizontal = TRUE, ...)
x
|
a factor variable |
facet
|
an optional categorical variable to make facets by |
data
|
an optional data frame from which the variables should be taken |
groups
|
see
|
percent
|
an option to show percentages (100
category) instead of number of cases. Handy with
|
horizontal
|
see
|
...
|
additional parameters to
|
rp.barplot(ius2008$game)
rp.barplot(ius2008$game, horizontal = FALSE)
rp.barplot(ius2008$game, facet = ius2008$gender)
rp.barplot(ius2008$game, facet = ius2008$dwell, horizontal = FALSE, layout = c(1,3))
rp.barplot(ius2008$game, facet = ius2008$gender, groups = TRUE)
with(ius2008, rp.barplot(game, facet = gender))
rp.barplot(gender, data = ius2008)
rp.barplot(dwell, gender, ius2008)
This function is a wrapper around bwplot
which operates only on numeric variables with optional facet.
rp.boxplot(x, y = NULL, facet = NULL, data = NULL, ...)
x
|
a numeric variable |
y
|
an optional factor variable |
facet
|
an optional categorical variable to make facets by |
data
|
an optional data frame from which the variables should be taken |
...
|
additional parameters to
|
rp.boxplot(ius2008$age)
rp.boxplot(ius2008$age, ius2008$gender)
rp.boxplot(ius2008$age, ius2008$dwell, facet = ius2008$gender)
with(ius2008, rp.boxplot(age, dwell, facet = gender))
rp.boxplot(age, dwell, data = ius2008)
rp.boxplot(age, dwell, gender, ius2008)
This function is a wrapper around densityplot
which operates only on numeric vectors with optional facet.
rp.densityplot(x, facet = NULL, data = NULL, ...)
x
|
a numeric variable |
facet
|
an optional categorical variable to make facets by |
data
|
an optional data frame from which the variables should be taken |
...
|
additional parameters to
|
rp.densityplot(ius2008$edu)
rp.densityplot(ius2008$edu, facet = ius2008$gender)
rp.densityplot(ius2008$edu, ius2008$dwell)
with(ius2008, rp.densityplot(edu, facet = gender))
rp.densityplot(edu, data = ius2008)
rp.densityplot(edu, gender, ius2008)
This function is a wrapper around dotplot
which operates only on factors with optional facet.
rp.dotplot(x, facet = NULL, data = NULL, groups = FALSE, horizontal = TRUE, ...)
x
|
a factor variable |
facet
|
an optional categorical variable to make facets by |
data
|
an optional data frame from which the variables should be taken |
groups
|
see
|
horizontal
|
see
|
...
|
additional parameters to
|
rp.dotplot(ius2008$game)
rp.dotplot(ius2008$game, horizontal = FALSE)
rp.dotplot(ius2008$game, facet = ius2008$dwell)
rp.dotplot(ius2008$dwell, facet = ius2008$gender, horizontal = FALSE)
rp.dotplot(ius2008$game, facet = ius2008$dwell, groups = TRUE)
with(ius2008, rp.dotplot(gender, facet = dwell))
rp.dotplot(game, data = ius2008)
rp.dotplot(dwell, gender, ius2008)
Internal function used by e.g. rp.hist.
rp.graph.check(x, facet = NULL, subset = NULL, ...)
x
|
a variable |
facet
|
if facet set |
subset
|
if subset set |
...
|
other parameters |
This function is a wrapper around histogram
which operates only on numeric vectors with optional facet.
rp.hist(x, facet = NULL, data = NULL, kernel.smooth = FALSE, ...)
x
|
a numeric variable |
facet
|
an optional categorical variable to make facets by |
data
|
an optional data frame from which the variables should be taken |
kernel.smooth
|
add kernel density plot? |
...
|
additional parameters to
|
rp.hist(ius2008$edu)
rp.hist(ius2008$edu, facet=ius2008$gender)
rp.hist(ius2008$edu, ius2008$dwell)
rp.hist(ius2008$edu, kernel.smooth=TRUE)
with(ius2008, rp.hist(edu, facet = gender))
rp.hist(edu, data = ius2008)
rp.hist(edu, gender, ius2008)
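Since rp.hist is described as a wrapper around lattice's histogram, the facet idea can be illustrated with a plain lattice call. This is only a sketch of the underlying mechanism on the built-in mtcars data; how rp.hist builds its call internally is not shown in this manual, so the formula below is an assumption about the general approach, not the package's code.

```r
library(lattice)  # recommended package that ships with most R installations

# A numeric variable conditioned on a categorical one: the "| factor(cyl)"
# part of the formula plays the role that the facet argument plays above.
p <- histogram(~ mpg | factor(cyl), data = mtcars, layout = c(3, 1))

# Lattice plots are objects; print(p) draws the plot on the active device.
class(p)
```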
This function is a wrapper around xyplot
with custom panel. Only numeric variables are accepted with optional facet.
rp.lineplot(x, y, facet = NULL, data = NULL, groups = NULL, ...)
x
|
a numeric variable |
y
|
a numeric variable |
facet
|
an optional categorical variable to make facets by |
groups
|
an optional categorical grouping variable |
data
|
an optional data frame from which the variables should be taken |
...
|
additional parameters to
|
a <- aggregate(wt~gear, mtcars, mean)
rp.lineplot(a$gear, a$wt)
rp.lineplot(gear, wt, data=a)
## lame demo:
rp.lineplot(1:length(mtcars$hp), mtcars$hp, facet=mtcars$cyl)
## advanced usage
rp.lineplot(partner, age, data = rp.desc('age', 'partner', fn = 'mean', data=ius2008)) ## TODO: fix....
rp.lineplot(partner, age, gender, data = rp.desc('age', c('gender', 'partner'), fn = 'mean', data=ius2008))
rp.lineplot(partner, age, groups = gender, data=rp.desc('age', c('gender', 'partner'), fn = 'mean', data = ius2008))
## Did you notice the nasty axis titles? Why not correct those? :)
df <- rp.desc('age', 'partner', fn = 'mean', data = ius2008)
lapply(names(df), function(x) rp.label(df[, x]) <<- x) # nasty solution!
rp.lineplot(partner, age, data = df)
df <- rp.desc('age', c('gender', 'partner'), fn = 'mean', data = ius2008)
lapply(names(df), function(x) rp.label(df[, x]) <<- x) # nasty solution!
rp.lineplot(partner, age, gender, data = df)
df <- rp.desc('age', c('gender', 'partner'), fn = 'mean', data = ius2008)
lapply(names(df), function(x) rp.label(df[, x]) <<- x) # nasty solution!
rp.lineplot(partner, age, groups = gender, data = df)
This function is a wrapper around qqmath
which operates only on a numeric variable with optional facet.
rp.qqplot(x, dist = qnorm, facet = NULL, data = NULL, ...)
x
|
a numeric variable |
dist
|
a theoretical distribution |
facet
|
an optional categorical variable to make facets by |
data
|
an optional data frame from which the variables should be taken |
...
|
additional parameters to
|
rp.qqplot(ius2008$age)
rp.qqplot(ius2008$age, qunif)
rp.qqplot(ius2008$age, qunif, facet = ius2008$gender)
with(ius2008, rp.qqplot(age))
rp.qqplot(age, data = ius2008)
rp.qqplot(age, facet = gender, data = ius2008)
rp.qqplot(age, qunif, gender, ius2008)
rp.qqplot(ius2008$age, panel = function(x) {panel.qqmath(x); panel.qqmathline(x, distribution = qnorm)} )
This function is a wrapper around xyplot
which operates only on numeric variables with optional facet.
rp.scatterplot(x, y, facet = NULL, data = NULL, ...)
x
|
a numeric variable |
y
|
a numeric variable |
facet
|
an optional categorical variable to make facets by |
data
|
an optional data frame from which the variables should be taken |
...
|
additional parameters to
|
rp.scatterplot(ius2008$edu, ius2008$age)
rp.scatterplot(ius2008$edu, ius2008$age, facet=ius2008$gender)
with(ius2008, rp.scatterplot(edu, age, facet = gender))
rp.scatterplot(edu, age, data=ius2008)
rp.scatterplot(edu, age, gender, ius2008)
Similar to the rle function, this function detects "runs" of adjacent integers, and returns a vector of run lengths and a list of the corresponding integer sequences.
adj.rle(x)
x
|
a numeric vector with |
a list with two elements: vector of run lengths, and another list of values corresponding to generated sequences' lengths.
See original thread for more details http://stackoverflow.com/a/8467446/457898 . Special thanks to Gabor Grothendieck for this one!
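The idea of "runs of adjacent integers" can be sketched in base R as follows. The function name adjacent_runs is hypothetical and the code is not taken from the package or from the linked thread; it only illustrates the kind of result described above.

```r
# Hypothetical stand-in for adj.rle(), written with base R only
adjacent_runs <- function(x) {
  # start a new group whenever the step between neighbours is not exactly 1
  group <- cumsum(c(TRUE, diff(x) != 1))
  values <- unname(split(x, group))
  # run lengths plus the integer sequences that make up each run
  list(lengths = lengths(values), values = values)
}

runs <- adjacent_runs(c(1L, 2L, 3L, 7L, 8L, 12L))
runs$lengths      # 3 2 1
runs$values[[2]]  # 7 8
```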
This function tests if given variable "appears" to be an integer. To qualify as such, two conditions need to be satisfied: it should be stored as numeric
object, and it should pass regular expression test if it consists only of digits.
alike.integer(x)
x
|
a numeric variable that is to be tested |
a logical value that indicates that tested variable "looks like" integer
Capitalises strings in provided character vector
capitalise(x)
x
|
a character vector to capitalise |
character vector with capitalised string elements
capitalise(c("foo", "bar")) # [1] "Foo" "Bar"
A simple wrapper for cat
function that appends newline to output.
catn(...)
...
|
arguments to be passed to
|
None (invisible NULL).
Checks if provided object is a boolean i.e. a length-one logical vector.
is.boolean(x)
x
|
an object to check |
a logical value indicating whether provided object is a boolean
is.boolean(TRUE) # [1] TRUE
# the following will work on most systems, unless you have tweaked global Rprofile
is.boolean(T) # [1] TRUE
is.boolean(1) # [1] FALSE
is.boolean(c(TRUE, FALSE)) # [1] FALSE
Rails-inspired helper that checks if vector values are "empty", i.e. if it's: NULL, zero-length, NA, NaN, FALSE, an empty string or 0. Note that unlike its native R is.<something> sibling functions, is.empty is vectorised (hence the "values").
is.empty(x, trim = TRUE, ...)
x
|
an object to check its emptiness |
trim
|
trim whitespace? (
|
...
|
additional arguments for
|
is.empty(NULL) # [1] TRUE
is.empty(c()) # [1] TRUE
is.empty(NA) # [1] TRUE
is.empty(NaN) # [1] TRUE
is.empty("") # [1] TRUE
is.empty(0) # [1] TRUE
is.empty(0.00) # [1] TRUE
is.empty(" ") # [1] TRUE
is.empty("foobar") # [1] FALSE
is.empty(" ", trim = FALSE) # [1] FALSE
# is.empty is vectorised!
all(is.empty(rep("", 10))) # [1] TRUE
all(is.empty(matrix(NA, 10, 10))) # [1] TRUE
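The rules listed above can be approximated in a few lines of base R. The sketch below uses a hypothetical name, is_empty_sketch, and is an assumption about how such a check might be written; it does not reproduce the package's implementation and ignores the extra arguments that is.empty accepts.

```r
is_empty_sketch <- function(x, trim = TRUE) {
  # NULL and zero-length vectors count as a single "empty" value
  if (length(x) == 0) return(TRUE)
  if (is.character(x)) {
    # optionally strip surrounding whitespace before comparing with ""
    if (trim) x <- gsub("^[[:space:]]+|[[:space:]]+$", "", x)
    return(is.na(x) | x == "")
  }
  if (is.logical(x)) return(is.na(x) | !x)
  # is.na() is TRUE for NaN as well as NA
  if (is.numeric(x)) return(is.na(x) | x == 0)
  rep(FALSE, length(x))
}

is_empty_sketch(c("", " ", "foobar"))       # TRUE TRUE FALSE
is_empty_sketch(" ", trim = FALSE)          # FALSE
all(is_empty_sketch(matrix(NA, 10, 10)))    # TRUE
```

The result has one element per input value, which is what "vectorised" means here.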
Checks if the provided object exists but its value is NULL.
is.exnull(x)
x
|
an object to check |
a logical value indicating whether the provided object exists but its value is NULL
is.exnull(1) # [1] FALSE
is.exnull("") # [1] FALSE
is.exnull(NULL) # [1] TRUE
Checks if provided object is a number, i.e. a length-one numeric vector.
is.number(x, integer = FALSE)
x
|
an object to check |
integer
|
logical: check if number is integer |
a logical value indicating whether provided object is a number
is.number(3) # [1] TRUE
is.number(3:4) # [1] FALSE
is.number("3") # [1] FALSE
is.number(NaN) # [1] TRUE
is.number(NA_integer_) # [1] TRUE
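The last two results may look surprising, so here is a minimal sketch of a "length-one numeric vector" check in base R. The name is_number_sketch is hypothetical and the integer argument is left out; the point is only that NaN and NA_integer_ are both stored as numeric, so a check based on storage type and length accepts them.

```r
# Hypothetical stand-in: numeric storage type and exactly one element
is_number_sketch <- function(x) is.numeric(x) && length(x) == 1

is_number_sketch(3)            # TRUE
is_number_sketch(3:4)          # FALSE, two elements
is_number_sketch("3")          # FALSE, character storage
is_number_sketch(NaN)          # TRUE, NaN is a double
is_number_sketch(NA_integer_)  # TRUE, a missing value of integer type
```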
Checks if provided object is a string i.e. a length-one character vector.
is.string(x)
x
|
an object to check |
a logical value indicating whether provided object is a string
is.string("foobar") # [1] TRUE
is.string(1) # [1] FALSE
is.string(c("foo", "bar")) # [1] FALSE
Combines message with sprintf, thus allowing string interpolated diagnostic messages.
messagef(s, ...)
s
|
a character vector of format strings |
...
|
values to be interpolated |
messagef("%.3f is not larger than %d and/or smaller than %d", pi, 10, 40)
This helper combines stop
function with sprintf
thus allowing string interpolated messages when execution is halted.
stopf(s, ...)
s
|
a character vector of format strings |
...
|
values to be interpolated |
a string containing message that follows execution termination
stopf("%.3f is not larger than %d and/or smaller than %d", pi, 10, 40)
Convert character vector to camelcase - capitalise first letter of each word.
tocamel(x, delim = "[^[:alnum:]]", upper = FALSE, sep = "", ...)
x
|
a character vector to be converted to camelcase |
delim
|
a string containing regular expression word delimiter |
upper
|
a logical value indicating if the first
letter of the first word should be capitalised (defaults
to
|
sep
|
a string to separate words |
...
|
additional arguments to be passed to
|
a character vector with strings put in camelcase
tocamel("foo.bar")
## [1] "fooBar"
tocamel("foo.bar", upper = TRUE)
## [1] "FooBar"
tocamel(c("foobar", "foo.bar", "camel_case", "a.b.c.d"))
## [1] "foobar" "fooBar" "camelCase" "aBCD"
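The conversion shown above can be reproduced with a single regular expression in base R. This is a sketch under the assumption that every run of non-alphanumeric characters is a delimiter; to_camel_sketch is a hypothetical name, and the delim and sep arguments of tocamel are not modelled.

```r
to_camel_sketch <- function(x, upper = FALSE) {
  # drop each run of delimiters and upper-case the character that follows it
  out <- gsub("[^[:alnum:]]+([[:alnum:]])", "\\U\\1", x, perl = TRUE)
  # optionally capitalise the very first letter as well
  if (upper) out <- sub("^([[:alpha:]])", "\\U\\1", out, perl = TRUE)
  out
}

to_camel_sketch(c("foobar", "foo.bar", "camel_case", "a.b.c.d"))
## [1] "foobar"    "fooBar"    "camelCase" "aBCD"
to_camel_sketch("foo.bar", upper = TRUE)
## [1] "FooBar"
```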
Removes leading and/or trailing space(s) from a character vector. By default, it removes both leading and trailing spaces.
trim.space(x, what = c("both", "leading", "trailing", "none"), space.regex = "[:space:]", ...)
x
|
a character vector which values need whitespace trimming |
what
|
which part of the string should be trimmed.
Defaults to
|
space.regex
|
a character value containing a regex that defines a space character |
...
|
additional arguments for
|
a character vector with (hopefully) trimmed spaces
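The four values of what map naturally onto four regular expressions. The following base R sketch (hypothetical name trim_space_sketch, fixed POSIX space class instead of a configurable space.regex) illustrates that mapping; it is not the package's code.

```r
trim_space_sketch <- function(x, what = c("both", "leading", "trailing", "none")) {
  what <- match.arg(what)  # the first choice, "both", is the default
  pattern <- switch(what,
    both     = "^[[:space:]]+|[[:space:]]+$",
    leading  = "^[[:space:]]+",
    trailing = "[[:space:]]+$",
    none     = NULL)
  if (is.null(pattern)) x else gsub(pattern, "", x)
}

trim_space_sketch("  a b  ")             # "a b"
trim_space_sketch("  a b  ", "leading")  # "a b  "
```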
A simple wrapper for gsub that replaces all patterns from pattern argument with ones in replacement over vector provided in argument x.
vgsub(pattern, replacement, x, ...)
pattern
|
see eponymous argument for
|
replacement
|
see eponymous argument for
|
x
|
see eponymous argument for
|
...
|
additional arguments for
|
a character vector with string replacements
See original thread for more details http://stackoverflow.com/a/6954308/457898 . Special thanks to user Jean-Robert for this one!
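A vectorised gsub of this kind can be sketched as a loop that applies each pattern/replacement pair in turn. The name vgsub_sketch is hypothetical, and the sketch assumes pattern and replacement have the same length; the package and the linked thread may handle this differently.

```r
vgsub_sketch <- function(pattern, replacement, x, ...) {
  # apply the i-th replacement for the i-th pattern, one pair after another
  for (i in seq_along(pattern)) {
    x <- gsub(pattern[i], replacement[i], x, ...)
  }
  x
}

vgsub_sketch(c("a", "b"), c("1", "2"), c("abc", "bab"))
## [1] "12c" "212"
```

Because the pairs are applied sequentially, a later pattern also sees the output of earlier replacements.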
Combines warning
with sprintf
thus allowing string interpolated warnings.
warningf(s, ...)
s
|
a character vector of format strings |
...
|
values to be interpolated |
warningf("%.3f is not larger than %d and/or smaller than %d", pi, 10, 40)
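The combination of warning and sprintf described here is essentially a one-liner. The sketch below uses a hypothetical name and the call. = FALSE choice is an assumption; it is shown only to make the interpolation step concrete.

```r
# Format the message with sprintf() first, then raise it as a warning
warningf_sketch <- function(s, ...) warning(sprintf(s, ...), call. = FALSE)

# Capture the warning so its text can be inspected
res <- tryCatch(
  warningf_sketch("%.3f is not larger than %d", pi, 10L),
  warning = function(w) conditionMessage(w)
)
res
## [1] "3.142 is not larger than 10"
```

The same pattern applies to stop and message for the stopf and messagef helpers.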
Checks if object has "tabular" structure (not to be confused with table) - in this particular case, that means matrix and data.frame objects only.
is.tabular(x)
x
|
an object to be checked for "tabular" format |
a logical value indicating that provided object has tabular structure
is.tabular(HairEyeColor[, , 1]) # [1] TRUE
is.tabular(mtcars) # [1] TRUE
is.tabular(table(mtcars$cyl)) # [1] FALSE
is.tabular(rnorm(100)) # [1] FALSE
is.tabular(LETTERS) # [1] FALSE
is.tabular(pi) # [1] FALSE
From rapport's point of view, a variable is a non-NULL atomic vector that has no dimension attribute (see dim for details). This approach bypasses factor issues with is.vector, and also eliminates multidimensional vectors, such as matrices and arrays.
is.variable(x)
x
|
an object to be checked for "variable" format |
a logical value indicating that provided object is a "variable"
is.variable(rnorm(100)) # [1] TRUE
is.variable(LETTERS) # [1] TRUE
is.variable(NULL) # [1] FALSE
is.variable(mtcars) # [1] FALSE
is.variable(HairEyeColor[, , 1]) # [1] FALSE
is.variable(list()) # [1] FALSE
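The definition above translates almost word for word into base R. The following is a sketch with a hypothetical name, not the package source; it also shows the factor case that motivates avoiding is.vector.

```r
# non-NULL, atomic, and without a dim attribute
is_variable_sketch <- function(x) {
  !is.null(x) && is.atomic(x) && is.null(dim(x))
}

f <- factor(c("a", "b"))
is.vector(f)                        # FALSE, because of the factor attributes
is_variable_sketch(f)               # TRUE
is_variable_sketch(matrix(1:4, 2))  # FALSE, has dimensions
is_variable_sketch(list())          # FALSE, not atomic
```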
Takes multiple character arguments as left and right-hand side arguments of a formula, and concatenates them in a single string.
fml(left, right, join.left = " + ", join.right = " + ")
left
|
a string with left-hand side formula argument |
right
|
a character vector with right-hand side formula arguments |
join.left
|
concatenation string for elements of
character vector specified in
|
join.right
|
concatenation string for elements of
character vector specified in
|
fml("hp", c("am", "cyl")) # "hp ~ am + cyl"
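The concatenation can be sketched with two nested paste calls. The name fml_sketch is hypothetical; the argument names mirror the signature above, and the result is a plain string that can be handed to as.formula when an actual formula object is needed.

```r
fml_sketch <- function(left, right, join.left = " + ", join.right = " + ") {
  # collapse each side separately, then join the sides with a tilde
  paste(paste(left, collapse = join.left), "~", paste(right, collapse = join.right))
}

s <- fml_sketch("hp", c("am", "cyl"))
s                      # "hp ~ am + cyl"
lm(as.formula(s), data = mtcars)
```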
Appends a percent sign to provided numerical value. Rounding is carried out according to value passed in digits formal argument (defaults to value specified in panderOptions('digits')).
pct(x, digits = panderOptions("digits"), type = c("percent", "%", "proportion"), check.value = TRUE)
x
|
a numeric value that is to be rendered to percent |
digits
|
an integer value indicating number of decimal places |
type
|
a character value indicating whether percent or proportion value was provided (partial match is allowed) |
check.value
|
perform a sanity check to see if
provided numeric value is correct (defaults to
|
a character value with formatted percent
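A rough base R equivalent is shown below. It is a sketch with a hypothetical name: the default of one decimal place stands in for panderOptions('digits'), the sanity check behind check.value is omitted, and the exact output format of pct (for example spacing before the sign) is not documented here, so the format is an assumption.

```r
pct_sketch <- function(x, digits = 1, type = c("percent", "proportion")) {
  type <- match.arg(type)
  # proportions (0..1) are scaled to percents first
  if (type == "proportion") x <- x * 100
  # build a format such as "%.1f%%" and apply it
  sprintf(paste0("%.", digits, "f%%"), x)
}

pct_sketch(12.345)                      # "12.3%"
pct_sketch(0.5, type = "proportion")    # "50.0%"
```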
Aggregate table of descriptives according to functions provided in fn argument. This function follows melt/cast approach used in reshape package. Variable names specified in measure.vars argument are treated as measure.vars, while the ones in id.vars are treated as id.vars (see melt.data.frame for details). Its other formal arguments match the corresponding arguments of the cast function. Some post-processing is done after reshaping, in order to get pretty row and column labels.
rp.desc(measure.vars, id.vars = NULL, fn, data = NULL, na.rm = TRUE, margins = NULL, subset = TRUE, fill = NA, add.missing = FALSE, total.name = "Total", varcol.name = "Variable", use.labels = getOption("rp.use.labels"), remove.duplicate = TRUE)
measure.vars
|
either a character vector with
variable names from
|
id.vars
|
same rules apply as in
|
fn
|
a list with functions or a character vector with function names |
data
|
a
|
na.rm
|
a logical value indicating whether
|
margins
|
should margins be included? (see
documentation for eponymous argument in
|
subset
|
a logical vector to subset the data before aggregating |
fill
|
value to replace missing level combinations
(see documentation for eponymous argument in
|
add.missing
|
show missing level combinations |
total.name
|
a character string with name for "grand" margin (defaults to "Total") |
varcol.name
|
character string for column that
contains summarised variables (defaults to
|
use.labels
|
use labels instead of variable names in
table header (handle with care, especially if you have
lengthy labels). Defaults to value specified in
|
remove.duplicate
|
should name/label of the variable
provided in
|
a
data.frame
with aggregated data
rp.desc("cyl", "am", c(mean, sd), mtcars, margins = TRUE)
## c
rp.desc("age", c("gender", "student"), c("Average" = mean, "Deviation" = sd), ius2008, remove.duplicate = FALSE)
Display frequency table with counts, percentage, and cumulatives.
rp.freq(f.vars, data, na.rm = TRUE, include.na = FALSE, drop.unused.levels = FALSE, count = TRUE, pct = TRUE, cumul.count = TRUE, cumul.pct = TRUE, total.name = "Total", reorder = FALSE)
f.vars
|
a character vector with variable names |
data
|
a
|
na.rm
|
should missing values be removed? |
include.na
|
should missing values be included in frequency table? |
drop.unused.levels
|
should empty level combinations be left out |
count
|
show frequencies? |
pct
|
show percentage? |
cumul.count
|
show cumulative frequencies? |
cumul.pct
|
show cumulative percentage? |
total.name
|
a string containing footer label (defaults to "Total") |
reorder
|
reorder the table based on frequencies? |
a
data.frame
with a frequency table
rp.freq(c("am", "cyl", "vs"), mtcars)
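For a single variable, the four kinds of columns (counts, percentages and their cumulatives) can be derived from table and cumsum. The sketch below is an illustration with a hypothetical function name and made-up column names; rp.freq itself handles several variables at once and has its own labels.

```r
freq_sketch <- function(x) {
  counts <- table(x)          # missing values are dropped by default
  n <- as.vector(counts)
  data.frame(
    value     = names(counts),
    N         = n,
    pct       = n / sum(n) * 100,
    cumul.N   = cumsum(n),
    cumul.pct = cumsum(n) / sum(n) * 100
  )
}

tab <- freq_sketch(mtcars$cyl)
tab$N        # 11 7 14
tab$cumul.N  # 11 18 32
```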
Calculates interquartile range of given variable. See rp.univar
for details.
rp.iqr(...)
...
|
parameters to be passed to
|
a numeric value with variable's interquartile range
Calculates kurtosis of given variable. See rp.univar
for details.
rp.kurtosis(...)
...
|
parameters to be passed to
|
a numeric value with variable's kurtosis
Returns the maximum of all values in a vector by passing max as fn argument to rp.univar function.
rp.max(...)
...
|
parameters to be passed to
|
a numeric value with maximum value
Calculates mean of given variable by passing mean as fn argument to rp.univar function.
rp.mean(...)
...
|
parameters to be passed to
|
a numeric value with variable's mean
Calculates median of given variable. See rp.univar
for details.
rp.median(...)
...
|
parameters to be passed to
|
a numeric value with variable's median
Returns the minimum of all values in a vector by passing min as fn argument to rp.univar function.
rp.min(...)
...
|
parameters to be passed to
|
a numeric value with minimum value
Returns a number of missing ( NA
) values in a variable. This is a wrapper around rp.univar
function with anonymous function passed to count number of NA
elements in a variable.
rp.missing(...)
...
|
parameters to be passed to
|
a numeric value with number of missing vector elements
Calculates percentage of cases for provided variable and criteria specified in subset argument. Function accepts numeric, factor and logical variables for x parameter. If numeric and/or factor is provided, subsetting can be achieved via subset argument. Depending on value of na.rm argument, either valid (na.rm = TRUE) or all cases (na.rm = FALSE) are taken into account. By passing logical variable to x, a sum of (TRUE) elements is calculated instead, and valid percents are used (NA are excluded).
rp.percent(x, subset = NULL, na.rm = TRUE, pct = FALSE, ...)
x
|
a numeric variable to be summarised |
subset
|
an expression that evaluates to logical
vector (defaults to
|
na.rm
|
should missing values be |
pct
|
print percent string too? |
...
|
additional arguments for
|
a numeric or string depending on the value of
pct
set.seed(0)
x <- sample(5, 100, replace = TRUE)
rp.percent(x > 2)
Calculates difference between the largest and the smallest value in a vector. See rp.univar
for details.
rp.range(...)
...
|
parameters to be passed to
|
a numeric value with calculated range
Calculates standard deviation of given variable. See rp.univar
for details.
rp.sd(...)
...
|
parameters to be passed to
|
a numeric value with variable's standard deviation
Calculates standard error of mean for given variable. See rp.univar
for details.
rp.se.mean(...)
...
|
parameters to be passed to
|
a numeric value with standard error of mean
Calculates skewness of given variable. See rp.univar
for details.
rp.skewness(...)
...
|
parameters to be passed to
|
a numeric value with variable's skewness
Returns the sum of variable's elements, by passing sum
as fn
argument to rp.univar
function.
rp.sum(...)
...
|
parameters to be passed to
|
a numeric value with sum of vector elements
This function operates only on vectors or their subsets, by calculating a descriptive statistic specified in fn
argument.
rp.univar(x, subset = NULL, fn, na.rm = TRUE, ...)
x
|
a numeric variable to be summarised |
subset
|
an expression that evaluates to logical
vector (defaults to
|
fn
|
a function or a function name to be applied on a variable or its subset |
na.rm
|
a logical value indicating whether
|
...
|
additional arguments for function specified in
|
a numeric
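The pattern that rp.univar and its rp.* wrappers follow (subset, drop missing values, apply one function) can be sketched in base R. The name univar_sketch is hypothetical, and unlike the documented signature it takes subset as an already evaluated logical vector rather than an expression; that simplification is an assumption made to keep the sketch short.

```r
univar_sketch <- function(x, subset = NULL, fn, na.rm = TRUE, ...) {
  if (!is.null(subset)) x <- x[subset]  # keep only the selected cases
  if (na.rm) x <- x[!is.na(x)]          # drop missing values up front
  match.fun(fn)(x, ...)                 # fn may be a function or its name
}

# mean horsepower of four-cylinder cars
univar_sketch(mtcars$hp, mtcars$cyl == 4, "mean")

# a wrapper in the spirit of rp.se.mean: standard error of the mean
se_mean_sketch <- function(...) {
  univar_sketch(..., fn = function(v) sd(v) / sqrt(length(v)))
}
se_mean_sketch(mtcars$hp)
```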
Returns a number of valid (non- NA
) values in a variable. This is a wrapper around rp.univar
function with length
function passed in fn
argument, but with missing values previously removed. However, it's not possible to cancel NA
omission with this function (doing so will yield error).
rp.valid(...)
...
|
parameters to be passed to
|
a numeric value with number of valid (non-NA) vector elements
Calculates variance of given variable. See rp.univar
for details.
rp.var(...)
...
|
parameters to be passed to
|
a numeric value with variable's variance
This function uses htest.short to extract the statistic and p-value from htest-classed objects. The main advantage of using htest is that it's vectorised, and can accept multiple methods.
Default parameters are read from options: 'rp.use.labels'.
htest(x, ..., use.labels = getOption("rp.use.labels"), use.method.names = TRUE, colnames = c("Method", "Statistic", "p-value"))
x
|
arguments to be passed to function specified in
|
...
|
additional arguments for function specified in
|
use.labels
|
a logical value indicating whether
variable labels should be placed in row names. If set to
|
use.method.names
|
use the string provided in
|
colnames
|
a character string containing column names |
a
data.frame
with applied tests in rows, and their
results (statistic and p-value) in columns
library(nortest)
htest(rnorm(100), shapiro.test)
htest(rnorm(100), lillie.test, ad.test, shapiro.test)
htest(mtcars, lillie.test)
htest(mtcars, lillie.test, ad.test, shapiro.test)
Extract the value of the statistic and its p-value from an htest object.
htest.short(x)
x
|
|
named numeric vector with the value of statistic and its p-value
htest.short(shapiro.test(rnorm(100)))
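Objects of class htest are lists, so the two values can be pulled out by name. The sketch below uses a hypothetical function name and an assumed element name "p" for the p-value; the names used by htest.short itself are not stated in this manual.

```r
htest_short_sketch <- function(x) {
  stopifnot(inherits(x, "htest"))
  # x$statistic is already a named number (e.g. W for Shapiro-Wilk)
  c(x$statistic, p = x$p.value)
}

set.seed(1)
res <- htest_short_sketch(shapiro.test(rnorm(100)))
names(res)  # "W" "p"
```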
Calculates kurtosis coefficient for given variable (see is.variable
), matrix
or a data.frame
.
kurtosis(x, na.rm = FALSE)
x
|
a
|
na.rm
|
should
|
Tenjovic, L. (2000). Statistika u psihologiji - prirucnik. Centar za primenjenu psihologiju.
set.seed(0)
x <- rnorm(100)
kurtosis(x)
kurtosis(matrix(x, 10))
kurtosis(mtcars)
rm(x)
Computes Goodman and Kruskal's lambda for given table.
lambda.test(table, direction = 0)
table
|
a
|
direction
|
numeric value of
|
computed lambda value(s) for row/col of given table
Goodman, L.A., Kruskal, W.H. (1954). Measures of association for cross classifications. Part I. Journal of the American Statistical Association, 49, 732–764.
## quick example
x <- data.frame(x = c(5, 4, 3), y = c(9, 8, 7), z = c(7, 11, 22), zz = c(1, 15, 8))
lambda.test(x) # 0.1 and 0.18333
lambda.test(t(x)) # 0.18333 and 0.1
## historical data (see the references above: p. 744)
men.hair.color <- data.frame(
b1 = c(1768, 946, 115),
b2 = c(807, 1387, 438),
b3 = c(189, 746, 288),
b4 = c(47, 53, 16)
)
row.names(men.hair.color) <- paste0('a', 1:3)
lambda.test(men.hair.color)
lambda.test(t(men.hair.color))
## some examples on mtcars
lambda.test(table(mtcars$am, mtcars$gear))
lambda.test(table(mtcars$gear, mtcars$am))
lambda.test(table(mtcars$am, mtcars$gear), 1)
lambda.test(table(mtcars$am, mtcars$gear), 2)
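The two numbers in the quick example can be checked by hand with the textbook definition of Goodman and Kruskal's lambda: the reduction in prediction error when the modal category of one variable is chosen within each category of the other. The sketch below is a direct transcription of that definition with hypothetical names; it is not the package code and does not model the direction argument.

```r
lambda_sketch <- function(tab) {
  tab <- as.matrix(tab)
  n <- sum(tab)
  # m: predicted variable in rows, predictor in columns
  lambda <- function(m) {
    best.overall <- max(rowSums(m))       # hits when always guessing the modal row
    best.within  <- sum(apply(m, 2, max)) # hits when guessing the modal row per column
    (best.within - best.overall) / (n - best.overall)
  }
  c(rows.from.cols = lambda(tab), cols.from.rows = lambda(t(tab)))
}

x <- data.frame(x = c(5, 4, 3), y = c(9, 8, 7), z = c(7, 11, 22), zz = c(1, 15, 8))
res <- lambda_sketch(x)
round(res, 5)
## rows.from.cols cols.from.rows
##        0.18333        0.10000
```

For this table the column maxima sum to 51 and the largest row total is 40 out of 100 cases, which gives (51 - 40) / (100 - 40); the transposed case gives (46 - 40) / (100 - 40).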
A simple test for outliers. This function returns all extreme values (if any) found in the specified vector.
rp.outlier(x)
x
|
a numeric vector of values |
vector of outlier values
Credit goes to PaulHurleyuk: http://stackoverflow.com/a/1444548/564164
rp.outlier(mtcars$hp)
rp.outlier(c(rep(1,100), 200))
rp.outlier(c(rep(1,100), 200,201))
Calculates skewness coefficient for given variable (see is.variable
), matrix
or a data.frame
.
skewness(x, na.rm = FALSE)
x
|
a
|
na.rm
|
should
|
Tenjovic, L. (2000). Statistika u psihologiji - prirucnik. Centar za primenjenu psihologiju.
set.seed(0)
x <- rnorm(100)
skewness(x)
skewness(matrix(x, 10))
skewness(mtcars)
rm(x)
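Several definitions of skewness and kurtosis are in use, and this manual does not say which one the package follows. As a point of comparison only, here is a sketch of the common moment-based versions (third and fourth standardised central moments, with 3 subtracted from kurtosis). The function name is hypothetical and the results may differ from skewness() and kurtosis() above by a small-sample correction or by the constant 3.

```r
moment_shape_sketch <- function(x, na.rm = FALSE) {
  if (na.rm) x <- x[!is.na(x)]
  d  <- x - mean(x)
  m2 <- mean(d^2)  # second central moment (population variance)
  c(skewness = mean(d^3) / m2^1.5,
    kurtosis = mean(d^4) / m2^2 - 3)  # "excess" kurtosis
}

moment_shape_sketch(c(1, 2, 3, 4, 5))
## skewness kurtosis
##      0.0     -1.3
```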
Converts template inputs to character vector with YAML strings.
## S3 method for class 'rp.inputs'
as.character(x, ...)
x
|
template inputs object |
...
|
ignored |
Converts template metadata to character vector with YAML strings.
## S3 method for class 'rp.meta'
as.character(x, ...)
x
|
template metadata object |
...
|
ignored |
Checks the class of an input value.
check.input.value.class(value, class = c("character", "complex", "factor", "integer", "logical", "numeric", "raw"), input.name = NULL)
value
|
input value |
class
|
input class (defaults to
|
input.name
|
input name (used in messages) |
A bit misleading title/function name - it validates input values, according to rules set in general input attributes ( length
) or class-specific ones ( nchar
, nlevels
or limit
).
check.input.value(input, value = NULL, attribute.name = c("length", "nchar", "nlevels", "limit"))
input
|
input item |
value
|
input value, either template-defined, or set by the user |
attribute.name
|
input attributes containing
validation rules (defaults to
|
Checks for warnings and errors in report chunks.
check.report.chunks(rp, what = c("errors", "warnings", "messages"))
rp
|
|
what
|
what fields to check. defaults to all |
Throw error
check.tpl(txt, open.tag = get.tags("header.open"), close.tag = get.tags("header.close"), ...)
txt
|
character vector with template contents |
open.tag
|
opening tag regexp |
close.tag
|
closing tag regexp |
...
|
additional params for tag matching (see
|
Check if template metadata field matches provided format, and return matched value in a list.
extract.meta(x, title, regex, short = NULL, trim.white = TRUE, mandatory = TRUE, default.value = NULL, field.length = 1000, ...)
x
|
a string containing template metadata |
title
|
a string containing metadata field title (can be regex-powered) |
regex
|
a string with regular expression to match field value |
short
|
a string with a short name for given metadata field |
trim.white
|
a logical value indicating whether trailing and leading spaces of the given string should be removed before extraction |
mandatory
|
a logical value indicating required field |
default.value
|
fallback to this value if non-mandatory field is not found/malformed |
field.length
|
maximum number of field characters (defaults to 1000) |
...
|
additional parameters for
|
a list with matched content, or
NULL
if the field
is not required
rapport:::extract.meta("Name: John Smith", "Name", "[[:alpha:]]+( [[:alpha:]]+)?")
## $name
## [1] "John Smith"
rapport:::extract.meta("Name: John", "Name", "[[:alpha:]]+( [[:alpha:]]+)?")
## $name
## [1] "John"
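The matching step in these examples can be imitated with regexec and regmatches. The sketch below is a guess at the general mechanism ("Title: value", value validated by a regular expression) with a hypothetical function name; it ignores short, mandatory, default.value and field.length, and it is not the package's internal code.

```r
extract_field_sketch <- function(x, title, regex) {
  # whole line must be "<title>: <value>", with <value> matching regex
  pattern <- paste0("^", title, ":[[:space:]]*(", regex, ")[[:space:]]*$")
  m <- regmatches(x, regexec(pattern, x))[[1]]
  if (length(m) == 0) return(NULL)      # no match at all
  # m[1] is the full match, m[2] the captured field value
  setNames(list(m[2]), tolower(title))
}

res <- extract_field_sketch("Name: John Smith", "Name", "[[:alpha:]]+( [[:alpha:]]+)?")
res
## $name
## [1] "John Smith"
```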
Returns report tag values (usually regexes): either user-defined, or the default ones.
Default parameters are read from options: 'header.open', 'header.close', 'comment.open', 'comment.close'.
get.tags(tag.type = c("all", "header.open", "header.close", "comment.open", "comment.close"), preset = c("user", "default"))
tag.type
|
a character value with tag value name |
preset
|
a character value specifying which preset to return |
either a list (default) or a character value with tag regexes
get.tags() # same as 'get.tags("all")'
get.tags("header.open")
Checks and returns input description.
guess.input.description(description)
description
|
a character string containing input description |
Checks and returns input label.
guess.input.label(label)
label
|
a character string containing input label |
From v. 0.51, one or more characters that are not a newline should do the trick. Note that white space will be trimmed from both ends of the resulting string.
guess.input.name(name)
name
|
a character value with input name |
Checks and returns valid input from YAML input definition.
guess.input(input)
input
|
a named list containing input definition |
Guess deprecated input length.
guess.old.input.length(x, input.type)
x
|
a character string containing input length definition |
input.type
|
a character string containing input type |
Checks type of template input, based on provided string. If input definition is syntactically correct, a list is returned, containing input type, size limits, and default value (for CSV options and boolean types only).
guess.old.input.type(x)
x
|
a character string containing input definition |
Checks if provided string is a valid ATX-style pandoc heading.
is.heading(x)
x
|
a string to test for pandoc heading format |
a logical value indicating the string is (not) a pandoc heading
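An ATX-style heading is a line that starts with one to six hash characters followed by the heading text. A check along those lines could look like the sketch below; the regular expression is an assumption (it requires whitespace after the hashes, as pandoc does by default) and may be stricter or looser than the one is.heading actually uses.

```r
# Hypothetical stand-in: 1 to 6 "#" characters, whitespace, then some text
is_heading_sketch <- function(x) {
  grepl("^#{1,6}[[:space:]]+[^[:space:]]", x)
}

is_heading_sketch("# Title")        # TRUE
is_heading_sketch("### Subtitle")   # TRUE
is_heading_sketch("Title")          # FALSE
is_heading_sketch("#Title")         # FALSE under the whitespace assumption
```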
Checks if provided R object is of rapport
class.
is.rapport(x)
x
|
any R object to check |
a logical value indicating whether provided object is a
rapport
object
Checks if provided R object is a rapport
heading element.
is.rp.heading(x)
x
|
any R object to check |
a logical value indicating whether provided object is a
rp.heading
object
Default print method for rapport
class objects that shows evaluated report contents.
## S3 method for class 'rapport'
print(x, ...)
x
|
any "rapport" class object |
...
|
ignored |
rapport('example', data = mtcars, var='hp')
Prints out the contents of template header (both metadata and inputs) in human-readable format, so you can get insight about the template requirements.
## S3 method for class 'rp.info'
print(x, ...)
x
|
object of class rp.info |
...
|
ignored |
Prints out the contents of template inputs in human-readable format.
## S3 method for class 'rp.inputs'
print(x, ...)
x
|
object of class rp.inputs |
...
|
ignored |
Prints out the contents of template metadata in human-readable format.
## S3 method for class 'rp.meta'
print(x, ...)
x
|
object of class rp.meta |
...
|
ignored |
Removes comments from the provided character vector.
Default parameters are read from options: 'comment.open', 'comment.close'.
purge.comments(x, comment.open = get.tags("comment.open"), comment.close = get.tags("comment.close"))
x
|
a character string to remove comments from |
comment.open
|
a string containing opening tag |
comment.close
|
a string containing closing tag |
a string with pandoc comments removed
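Comment removal between an opening and a closing tag can be sketched with a single non-greedy regular expression. The function name purge_comments_sketch and the HTML-style default tags are assumptions made for illustration; the tags actually used by rapport come from get.tags().

```r
# Hypothetical sketch; default tags are assumed HTML-style comment markers.
purge_comments_sketch <- function(x, comment.open = "<!--", comment.close = "-->") {
  # \Q...\E quotes the tags literally; (?s) lets '.' span line breaks;
  # '.*?' is non-greedy so that each comment is removed separately
  pattern <- paste0("(?s)\\Q", comment.open, "\\E.*?\\Q", comment.close, "\\E")
  gsub(pattern, "", x, perl = TRUE)
}

purge_comments_sketch("text <!-- hidden note --> more text")
```

Note that the text surrounding a removed comment is left untouched, so whitespace on both sides of the comment is kept.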
This is a simple wrapper around rapport and tpl.export. Basically it works like rapport, but the returned object is exported in one go.
rapport.docx(...)
...
|
parameters passed directly to rapport |
This is a simple wrapper around rapport and tpl.export. Basically it works like rapport, but the returned object is exported in one go.
rapport.html(...)
...
|
parameters passed directly to rapport |
This is a simple wrapper around rapport and tpl.export. Basically it works like rapport, but the returned object is exported in one go.
rapport.odt(...)
...
|
parameters passed directly to rapport |
This is a simple wrapper around rapport and tpl.export. Basically it works like rapport, but the returned object is exported in one go.
rapport.pdf(...)
...
|
parameters passed directly to rapport |
This is the central function in the rapport package, and hence eponymous. In the following lines we'll use rapport to denote the function, not the package. rapport requires a template file, while the dataset (data argument) can be optional, depending on the value of the Data required field in the template header. Template inputs are matched with the ... argument, and should be provided in x = value format, where x matches the input name and value, wait for it... the input value! See tpl.inputs for more details on template inputs.
Default parameters are read from evalsOptions() and the following options: 'rp.file.name', 'rp.file.path'.
rapport(fp, data = NULL, ..., env = new.env(), reproducible = FALSE, header.levels.offset = 0, graph.output = evalsOptions("graph.output"), file.name = getOption("rp.file.name"), file.path = getOption("rp.file.path"), graph.width = evalsOptions("width"), graph.height = evalsOptions("height"), graph.res = evalsOptions("res"), graph.hi.res = evalsOptions("hi.res"), graph.replay = evalsOptions("graph.recordplot"))
fp
|
a template file pointer (see
|
data
|
a
|
...
|
matches template inputs in format 'key = "value"' |
env
|
an environment where template commands will be evaluated (defaults to new.env()) |
reproducible
|
a logical value indicating if the
call and data should be stored in template object, thus
making it reproducible (see
|
header.levels.offset
|
number added to header levels (handy when using nested templates) |
file.name
|
set the file name of saved plots and
exported documents. A simple character string might be
provided where
|
file.path
|
path of a directory where to store generated images and exported reports |
graph.output
|
the required file format of saved plots (optional) |
graph.width
|
the required width of saved plots (optional) |
graph.height
|
the required height of saved plots (optional) |
graph.res
|
the required nominal resolution in ppi of saved plots (optional) |
graph.hi.res
|
logical value indicating whether high resolution (1280x~1280) images should also be generated |
graph.replay
|
logical value indicating if plots
need to be recorded for later replay (eg. while
|
a list with rapport class.
rapport('Example', ius2008, v = "leisure")
rapport('Descriptives', ius2008, var = "leisure")
## generating high resolution images also
rapport('Example', ius2008, v = "leisure", graph.hi.res = TRUE)
rapport.html('NormalityTest', ius2008, var = "leisure", graph.hi.res = TRUE)
## generating only high resolution image
rapport('Example', ius2008, v = "leisure", graph.width = 1280, graph.height = 1280)
## nested templates cannot get custom setting, use custom rapport option:
options('graph.hi.res' = TRUE)
rapport('AnalyzeWizard.tpl', data=ius2008, variables=c('edu', 'game'))
This function returns the character value previously stored in the variable's label attribute. If none is found and the fallback argument is set to TRUE (default), the function returns the object's name (retrieved by deparse(substitute(x))); otherwise NA is returned with a warning notice.
rp.label(x, fallback = TRUE, simplify = TRUE)
x
|
an R object to extract labels from |
fallback
|
a logical value indicating if labels should fallback to object name(s) |
simplify
|
coerce results to a vector (defaults to TRUE) |
a character vector with variable's label(s)
x <- rnorm(100)
rp.label(x) # returns "x"
rp.label(x, FALSE) # returns NA and issues a warning
rp.label(mtcars$hp) <- "Horsepower"
rp.label(mtcars) # returns "Horsepower" instead of "hp"
rp.label(mtcars, FALSE) # returns NA where no labels are found
rp.label(sleep, FALSE) # returns NA for each variable and issues a warning
This function sets a label to a variable, by storing a character string to its label attribute.
rp.label(var) <- value
var
|
a variable (see
|
value
|
a character value that is to be set as variable label |
rp.label(mtcars$mpg) <- "fuel consumption"
x <- rnorm(100); ( rp.label(x) <- "pseudo-random normal variable" )
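The mechanism behind both label functions, storing a string in an attribute and falling back to the object's name, can be sketched in base R. The names get_label and set_label<- are hypothetical, and the sketch handles only single objects, not data frames or lists as rp.label does.

```r
# Hypothetical getter: read the "label" attribute, optionally fall back to the name.
get_label <- function(x, fallback = TRUE) {
  lbl <- attr(x, "label", exact = TRUE)
  if (!is.null(lbl)) return(lbl)
  if (fallback) return(paste(deparse(substitute(x)), collapse = ""))
  warning("no label found")
  NA_character_
}

# Hypothetical replacement function: store a single string as the "label" attribute.
`set_label<-` <- function(x, value) {
  stopifnot(is.character(value), length(value) == 1)
  attr(x, "label") <- value
  x
}

v <- 1:3
get_label(v)                      # falls back to the object name
set_label(v) <- "Three integers"
get_label(v)                      # now returns the stored label
```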
This function returns the character value previously stored in the variable's name attribute. If none is found, the function defaults to the object's name.
rp.name(x)
x
|
an R (atomic or data.frame/list) object to extract names from |
a character value with variable's name
rp.name(mtcars$am)
x <- 1:10; rp.name(x)
Returns contents of the template body.
tpl.body(fp, htag = get.tags("header.close"), ...)
fp
|
a template file pointer (see
|
htag
|
a string with closing header tag |
...
|
additional arguments to be passed to
|
a character vector with template body contents
Checks if the examples of the given template can be run without any error.
If everything went fine, you get a list in which success equals TRUE; otherwise success is FALSE and an additional message is returned.
tpl.check(fp)
fp
|
a character vector containing template name (".tpl" extension is optional), file path or a text to be split by line breaks |
tpl.check('example')
Displays template examples defined in the Example section. Handy to check out what a template does and what it looks like once it's rendered. If multiple examples are available and the index argument is NULL, you will be prompted for input. If only one example is available in the header, the user is not prompted, and the given template is evaluated automatically. At any time you can provide an integer vector with example indices to the index argument, and the specified examples will be evaluated without prompting, thus returning a list of rapport objects. Example output can be easily exported to various formats (HTML, ODT, etc.) - check out the documentation for tpl.export for more info.
tpl.example(fp, index = NULL, env = .GlobalEnv)
fp
|
a template file pointer (see
|
index
|
a numeric vector indicating the example index - meaningful only for templates with multiple examples. Accepts vector of integers to match IDs of template example. Using 'all' (character string) as index will return all examples. |
env
|
an environment where example will be evaluated (defaults to .GlobalEnv) |
tpl.example('Example')
tpl.example('Example', 1:2)
tpl.example('Example', 'all')
tpl.example('Crosstable')
tpl.export(tpl.example('Crosstable'))
This function exports rapport class objects to various formats based on the ascii package.
By default this function tries to export the report to HTML with pandoc. Some default styles are applied. If you do not like those default settings, use your own options.
Default parameters are read from options: 'tpl.user'
Please be sure to set the 'tpl.user' option with options() to get your name in the head of your generated reports!
tpl.export(rp = NULL, file, append = FALSE, create = TRUE, open = TRUE, date = pander.return(Sys.time()), description = TRUE, format = "html", options = "", logo = TRUE)
rp
|
a rapport class object or list of rapport class objects |
file
|
filename of the generated document. Inherited
from rapport class if not set. If
|
append
|
FALSE (new report created) or an R object (class of "Report") to which the new report will be added |
create
|
should export really happen? It might be handy if you want to append several reports. |
open
|
open the exported document? Default set to TRUE. |
date
|
character string as the date field of the report. If not set, current time will be set. |
description
|
add
|
format
|
format of the wanted report. See Pandoc's
user manual for details. In short, choose something like:
|
options
|
options passed to
|
logo
|
add rapport logo |
filepath on create = TRUE, Report class otherwise
John MacFarlane (2012): _Pandoc User's Guide_. http://johnmacfarlane.net/pandoc/README.html
## eval some template
x <- rapport('example', data = mtcars, var="hp")
## try basic parameters
tpl.export(x)
tpl.export(x, file='demo')
tpl.export(x, file='demo', format='odt')
### append reports
# 1) Create a report object with the first report and do not export (optional)
report <- tpl.export(x, create = FALSE)
# 2) Append some other reports without exporting (optional)
report <- tpl.export(x, create = FALSE, append = report)
# 3) Export it!
tpl.export(append=report)
# 4) Export it to other formats too! (optional)
tpl.export(append=report, format='rst')
### exporting multiple reports at once
tpl.export(tpl.example('example', 'all'))
tpl.export(tpl.example('example', 'all'), format='odt')
tpl.export(list(rapport('univar-descriptive', data = mtcars, var="hp"),
rapport('univar-descriptive', data = mtcars, var="mpg")))
### Never do this as being dumb:
tpl.export()
### Using other backends
## asciidoc
tpl.export(tpl.example('example', 'all'), backend='asciidoc')
## txt2tags
tpl.export(tpl.example('example', 'all'), backend='t2t')
### Adding own custom CSS to exported HTML
tpl.export(x, options=sprintf('-c %s', system.file('templates/css/default.css', package='rapport')))
Reads file either from template name in system folder, file path or remote URL, and splits it into lines for easier handling by rapport internal parser. "find" in tpl.find is borrowed from Emacs parlance - this function actually reads the template.
tpl.find(fp, ...)
fp
|
a character string containing a template path,
a template name (for package-bundled templates only),
template contents separated by newline (
|
...
|
additional params for header tag matching (see
|
a character vector with template contents
Returns rapport template header from provided path or a character vector.
tpl.header(fp, open.tag = get.tags("header.open"), close.tag = get.tags("header.close"), ...)
fp
|
a template file pointer (see
|
open.tag
|
a string with opening tag (defaults to
value of user-defined
|
close.tag
|
a string with closing tag (defaults to
value of user-defined
|
...
|
additional arguments to be passed to
|
a character vector with template header contents
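How a template might be divided into header and body by a pair of tags can be sketched as below. The function split_template, the tag patterns and the sample template lines are all assumptions made for illustration; rapport takes its actual tags from get.tags("header.open") and get.tags("header.close").

```r
# Hypothetical sketch: tags are assumed to sit alone on their own lines.
split_template <- function(lines, open.tag = "^<!--head$", close.tag = "^head-->$") {
  open  <- grep(open.tag, lines)[1]    # first line matching the opening tag
  close <- grep(close.tag, lines)[1]   # first line matching the closing tag
  if (is.na(open) || is.na(close) || close <= open) stop("no valid header found")
  list(header = lines[seq_len(close - open - 1) + open],  # lines between the tags
       body   = lines[-seq_len(close)])                   # everything after the header
}

tpl <- c("<!--head", "title: Demo", "author: Jane", "head-->", "# Body", "Some text.")
parts <- split_template(tpl)
parts$header
parts$body
```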
Provides information about template metadata and/or inputs. See tpl.meta and tpl.inputs for details.
tpl.info(fp, meta = TRUE, inputs = TRUE)
fp
|
a template file pointer (see
|
meta
|
return template metadata? (defaults to TRUE) |
inputs
|
return template inputs? (defaults to TRUE) |
tpl.info('Example') # return both metadata and inputs
tpl.info('Crosstable', inputs = FALSE) # return only template metadata
tpl.info('Correlation', meta = FALSE) # return only template inputs
Displays summary for template inputs (if any). Note that as of version 0.5, rapport template inputs should be defined using YAML syntax. See deprecated-inputs for details on old input syntax. The following sections describe the new YAML input definition style.
Introduction
The full power of rapport comes into play with template inputs. One can match inputs against dataset variables or custom R objects. The inputs provide a means of assigning R objects to symbols in the template evaluation environment. Inputs do not only handle names in the template, but also provide an extensive set of rules that each dataset variable/user-provided R object has to satisfy. The new YAML input specification takes advantage of the R class system. The input attributes should resemble common R object attributes and methods.
Inputs can be divided into two categories:
dataset inputs, i.e. the inputs that refer to a named element of an R object provided in the data argument of the rapport call. Currently, rapport supports only data.frame objects, but that may change in the (near) future.
standalone inputs - the inputs that do not depend on the dataset. The user can just provide an R object of an appropriate class (and other input attributes) to match a standalone input.
General input attributes
Following attributes are available for all inputs:
name (character string, required) - input name. It acts as an identifier for a given input, and is required as such. Template cannot contain duplicate names. rapport inputs currently have custom naming conventions - see guess.input.name for details.
label (character string) - input label. It can be blank, but it's useful to provide input label as rapport helpers use that information in plot labels and/or exported HTML tables. Defaults to empty string.
description (character string) - similar to label, but should contain long description of given input.
class (character string) - defines an input class. Currently supported input classes are: character, complex, factor, integer, logical, numeric and raw (all atomic vector classes are supported). Class attribute should usually be provided, but it can also be NULL (default) - in that case the input class will be guessed based on matched R object's value.
required (logical value) - does the input require a value? Defaults to FALSE.
standalone (logical value) - indicates whether the input is standalone, i.e. does not depend on a dataset. Defaults to FALSE.
length (either an integer value or a named list with integer values) - provides a set of rules for input value's length. length attribute can be defined via:
an integer value, e.g. length: 10, which restricts the input to exactly 10 vectors or values.
a named list with min and/or max attributes nested under the length attribute. This defines the range in which the input length must fall. Note that range limits are inclusive. Either min or max attribute can be omitted, and they will default to 1 and Inf, respectively.
IMPORTANT! Note that rapport treats input length in a slightly different manner. If you match a subset of 10 character vectors from the dataset, input length will be 10, as you might expect. But if you select only one variable, length will be equal to 1, and not to the number of vector elements. This holds for both standalone and dataset inputs. However, if you match a character vector against a standalone input, length will be stored correctly - as the number of vector elements.
value (a vector of an appropriate class) - this attribute only exists for standalone inputs. The provided value must satisfy the rules defined in class and length attributes, as well as any other class-specific rules (see below).
Class-specific attributes
character
nchar - restricts the number of characters of the input value. It accepts the same attribute format as length. If NULL (default), no checks will be performed.
regexp (character string) - contains a string with regular expression. If non-NULL, all strings in a character vector must match the given regular expression. Defaults to NULL - no checks are applied.
matchable (logical value) - if TRUE, options attribute must be provided, while value is optional, though recommended. options should contain values to be chosen from, just like <option> tag does when nested in <select> HTML tag, while value must contain a value from options or it can be omitted (NULL). allow_multiple will allow values from options list to be matched multiple times. Note that unlike previous versions of rapport, partial matching is not performed.
numeric, integer
limit - similar to length attribute, but allows only min and max nested attributes. Unlike length attribute, limit checks input values rather than input length. limit attribute is NULL by default and the checks are performed only when limit is defined (non-NULL).
factor
nlevels - accepts the same format as length attribute, but the check is performed on the number of factor levels instead.
matchable - same as in character inputs (note that in previous versions of rapport matching was performed against factor levels - well, not any more, now we match against values to make it consistent with character inputs).
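The class-specific rules above can be illustrated with three small base R checks. All three helpers are hypothetical. Treating an omitted limit bound as -Inf or Inf and treating the limits as inclusive are assumptions, since the text only states those details for length.

```r
# Hypothetical check for numeric/integer inputs: values must lie within limit.
check_limit <- function(x, limit = NULL) {
  if (is.null(limit)) return(TRUE)            # no limit defined: skip the check
  lo <- if (is.null(limit[["min"]])) -Inf else limit[["min"]]  # assumed default
  hi <- if (is.null(limit[["max"]])) Inf  else limit[["max"]]  # assumed default
  all(x >= lo & x <= hi)
}

# Hypothetical check for character inputs: every string must match regexp.
check_regexp <- function(x, regexp = NULL) {
  if (is.null(regexp)) return(TRUE)           # NULL means no check is applied
  all(grepl(regexp, x))
}

# Hypothetical check for matchable inputs: exact matching, no partial matching.
check_matchable <- function(value, options) {
  all(value %in% options)
}

check_limit(c(1, 5), list(min = 0, max = 5))
check_regexp(c("ab1", "cd2"), "^[a-z]+[0-9]$")
check_matchable("app", c("apple", "pear"))    # partial match is rejected
```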
tpl.inputs(fp, use.header = FALSE)
fp
|
a template file pointer (see
|
use.header
|
a logical value indicating whether the
header section is provided in
|
Lists all templates bundled with current package build. By default, it will search for all .tpl files in current directory, path specified in tpl.paths option and package library path.
tpl.list(...)
...
|
additional parameters for
|
a character vector with template files
Displays summary of template metadata stored in a header section. This part of the template header consists of several YAML key: value pairs, which contain some basic information about the template, much like the DESCRIPTION file in R packages does.
Current implementation supports the following fields:
title - a template title (required)
author - author's (nick)name (required)
description - template description (required)
email - author's email address
packages - YAML list of packages required by the template (if any)
example - example calls to rapport function, including template data and inputs
As of version 0.5, the dataRequired field is deprecated. The rapport function will automatically detect if the template requires a dataset based on the presence of standalone inputs.
tpl.meta(fp, fields = NULL, use.header = FALSE, trim.white = TRUE)
fp
|
a template file pointer (see
|
fields
|
a list of named lists containing key-value pairs of field titles and corresponding regexes |
use.header
|
a logical value indicating if the
character vector provided in
|
trim.white
|
a logical value indicating if the extra spaces should be removed from header fields before extraction |
a named list with template metadata
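Since title, author and description are marked as required, a check on an already parsed metadata list could look like the sketch below. check_meta is a hypothetical helper operating on a plain named list; it does not parse YAML and is not part of rapport.

```r
# Hypothetical helper: verify that all required metadata fields are present.
check_meta <- function(meta, required = c("title", "author", "description")) {
  absent <- setdiff(required, names(meta))
  if (length(absent) > 0)
    stop("missing required metadata field(s): ", paste(absent, collapse = ", "))
  invisible(TRUE)
}

meta <- list(title = "Demo", author = "Jane", description = "A demo template",
             email = "jane@example.com")
check_meta(meta)
```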
Adds a new element to custom paths' list where rapport will look for templates.
tpl.paths.add(...)
...
|
character vector of paths |
TRUE on success (invisibly)
tpl.paths.add('/tmp')
tpl.list()
## might trigger an error:
tpl.paths.add('/home', '/rapport')
List all custom paths where rapport will look for templates.
tpl.paths()
a character vector with paths
tpl.paths()
Removes an element from custom paths' list where rapport will look for templates.
tpl.paths.remove(...)
...
|
character vector of paths |
TRUE on success (invisibly)
tpl.paths()
tpl.paths.add('/tmp')
tpl.paths()
tpl.paths.remove('/tmp')
tpl.paths()
## might trigger an error:
tpl.paths.remove('/root')
Resets to default (NULL) all custom paths where rapport will look for templates.
tpl.paths.reset()
tpl.paths.reset()
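The add/list/remove/reset cycle of the tpl.paths family can be mimicked with a character vector kept in options(). Everything here is an assumption made for illustration: the helper names, the option name sketch.tpl.paths, and the choice to raise an error for missing directories or unknown entries.

```r
# Hypothetical helpers storing custom paths in the (made-up) option "sketch.tpl.paths".
paths_get <- function() getOption("sketch.tpl.paths")

paths_add <- function(...) {
  new <- c(...)
  stopifnot(is.character(new), all(dir.exists(new)))  # only existing directories
  options(sketch.tpl.paths = union(paths_get(), new)) # no duplicates
  invisible(TRUE)
}

paths_remove <- function(...) {
  old <- c(...)
  stopifnot(all(old %in% paths_get()))                # must have been added before
  rest <- setdiff(paths_get(), old)
  options(sketch.tpl.paths = if (length(rest)) rest else NULL)
  invisible(TRUE)
}

paths_reset <- function() {
  options(sketch.tpl.paths = NULL)                    # back to the default (NULL)
  invisible(TRUE)
}

paths_add(tempdir())
paths_get()
paths_reset()
```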
Convert old-style template to new-style one (what we really do is just replacing old header syntax with YAML one).
tpl.renew(fp, file = NULL)
fp
|
pointer to an old template (see
|
file
|
a path to output file. If
|
Runs template with data and arguments included in rapport object. In order to get a reproducible example, you have to make sure that the reproducible argument is set to TRUE in the rapport function.
tpl.rerun(tpl)
tpl
|
a
|
tmp <- rapport("Example", mtcars, v = "hp", reproducible = TRUE)
tpl.rerun(tmp)
rapport's alternative to Stangle - extracts contents of template chunks. If file argument
tpl.tangle(fp, file = "", show.inline.chunks = FALSE)
fp
|
template file pointer (see
|
file
|
see
|
show.inline.chunks
|
extract contents of inline chunks as well? (defaults to FALSE) |
(invisibly) a list with either inline or block chunk contents