Title: | Convenience Functions for Psychology |
---|---|
Description: | Make your workflow faster and easier. Easily customizable plots (via 'ggplot2'), nice APA tables (following the style of the *American Psychological Association*) exportable to Word (via 'flextable'), easily run statistical tests or check assumptions, and automate various other tasks. |
Authors: | Rémi Thériault [aut, cre] |
Maintainer: | Rémi Thériault <[email protected]> |
License: | GPL (>= 3) |
Version: | 0.1.8.2 |
Built: | 2024-09-26 14:14:36 UTC |
Source: | https://github.com/rempsyc/rempsyc |
Chooses the best duplicate: the one with the smallest number of missing values. In case of ties, it picks the first duplicate, as it is the one most likely to be valid and authentic, given practice effects.
best_duplicate(data, id, keep.rows = FALSE)
data |
The data frame. |
id |
The ID variable for which to check for duplicates. |
keep.rows |
Logical, whether to add a column at the beginning of the data frame with the original row indices. |
For the easystats equivalent, see datawizard::data_unique().
A dataframe, containing only the "best" duplicates.
df1 <- data.frame(
  id = c(1, 2, 3, 1, 3),
  item1 = c(NA, 1, 1, 2, 3),
  item2 = c(NA, 1, 1, 2, 3),
  item3 = c(NA, 1, 1, 2, 3)
)
best_duplicate(df1, id = "id", keep.rows = TRUE)
Easily output a correlation matrix and export it to Microsoft Excel, with the first row and column frozen, and correlation coefficients colour-coded based on effect size (0.0-0.2: small (no colour); 0.2-0.4: medium (pink/light blue); 0.4-1.0: large (red/dark blue)), following Cohen's suggestions for small (.10), medium (.30), and large (.50) correlation sizes.
Based on the correlation and openxlsx2 packages.
cormatrix_excel( data, filename, overwrite = TRUE, p_adjust = "none", print.mat = TRUE, ... )
data |
The data frame |
filename |
Desired filename (path can be added before hand but no need to specify extension). |
overwrite |
Whether to allow overwriting previous file. |
p_adjust |
Default p-value adjustment method (default is "none", although the default of the correlation::correlation() function is "holm"). |
print.mat |
Logical, whether to also print the correlation matrix to console. |
... |
Parameters to be passed to the correlation::correlation() function. |
A Microsoft Excel document, containing the colour-coded correlation matrix with significance stars, on the first sheet, and the colour-coded p-values on the second sheet.
Adapted from @JanMarvin (JanMarvin/openxlsx2#286) and the original rempsyc::cormatrix_excel.
# Basic example
cormatrix_excel(mtcars,
  select = c("mpg", "cyl", "disp", "hp", "carb"),
  filename = "cormatrix1"
)
cormatrix_excel(iris, p_adjust = "none", filename = "cormatrix2")
cormatrix_excel(airquality, method = "spearman", filename = "cormatrix3")
Extract all duplicates, for visual inspection. Note that it also contains the first occurrence of future duplicates, unlike duplicated() or dplyr::distinct(). It also contains an additional column reporting the number of missing values for that row, to help in the decision-making when selecting which duplicates to keep.
extract_duplicates(data, id)
data |
The data frame. |
id |
The ID variable for which to check for duplicates. |
For the easystats equivalent, see datawizard::data_duplicated().
A dataframe, containing all duplicates.
df1 <- data.frame(
  id = c(1, 2, 3, 1, 3),
  item1 = c(NA, 1, 1, 2, 3),
  item2 = c(NA, 1, 1, 2, 3),
  item3 = c(NA, 1, 1, 2, 3)
)
extract_duplicates(df1, id = "id")

# Filter to exclude duplicates
df2 <- df1[-c(1, 5), ]
df2
Identify outliers based on 3 median absolute deviations (MAD) from the median.
find_mad(data, col.list, ID = NULL, criteria = 3, mad.scores = TRUE)
data |
The data frame. |
col.list |
List of variables to check for outliers. |
ID |
ID variable if you would like the outliers to be identified as such. |
criteria |
How many MAD to use as threshold (similar to standard deviations) |
mad.scores |
Logical, whether to output robust z (MAD) scores (default) or raw scores. Defaults to TRUE. |
The function internally uses scale_mad() to "standardize" the data based on the MAD and median, and then checks for any observation greater than the specified criteria (e.g., +/-3).
For the easystats equivalent, use: performance::check_outliers(x, method = "zscore_robust", threshold = 3).
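As a rough illustration of that logic (a minimal base R sketch, not the package's internal code), the robust z-scores are simply distances from the median expressed in MAD units, and observations beyond the chosen criteria are flagged:
# Minimal sketch of MAD-based flagging, mirroring criteria = 3
x <- mtcars$qsec
robust_z <- (x - median(x)) / mad(x) # mad() applies the 1.4826 consistency constant
which(abs(robust_z) > 3) # observations beyond +/- 3 MAD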
A list of dataframes of outliers per variable, with row
numbers, based on the MAD. When printed, provides the number
of outliers, selected variables, and any outlier flagged for
more than one variable. More information can be obtained by using the attributes() function on the generated object.
Leys, C., Ley, C., Klein, O., Bernard, P., & Licata, L. (2013). Detecting outliers: Do not use standard deviation around the mean, use absolute deviation around the median. Journal of Experimental Social Psychology, 49(4), 764–766. https://doi.org/10.1016/j.jesp.2013.03.013
find_mad(
  data = mtcars,
  col.list = names(mtcars),
  criteria = 3
)

mtcars2 <- mtcars
mtcars2$car <- row.names(mtcars)
find_mad(
  data = mtcars2,
  col.list = names(mtcars),
  ID = "car",
  criteria = 3
)
Easily format p, r, or d values. Note: converts to character class for use in figures or manuscripts to accommodate e.g., "< .001".
format_value(value, type = "d", ...)
format_p(p, precision = 0.001, prefix = NULL, suffix = NULL, sign = FALSE, stars = FALSE)
format_r(r, precision = 0.01)
format_d(d, precision = 0.01)
value |
Value to be formatted, when using the generic format_value() function. |
type |
Specify r or p value. |
... |
To specify the precision level, if necessary, when using the generic format_value() function. |
p |
p value to format. |
precision |
Level of precision desired, if necessary. |
prefix |
To add a prefix before the value. |
suffix |
To add a suffix after the value. |
sign |
Logical. Whether to add an equal sign for p values higher or equal to .001. |
stars |
Logical. Whether to add asterisks for significant p values. |
r |
r value to format. |
d |
d value to format. |
For the easystats equivalent, see insight::format_value().
A formatted p, r, or d value.
format_value(0.00041231, "p")
format_value(0.00041231, "r")
format_value(1.341231, "d")
format_p(0.0041231)
format_p(0.00041231)
format_r(0.41231)
format_r(0.041231)
format_d(1.341231)
format_d(0.341231)
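The optional arguments documented above (prefix, suffix, sign, stars) can be combined as needed; a brief sketch of plausible calls, relying only on the parameters listed in this entry:
format_p(0.0041231, stars = TRUE) # add significance asterisks
format_p(0.0607, sign = TRUE) # add an equal sign for p values >= .001
format_p(0.0041231, prefix = "(", suffix = ")") # wrap the value in parentheses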
Get required version of specified package dependency
get_dep_version(dep, pkg = utils::packageName())
dep |
Dependency of the specified package to check |
pkg |
Package to check the dependency from |
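No example is provided for this helper; a hypothetical call, assuming rempsyc itself is the package whose dependencies are inspected, could look like:
# Hypothetical usage: look up the minimum required version of "flextable"
# declared among rempsyc's dependencies
get_dep_version("flextable", pkg = "rempsyc")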
Make nice grouped bar charts easily.
grouped_bar_chart( data, response, label = response, group = "T1_Group", proportion = TRUE, print_table = FALSE )
data |
The data frame. |
response |
The categorical dependent variable to be plotted. |
label |
Label of legend describing the dependent variable. |
group |
The group by which to plot the variable |
proportion |
Logical, whether to use proportions (the default) rather than counts. |
print_table |
Logical, whether to also print the computed proportion or count table. |
A bar plot of class ggplot.
# Make the basic plot
iris2 <- iris
iris2$plant <- c(
  rep("yes", 45),
  rep("no", 45),
  rep("maybe", 30),
  rep("NA", 30)
)
grouped_bar_chart(
  data = iris2,
  response = "plant",
  group = "Species"
)
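A variation using the remaining arguments (a sketch based only on the parameter descriptions above; iris2 is the data frame created in the example):
# Counts instead of proportions, plus the underlying frequency table
grouped_bar_chart(
  data = iris2,
  response = "plant",
  group = "Species",
  proportion = FALSE,
  print_table = TRUE
)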
Install package if not already installed
install_if_not_installed(pkgs)
pkgs |
Packages to install if not already installed |
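No example is provided here either; a minimal illustration (assuming a CRAN mirror is available) would be:
# Installs only the packages that are not yet available locally
install_if_not_installed(c("ggplot2", "flextable"))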
Test linear regression assumptions easily with a nice summary table.
nice_assumptions(model)
model |
The lm() model object (or a list of lm() models). |
Interpretation: p values < .05 imply assumptions are not respected. The diagnostic column reports how many assumptions are not respected for a given model or variable.
A dataframe, with p-value results for the Shapiro-Wilk, Breusch-Pagan, and Durbin-Watson tests, as well as a diagnostic column reporting how many assumptions are not respected for a given model. Shapiro-Wilk is set to NA if n < 3 or n > 5000.
Other functions useful in assumption testing: nice_density, nice_normality, nice_qq, nice_varplot, nice_var. Tutorial: https://rempsyc.remi-theriault.com/articles/assumptions
# Create a regression model (using data available in R by default)
model <- lm(mpg ~ wt * cyl + gear, data = mtcars)
nice_assumptions(model)

# Multiple dependent variables at once
model2 <- lm(qsec ~ disp + drat * carb, mtcars)
my.models <- list(model, model2)
nice_assumptions(my.models)
Easily compute planned contrast analyses (pairwise comparisons similar to t-tests but more powerful when there are more than 2 groups), and format the output in publication-ready format. In this particular case, the confidence intervals are bootstrapped on the chosen effect size (defaults to Cohen's d).
nice_contrasts( response, group, covariates = NULL, data, effect.type = "cohens.d", bootstraps = 2000, ... )
response |
The dependent variable. |
group |
The group for the comparison. |
covariates |
The desired covariates in the model. |
data |
The data frame. |
effect.type |
What effect size type to use. One of "cohens.d" (default), "akp.robust.d", "unstandardized", "hedges.g", "cohens.d.sigma", or "r". |
bootstraps |
The number of bootstraps to use for the confidence interval |
... |
Arguments passed to bootES::bootES. |
Statistical power is lower with the standard t test than it is with the planned contrast version for two reasons: a) the sample size is smaller with the t test, because only the cases in the two groups are selected; and b) in the planned contrast, the error term is smaller than it is with the standard t test because it is based on all the cases (source).
The effect size and confidence interval are calculated via bootES::bootES, and correct for contrasts but not for covariates and other predictors. Because this method uses bootstrapping, it is recommended to set a seed before using it, for reproducibility reasons (e.g., set.seed(100)).
Does not currently support nested comparisons for marginal means, only a comparison of all groups. For nested comparisons, please use emmeans::contrast() directly, or for the easystats equivalent, modelbased::estimate_contrasts().
When using nice_lm_contrasts(), please use as.factor() outside the lm() formula, or it will lead to an error.
A dataframe, with the selected dependent variable(s), comparisons of interest, degrees of freedom, t-values, p-values, Cohen's d, and the lower and upper 95% confidence intervals of the effect size (i.e., dR).
nice_lm_contrasts. Tutorial: https://rempsyc.remi-theriault.com/articles/contrasts
# Basic example
set.seed(100)
nice_contrasts(
  data = mtcars,
  response = "mpg",
  group = "cyl",
  bootstraps = 200
)

set.seed(100)
nice_contrasts(
  data = mtcars,
  response = "disp",
  group = "gear"
)

# Multiple dependent variables
set.seed(100)
nice_contrasts(
  data = mtcars,
  response = c("mpg", "disp", "hp"),
  group = "cyl"
)

# Adding covariates
set.seed(100)
nice_contrasts(
  data = mtcars,
  response = "mpg",
  group = "cyl",
  covariates = c("disp", "hp")
)

# Now supports more than 3 levels
mtcars2 <- mtcars
mtcars2$carb <- as.factor(mtcars2$carb)
set.seed(100)
nice_contrasts(
  data = mtcars,
  response = "mpg",
  group = "carb",
  bootstraps = 200
)
Make nice density plots easily. Internally, uses na.rm = TRUE.
nice_density( data, variable, group = NULL, colours, ytitle = "Density", xtitle = variable, groups.labels = NULL, grid = TRUE, shapiro = FALSE, title = variable, histogram = FALSE, breaks.auto = FALSE, bins = 30 )
data |
The data frame |
variable |
The dependent variable to be plotted. |
group |
The group by which to plot the variable. |
colours |
Desired colours for the plot, if desired. |
ytitle |
An optional y-axis label, if desired. |
xtitle |
An optional x-axis label, if desired. |
groups.labels |
How to label the groups. |
grid |
Logical, whether to keep the default background grid or not. APA style suggests not using a grid in the background, though in this case some may find it useful to more easily estimate the slopes of the different groups. |
shapiro |
Logical, whether to include the p-value from the Shapiro-Wilk test on the plot. |
title |
The desired title of the plot. Can be set to NULL to remove the title. |
histogram |
Logical, whether to add a histogram on top of the density plot. |
breaks.auto |
If histogram = TRUE, then option to set bins/breaks automatically, mimicking the default behaviour of base R hist(). |
bins |
If histogram = TRUE, the number of bins to use (default is 30). |
A density plot of class ggplot, by group (if provided), along a reference line representing a matched normal distribution.
Other functions useful in assumption testing: nice_assumptions, nice_normality, nice_qq, nice_varplot, nice_var. Tutorial: https://rempsyc.remi-theriault.com/articles/assumptions
# Make the basic plot
nice_density(
  data = iris,
  variable = "Sepal.Length",
  group = "Species"
)

# Further customization
nice_density(
  data = iris,
  variable = "Sepal.Length",
  group = "Species",
  colours = c("#00BA38", "#619CFF", "#F8766D"),
  xtitle = "Sepal Length",
  ytitle = "Density (vs. Normal Distribution)",
  groups.labels = c(
    "(a) Setosa",
    "(b) Versicolor",
    "(c) Virginica"
  ),
  grid = FALSE,
  shapiro = TRUE,
  title = "Density (Sepal Length)",
  histogram = TRUE
)
Formats the output of an lm() model object into a publication-ready format.
nice_lm( model, b.label = "b", standardize = FALSE, mod.id = TRUE, ci.alternative = "two.sided", ... )
model |
The model to be formatted. |
b.label |
What to rename the default "b" column (e.g., to capital B if using standardized data for it to be converted to the Greek beta symbol in the nice_table function). Now attempts to automatically detect whether the variables were standardized, and if so, sets b.label = "B" automatically. |
standardize |
Logical, whether to standardize the data before refitting the model. If TRUE, automatically sets b.label = "B". |
mod.id |
Logical. Whether to display the model number, when there is more than one model. |
ci.alternative |
Alternative for the confidence interval of the sr2. It can be either "two.sided" (the default in this package), "greater", or "less". |
... |
Further arguments to be passed to the effectsize::r2_semipartial function for the effect size. |
The effect size, sr2 (semi-partial correlation squared, also known as delta R2), is computed through effectsize::r2_semipartial. Please read the documentation for that function, especially regarding the interpretation of the confidence interval. In rempsyc, instead of using the default one-sided alternative ("greater"), we use the two-sided alternative.
To interpret the sr2, use effectsize::interpret_r2_semipartial().
For the easystats equivalent, use report::report() on the lm() model object.
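Conceptually, the sr2 of a given term corresponds to the drop in R2 when that term is removed from the model; a rough hand check (ignoring the confidence interval machinery of effectsize) could look like this:
# Approximate sr2 for "cyl": difference in R2 between full and reduced models
full <- lm(mpg ~ cyl + wt * hp, mtcars)
reduced <- lm(mpg ~ wt * hp, mtcars)
summary(full)$r.squared - summary(reduced)$r.squared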
A formatted dataframe of the specified lm model, with DV, IV, degrees of freedom, regression coefficient, t-value, p-value, and the effect size, the semi-partial correlation squared, and its confidence interval.
Checking simple slopes after testing for moderation: nice_lm_slopes, nice_mod, nice_slopes. Tutorial: https://rempsyc.remi-theriault.com/articles/moderation
# Make and format model
model <- lm(mpg ~ cyl + wt * hp, mtcars)
nice_lm(model)

# Make and format multiple models
model2 <- lm(qsec ~ disp + drat * carb, mtcars)
my.models <- list(model, model2)
x <- nice_lm(my.models)
x

# Get interpretations
cbind(x, Interpretation = effectsize::interpret_r2_semipartial(x$sr2))
Easily compute planned contrast analyses (pairwise comparisons similar to t-tests but more powerful when there are more than 2 groups), and format the output in publication-ready format. In this particular case, the confidence intervals are bootstrapped on the chosen effect size (defaults to Cohen's d).
nice_lm_contrasts( model, group, data, p_adjust = "none", effect.type = "cohens.d", bootstraps = 2000, ... )
model |
The model to be formatted. |
group |
The group for the comparison. |
data |
The data frame. |
p_adjust |
Character: p-value adjustment method (e.g., "bonferroni"). |
effect.type |
What effect size type to use. One of "cohens.d" (default), "akp.robust.d", "unstandardized", "hedges.g", "cohens.d.sigma", or "r". |
bootstraps |
The number of bootstraps to use for the confidence interval |
... |
Arguments passed to bootES::bootES. |
Statistical power is lower with the standard t test than it is with the planned contrast version for two reasons: a) the sample size is smaller with the t test, because only the cases in the two groups are selected; and b) in the planned contrast, the error term is smaller than it is with the standard t test because it is based on all the cases (source).
The effect size and confidence interval are calculated via bootES::bootES, and correct for contrasts but not for covariates and other predictors. Because this method uses bootstrapping, it is recommended to set a seed before using it, for reproducibility reasons (e.g., set.seed(100)).
Does not currently support nested comparisons for marginal means, only a comparison of all groups. For nested comparisons, please use emmeans::contrast() directly, or for the easystats equivalent, modelbased::estimate_contrasts().
When using nice_lm_contrasts(), please use as.factor() outside the lm() formula, or it will lead to an error.
A dataframe, with the selected dependent variable(s), comparisons of interest, degrees of freedom, t-values, p-values, Cohen's d, and the lower and upper 95% confidence intervals of the effect size (i.e., dR).
nice_contrasts. Tutorial: https://rempsyc.remi-theriault.com/articles/contrasts
# Make and format model (group needs to be a factor)
mtcars2 <- mtcars
mtcars2$cyl <- as.factor(mtcars2$cyl)
model <- lm(mpg ~ cyl + wt * hp, mtcars2)
set.seed(100)
nice_lm_contrasts(model, group = "cyl", data = mtcars, bootstraps = 500)

# Several models at once
mtcars2$gear <- as.factor(mtcars2$gear)
model2 <- lm(qsec ~ cyl, data = mtcars2)
my.models <- list(model, model2)
set.seed(100)
nice_lm_contrasts(my.models, group = "cyl", data = mtcars, bootstraps = 500)

# Now supports more than 3 levels
mtcars2$carb <- as.factor(mtcars2$carb)
model <- lm(mpg ~ carb + wt * hp, mtcars2)
set.seed(100)
nice_lm_contrasts(model, group = "carb", data = mtcars2, bootstraps = 500)
Extracts simple slopes from an lm() model object and formats them into a publication-ready format.
nice_lm_slopes( model, predictor, moderator, b.label = "b", standardize = FALSE, mod.id = TRUE, ci.alternative = "two.sided", ... )
model |
The model to be formatted. |
predictor |
The independent variable. |
moderator |
The moderating variable. |
b.label |
What to rename the default "b" column (e.g., to capital B if using standardized data for it to be converted to the Greek beta symbol in the nice_table function). |
standardize |
Logical, whether to standardize the data before refitting the model. If TRUE, automatically sets b.label = "B". |
mod.id |
Logical. Whether to display the model number, when there is more than one model. |
ci.alternative |
Alternative for the confidence interval of the sr2. It can be either "two.sided" (the default in this package), "greater", or "less". |
... |
Further arguments to be passed to the effectsize::r2_semipartial function for the effect size. |
The effect size, sr2 (semi-partial correlation squared, also known as delta R2), is computed through effectsize::r2_semipartial. Please read the documentation for that function, especially regarding the interpretation of the confidence interval. In rempsyc, instead of using the default one-sided alternative ("greater"), we use the two-sided alternative.
To interpret the sr2, use effectsize::interpret_r2_semipartial().
For the easystats equivalent, use report::report() on the lm() model object.
A formatted dataframe of the simple slopes of the specified lm model, with DV, levels of IV, degrees of freedom, regression coefficient, t-value, p-value, and the effect size, the semi-partial correlation squared, and its confidence interval.
Checking for moderation before checking simple slopes: nice_lm, nice_mod, nice_slopes. Tutorial: https://rempsyc.remi-theriault.com/articles/moderation
# Make and format model
model <- lm(mpg ~ gear * wt, mtcars)
nice_lm_slopes(model, predictor = "gear", moderator = "wt")

# Make and format multiple models
model2 <- lm(qsec ~ gear * wt, mtcars)
my.models <- list(model, model2)
x <- nice_lm_slopes(my.models, predictor = "gear", moderator = "wt")
x

# Get interpretations
cbind(x, Interpretation = effectsize::interpret_r2_semipartial(x$sr2))
Easily compute moderation analyses, with effect sizes, and format in publication-ready format.
nice_mod( data, response, predictor, moderator, moderator2 = NULL, covariates = NULL, b.label = "b", standardize = TRUE, mod.id = TRUE, ci.alternative = "two.sided", ... )
data |
The data frame |
response |
The dependent variable. |
predictor |
The independent variable. |
moderator |
The moderating variable. |
moderator2 |
The second moderating variable, if applicable. |
covariates |
The desired covariates in the model. |
b.label |
What to rename the default "b" column (e.g., to capital B if using standardized data for it to be converted to the Greek beta symbol in the nice_table function). |
standardize |
Logical, whether to standardize the data before fitting the model. If TRUE, automatically sets b.label = "B". |
mod.id |
Logical. Whether to display the model number, when there is more than one model. |
ci.alternative |
Alternative for the confidence interval of the sr2. It can be either "two.sided" (the default in this package), "greater", or "less". |
... |
Further arguments to be passed to the effectsize::r2_semipartial function for the effect size. |
The effect size, sr2 (semi-partial correlation squared, also known as delta R2), is computed through effectsize::r2_semipartial. Please read the documentation for that function, especially regarding the interpretation of the confidence interval. In rempsyc, instead of using the default one-sided alternative ("greater"), we use the two-sided alternative.
To interpret the sr2, use effectsize::interpret_r2_semipartial().
For the easystats equivalent, use report::report() on the lm() model object.
A formatted dataframe of the specified lm model, with DV, IV, degrees of freedom, regression coefficient, t-value, p-value, and the effect size, the semi-partial correlation squared, and its confidence interval.
Checking simple slopes after testing for moderation: nice_slopes, nice_lm, nice_lm_slopes. Tutorial: https://rempsyc.remi-theriault.com/articles/moderation
# Make the basic table
nice_mod(
  data = mtcars,
  response = "mpg",
  predictor = "gear",
  moderator = "wt"
)

# Multiple dependent variables at once
nice_mod(
  data = mtcars,
  response = c("mpg", "disp", "hp"),
  predictor = "gear",
  moderator = "wt"
)

# Add covariates
nice_mod(
  data = mtcars,
  response = "mpg",
  predictor = "gear",
  moderator = "wt",
  covariates = c("am", "vs")
)

# Three-way interaction
x <- nice_mod(
  data = mtcars,
  response = "mpg",
  predictor = "gear",
  moderator = "wt",
  moderator2 = "am"
)
x

# Get interpretations
cbind(x, Interpretation = effectsize::interpret_r2_semipartial(x$sr2))
Nicely reports NA values according to existing guidelines. This function reports both absolute and percentage values of specified column lists. Some authors recommend reporting item-level missing item per scale, as well as participant’s maximum number of missing items by scale. For example, Parent (2013) writes:
I recommend that authors (a) state their tolerance level for missing data by scale or subscale (e.g., “We calculated means for all subscales on which participants gave at least 75% complete data”) and then (b) report the individual missingness rates by scale per data point (i.e., the number of missing values out of all data points on that scale for all participants) and the maximum by participant (e.g., “For Attachment Anxiety, a total of 4 missing data points out of 100 were observed, with no participant missing more than a single data point”).
nice_na(data, vars = NULL, scales = NULL)
data |
The data frame. |
vars |
Variable (or lists of variables) to check for NAs. |
scales |
The scale names to check for NAs (single character string). |
A dataframe, with:
var: variables selected
items: number of items for selected variables
na: number of missing cell values for those variables (e.g., 2 missing values for the first participant + 2 missing values for the second participant = a total of 4 missing values)
cells: total number of cells (i.e., number of participants multiplied by number of variables, items)
na_percent: the percentage of missing values (number of missing cells, na, divided by total number of cells, cells)
na_max: the number of missing values for the participant with the most missing values for the selected variables
na_max_percent: the number of missing values for the participant with the most missing values for the selected variables, in percentage (i.e., na_max divided by the number of selected variables, items)
all_na: the number of participants missing 100% of items for that scale (the selected variables)
Parent, M. C. (2013). Handling item-level missing data: Simpler is just as good. The Counseling Psychologist, 41(4), 568-600. https://doi.org/10.1177%2F0011000012445176
# Use whole data frame
nice_na(airquality)

# Use selected columns explicitly
nice_na(airquality,
  vars = list(
    c("Ozone", "Solar.R", "Wind"),
    c("Temp", "Month", "Day")
  )
)

# If the questionnaire items start with the same name, e.g.,
set.seed(15)
fun <- function() {
  c(sample(c(NA, 1:10), replace = TRUE), NA, NA, NA)
}
df <- data.frame(
  ID = c("idz", NA),
  open_1 = fun(), open_2 = fun(), open_3 = fun(),
  extrovert_1 = fun(), extrovert_2 = fun(), extrovert_3 = fun(),
  agreeable_1 = fun(), agreeable_2 = fun(), agreeable_3 = fun()
)

# One can list the scale names directly:
nice_na(df, scales = c("ID", "open", "extrovert", "agreeable"))
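The na, cells, and na_percent columns can also be checked by hand; a short sketch (assuming na_percent is expressed as a percentage of all cells):
# Hand computation of the overall missingness summary for airquality
na <- sum(is.na(airquality)) # total missing cells
cells <- prod(dim(airquality)) # rows x columns
na / cells * 100 # na_percent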
Easily make nice per-group density and QQ plots through a wrapper around the ggplot2 and qqplotr packages.
nice_normality( data, variable, group = NULL, colours, groups.labels, grid = TRUE, shapiro = FALSE, title = NULL, histogram = FALSE, breaks.auto = FALSE, ... )
data |
The data frame. |
variable |
The dependent variable to be plotted. |
group |
The group by which to plot the variable. |
colours |
Desired colours for the plot, if desired. |
groups.labels |
How to label the groups. |
grid |
Logical, whether to keep the default background grid or not. APA style suggests not using a grid in the background, though in this case some may find it useful to more easily estimate the slopes of the different groups. |
shapiro |
Logical, whether to include the p-value from the Shapiro-Wilk test on the plot. |
title |
An optional title, if desired. |
histogram |
Logical, whether to add a histogram on top of the density plot. |
breaks.auto |
If histogram = TRUE, then option to set bins/breaks automatically, mimicking the default behaviour of base R hist(). |
... |
Further arguments from |
A plot of classes patchwork and ggplot, containing two plots, resulting from nice_density and nice_qq.
Other functions useful in assumption testing: nice_assumptions, nice_density, nice_qq, nice_var, nice_varplot. Tutorial: https://rempsyc.remi-theriault.com/articles/assumptions
# Make the basic plot
nice_normality(
  data = iris,
  variable = "Sepal.Length",
  group = "Species"
)

# Further customization
nice_normality(
  data = iris,
  variable = "Sepal.Length",
  group = "Species",
  colours = c(
    "#00BA38",
    "#619CFF",
    "#F8766D"
  ),
  groups.labels = c(
    "(a) Setosa",
    "(b) Versicolor",
    "(c) Virginica"
  ),
  grid = FALSE,
  shapiro = TRUE
)
Easily make nice per-group QQ plots through a wrapper around the ggplot2 and qqplotr packages.
nice_qq( data, variable, group = NULL, colours, groups.labels = NULL, grid = TRUE, shapiro = FALSE, title = variable )
data |
The data frame. |
variable |
The dependent variable to be plotted. |
group |
The group by which to plot the variable. |
colours |
Desired colours for the plot, if desired. |
groups.labels |
How to label the groups. |
grid |
Logical, whether to keep the default background grid or not. APA style suggests not using a grid in the background, though in this case some may find it useful to more easily estimate the slopes of the different groups. |
shapiro |
Logical, whether to include the p-value from the Shapiro-Wilk test on the plot. |
title |
An optional title, if desired. |
A qq plot of class ggplot, by group (if provided), along a reference interpretation helper, the 95% confidence band.
Other functions useful in assumption testing: nice_assumptions, nice_density, nice_normality, nice_var, nice_varplot. Tutorial: https://rempsyc.remi-theriault.com/articles/assumptions
# Make the basic plot
nice_qq(
  data = iris,
  variable = "Sepal.Length",
  group = "Species"
)

# Further customization
nice_qq(
  data = iris,
  variable = "Sepal.Length",
  group = "Species",
  colours = c("#00BA38", "#619CFF", "#F8766D"),
  groups.labels = c("(a) Setosa", "(b) Versicolor", "(c) Virginica"),
  grid = FALSE,
  shapiro = TRUE,
  title = NULL
)
Randomize easily with different designs.
nice_randomize( design = "between", Ncondition = 3, n = 9, condition.names = c("a", "b", "c"), col.names = c("id", "Condition") )
design |
The design: either between-subject (different groups) or within-subject (repeated-measures on same people). |
Ncondition |
The number of conditions you want to randomize. |
n |
The desired sample size. Note that it needs to
be a multiple of your number of groups if you are using |
condition.names |
The names of the randomized conditions. |
col.names |
The desired additional column names for a runsheet. |
A dataframe, with participant ID and randomized condition, based on selected design.
Tutorial: https://rempsyc.remi-theriault.com/articles/randomize
# Specify design, number of conditions, number of
# participants, and names of conditions:
nice_randomize(
  design = "between",
  Ncondition = 4,
  n = 8,
  condition.names = c("BP", "CX", "PZ", "ZL")
)

# Within-Group Design
nice_randomize(
  design = "within",
  Ncondition = 4,
  n = 6,
  condition.names = c("SV", "AV", "ST", "AT")
)

# Make a quick runsheet
randomized <- nice_randomize(
  design = "within",
  Ncondition = 4,
  n = 128,
  condition.names = c("SV", "AV", "ST", "AT"),
  col.names = c(
    "id", "Condition", "Date/Time", "SONA ID",
    "Age/Gd.", "Handedness", "Tester", "Notes"
  )
)
head(randomized)
Easily recode scores (reverse-score), typically for questionnaire answers.
For the easystats equivalent, see datawizard::reverse().
nice_reverse(x, max, min = 1)
x |
The score to reverse. |
max |
The maximum score on the scale. |
min |
The minimum score on the scale (optional unless it isn't 1). |
A numeric vector, of reversed scores.
# Reverse score of 5 with a maximum score of 5
nice_reverse(5, 5)

# Reverse several scores at once
nice_reverse(1:5, 5)

# Reverse scores with maximum = 4 and minimum = 0
nice_reverse(1:4, 4, min = 0)

# Reverse scores with maximum = 3 and minimum = -3
nice_reverse(-3:3, 3, min = -3)
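The underlying arithmetic is simply max + min - x; a one-line sketch of the idea (not the package's internal code):
# Reverse-scoring arithmetic underlying nice_reverse()
reverse_sketch <- function(x, max, min = 1) max + min - x
reverse_sketch(1:5, 5) # 5 4 3 2 1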
Make nice scatter plots easily.
nice_scatter( data, predictor, response, xtitle = predictor, ytitle = response, has.points = TRUE, has.jitter = FALSE, alpha = 0.7, has.line = TRUE, method = "lm", has.confband = FALSE, has.fullrange = FALSE, has.linetype = FALSE, has.shape = FALSE, xmin, xmax, xby = 1, ymin, ymax, yby = 1, has.legend = FALSE, legend.title = "", group = NULL, colours = "#619CFF", groups.order = "none", groups.labels = NULL, groups.alpha = NULL, has.r = FALSE, r.x = Inf, r.y = -Inf, has.p = FALSE, p.x = Inf, p.y = -Inf )
data |
The data frame. |
predictor |
The independent variable to be plotted. |
response |
The dependent variable to be plotted. |
xtitle |
An optional x-axis label, if desired. |
ytitle |
An optional y-axis label, if desired. |
has.points |
Whether to plot the individual observations or not. |
has.jitter |
Alternative to has.points; adds jitter to the observations to reduce overlapping points. |
alpha |
The desired level of transparency. |
has.line |
Whether to plot the regression line(s). |
method |
Which method to use for the regression line, either "lm" (default) or "loess". |
has.confband |
Logical. Whether to display the confidence band around the slope. |
has.fullrange |
Logical. Whether to extend the slope beyond the range of observations. |
has.linetype |
Logical. Whether to change line types as a function of group. |
has.shape |
Logical. Whether to change shape of observations as a function of group. |
xmin |
The minimum score on the x-axis scale. |
xmax |
The maximum score on the x-axis scale. |
xby |
How much to increase on each "tick" on the x-axis scale. |
ymin |
The minimum score on the y-axis scale. |
ymax |
The maximum score on the y-axis scale. |
yby |
How much to increase on each "tick" on the y-axis scale. |
has.legend |
Logical. Whether to display the legend or not. |
legend.title |
The desired legend title. |
group |
The group by which to plot the variable |
colours |
Desired colours for the plot, if desired. |
groups.order |
Specifies the desired display order of the groups on the legend. Either provide the levels directly, or a string: "increasing" or "decreasing", to order based on the average value of the variable on the y axis, or "string.length", to order from the shortest to the longest string (useful when working with long string names). Defaults to "none". |
groups.labels |
Changes groups names (labels). Note: This applies after changing order of level. |
groups.alpha |
The manually specified transparency desired for the groups slopes. Use only when plotting groups separately. |
has.r |
Whether to display the correlation coefficient, the r-value. |
r.x |
The x-axis coordinates for the r-value. |
r.y |
The y-axis coordinates for the r-value. |
has.p |
Whether to display the p-value. |
p.x |
The x-axis coordinates for the p-value. |
p.y |
The y-axis coordinates for the p-value. |
A scatter plot of class ggplot.
Visualize group differences via violin plots: nice_violin. Tutorial: https://rempsyc.remi-theriault.com/articles/scatter
# Make the basic plot
nice_scatter(data = mtcars, predictor = "wt", response = "mpg")

# Save a high-resolution image file to specified directory
ggplot2::ggsave("nicescatterplothere.pdf",
  width = 7, height = 7, unit = "in", dpi = 300
) # change for your own desired path

# Change x- and y- axis labels
nice_scatter(
  data = mtcars, predictor = "wt", response = "mpg",
  ytitle = "Miles/(US) gallon", xtitle = "Weight (1000 lbs)"
)

# Have points "jittered", loess method
nice_scatter(
  data = mtcars, predictor = "wt", response = "mpg",
  has.jitter = TRUE, method = "loess"
)

# Change the transparency of the points
nice_scatter(data = mtcars, predictor = "wt", response = "mpg", alpha = 1)

# Remove points
nice_scatter(
  data = mtcars, predictor = "wt", response = "mpg",
  has.points = FALSE, has.jitter = FALSE
)

# Add confidence band
nice_scatter(
  data = mtcars, predictor = "wt", response = "mpg",
  has.confband = TRUE
)

# Set x- and y- scales manually
nice_scatter(
  data = mtcars, predictor = "wt", response = "mpg",
  xmin = 1, xmax = 6, xby = 1,
  ymin = 10, ymax = 35, yby = 5
)

# Change plot colour
nice_scatter(
  data = mtcars, predictor = "wt", response = "mpg",
  colours = "blueviolet"
)

# Add correlation coefficient to plot and p-value
nice_scatter(
  data = mtcars, predictor = "wt", response = "mpg",
  has.r = TRUE, has.p = TRUE
)

# Change location of correlation coefficient or p-value
nice_scatter(
  data = mtcars, predictor = "wt", response = "mpg",
  has.r = TRUE, r.x = 4, r.y = 25,
  has.p = TRUE, p.x = 5, p.y = 20
)

# Plot by group
nice_scatter(data = mtcars, predictor = "wt", response = "mpg", group = "cyl")

# Use full range on the slope/confidence band
nice_scatter(
  data = mtcars, predictor = "wt", response = "mpg",
  group = "cyl", has.fullrange = TRUE
)

# Remove lines
nice_scatter(
  data = mtcars, predictor = "wt", response = "mpg",
  group = "cyl", has.line = FALSE
)

# Change order of labels on the legend
nice_scatter(
  data = mtcars, predictor = "wt", response = "mpg",
  group = "cyl", groups.order = c(8, 4, 6)
)

# Change legend labels
nice_scatter(
  data = mtcars, predictor = "wt", response = "mpg",
  group = "cyl", groups.labels = c("Weak", "Average", "Powerful")
) # Warning: This applies after changing order of level

# Add a title to legend
nice_scatter(
  data = mtcars, predictor = "wt", response = "mpg",
  group = "cyl", legend.title = "cylinders"
)

# Plot by group + manually specify colours
nice_scatter(
  data = mtcars, predictor = "wt", response = "mpg",
  group = "cyl", colours = c("burlywood", "darkgoldenrod", "chocolate")
)

# Plot by group + use different line types for each group
nice_scatter(
  data = mtcars, predictor = "wt", response = "mpg",
  group = "cyl", has.linetype = TRUE
)

# Plot by group + use different point shapes for each group
nice_scatter(
  data = mtcars, predictor = "wt", response = "mpg",
  group = "cyl", has.shape = TRUE
)
Easily compute simple slopes in moderation analysis, with effect sizes, and format in publication-ready format.
nice_slopes( data, response, predictor, moderator, moderator2 = NULL, covariates = NULL, b.label = "b", standardize = TRUE, mod.id = TRUE, ci.alternative = "two.sided", ... )
data |
The data frame |
response |
The dependent variable. |
predictor |
The independent variable |
moderator |
The moderating variable. |
moderator2 |
The second moderating variable, if applicable. At this time, the second moderator variable can only be a binary variable of the form c(0, 1). |
covariates |
The desired covariates in the model. |
b.label |
What to rename the default "b" column (e.g., to capital B if using standardized data for it to be converted to the Greek beta symbol in the nice_table function). |
standardize |
Logical, whether to standardize the data before fitting the model. If TRUE, automatically sets b.label = "B". |
mod.id |
Logical. Whether to display the model number, when there is more than one model. |
ci.alternative |
Alternative for the confidence interval of the sr2. It can be either "two.sided" (the default in this package), "greater", or "less". |
... |
Further arguments to be passed to the effectsize::r2_semipartial function for the effect size. |
The effect size, sr2 (semi-partial correlation squared, also known as delta R2), is computed through effectsize::r2_semipartial. Please read the documentation for that function, especially regarding the interpretation of the confidence interval. In rempsyc, instead of using the default one-sided alternative ("greater"), we use the two-sided alternative.
To interpret the sr2, use effectsize::interpret_r2_semipartial().
For the easystats equivalent, use report::report() on the lm() model object.
A formatted dataframe of the simple slopes of the specified lm model, with DV, levels of IV, degrees of freedom, regression coefficient, t-value, p-value, and the effect size, the semi-partial correlation squared, and its confidence interval.
Checking for moderation before checking simple slopes: nice_mod, nice_lm, nice_lm_slopes. Tutorial: https://rempsyc.remi-theriault.com/articles/moderation
# Make the basic table
nice_slopes(
  data = mtcars,
  response = "mpg",
  predictor = "gear",
  moderator = "wt"
)

# Multiple dependent variables at once
nice_slopes(
  data = mtcars,
  response = c("mpg", "disp", "hp"),
  predictor = "gear",
  moderator = "wt"
)

# Add covariates
nice_slopes(
  data = mtcars,
  response = "mpg",
  predictor = "gear",
  moderator = "wt",
  covariates = c("am", "vs")
)

# Three-way interaction (continuous moderator and binary
# second moderator required)
x <- nice_slopes(
  data = mtcars,
  response = "mpg",
  predictor = "gear",
  moderator = "wt",
  moderator2 = "am"
)
x

# Get interpretations
cbind(x, Interpretation = effectsize::interpret_r2_semipartial(x$sr2))
Easily compute t-test analyses, with effect sizes, and format in publication-ready format. The 95% confidence interval is for the effect size, Cohen's d, both provided by the effectsize package.
nice_t_test( data, response, group = NULL, correction = "none", paired = FALSE, verbose = TRUE, ... )
data |
The data frame. |
response |
The dependent variable. |
group |
The group for the comparison. |
correction |
What correction for multiple comparison to apply, if any. Default is "none" and the only other option (for now) is "bonferroni". |
paired |
Whether to use a paired t-test. |
verbose |
Whether to display the Welch test warning or not. |
... |
Further arguments to be passed to the base R t.test() function. |
This function relies on the base R t.test() function, which uses the Welch t-test per default (see why here: https://daniellakens.blogspot.com/2015/01/always-use-welchs-t-test-instead-of.html). To use the Student t-test, simply add the following argument: var.equal = TRUE.
Note that for paired t tests, you need to use paired = TRUE, and you also need data in "long" format rather than wide format (like for the ToothGrowth data set). In this case, the group argument refers to the participant ID for example, so the same group/participant is measured several times, and thus has several rows. Note also that R >= 4.4.0 has stopped supporting the paired argument for the formula method used internally here.
For the easystats equivalent, use report::report() on the t.test() object.
A formatted dataframe of the specified model, with DV, degrees of freedom, t-value, p-value, the effect size, Cohen's d, and its 95% confidence interval lower and upper bounds.
Tutorial: https://rempsyc.remi-theriault.com/articles/t-test
# Make the basic table
nice_t_test(
  data = mtcars,
  response = "mpg",
  group = "am"
)

# Multiple dependent variables at once
nice_t_test(
  data = mtcars,
  response = names(mtcars)[1:7],
  group = "am"
)

# Can be passed some of the regular arguments
# of base [t.test()]

# Student t-test (instead of Welch)
nice_t_test(
  data = mtcars,
  response = "mpg",
  group = "am",
  var.equal = TRUE
)

# One-sided instead of two-sided
nice_t_test(
  data = mtcars,
  response = "mpg",
  group = "am",
  alternative = "less"
)

# One-sample t-test
nice_t_test(
  data = mtcars,
  response = "mpg",
  mu = 10
)

# Make sure cases appear in the same order for
# both levels of the grouping factor
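Following the note above, a paired comparison on long-format data might look like the sketch below (it assumes rows are ordered identically within each level of the grouping variable; ToothGrowth is used purely for illustration and is not truly paired data):
# Paired t-test sketch on long-format data (illustrative only)
nice_t_test(
  data = ToothGrowth,
  response = "len",
  group = "supp",
  paired = TRUE
)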
Make nice APA tables easily through a wrapper around the flextable package, with sensible defaults and automatic formatting features.
nice_table( data, highlight = FALSE, stars = TRUE, italics, col.format.p, col.format.r, col.format.ci, format.custom, col.format.custom, width = NULL, spacing = 2, broom = NULL, report = NULL, short = FALSE, title, note, separate.header )
data |
The data frame, to be converted to a flextable. The data frame cannot have duplicate column names. |
highlight |
Highlight rows with statistically significant results? Requires a column named "p" containing p-values. Can either accept logical (TRUE/FALSE) OR a numeric value for a custom critical p-value threshold (e.g., 0.10 or 0.001). |
stars |
Logical. Whether to add asterisks for significant p values. |
italics |
Which columns headers should be italic? Useful for column names that should be italic but that are not picked up automatically by the function. Select with numerical range, e.g., 1:3. |
col.format.p |
Applies p-value formatting to columns that cannot be named "p" (for example for a data frame full of p-values, also because it is not possible to have more than one column named "p"). Select with numerical range, e.g., 1:3. |
col.format.r |
Applies r-value formatting to columns that cannot be named "r" (for example for a data frame full of r-values, also because it is not possible to have more than one column named "r"). Select with numerical range, e.g., 1:3. |
col.format.ci |
Applies 95% confidence interval formatting to selected columns (e.g., when reporting more than one interval). |
format.custom |
Applies custom formatting to columns selected via the col.format.custom argument. |
col.format.custom |
Which columns to apply the custom function to. Select with numerical range, e.g., 1:3. |
width |
Width of the table, as a proportion of the total width, when exported (e.g., to Word). For full width, use 1. |
spacing |
Spacing of the rows (1 = single space, 2 = double space) |
broom |
If providing a tidy table produced with the broom package, specify the model type here. |
report |
If providing an object produced with the report package, specify the model type here. |
short |
Logical. Whether to return an abbreviated version of the tables made by the report package. |
title |
Optional, to add a table header, if desired. |
note |
Optional, to add one or more table footnotes (APA note), if desired. |
separate.header |
Logical, whether to separate headers based on name delimiters (i.e., periods "."). |
The resulting flextable objects can be opened in Word with print(table, preview = "docx"), or saved to Word with the flextable::save_as_docx() function.
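For instance (a minimal sketch of that workflow, using a temporary file path):

tab <- nice_table(head(mtcars[, 1:4]))
print(tab, preview = "docx") # opens a Word preview of the table
flextable::save_as_docx(tab, path = tempfile(fileext = ".docx")) # or save it directly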
An APA-formatted table of class "flextable"
Tutorial: https://rempsyc.remi-theriault.com/articles/table
# Make the basic table
my_table <- nice_table(
  mtcars[1:3, ],
  title = c("Table 1", "Motor Trend Car Road Tests"),
  note = c(
    "The data was extracted from the 1974 Motor Trend US magazine.",
    "* p < .05, ** p < .01, *** p < .001"
  )
)
my_table

# Save table to Word
mypath <- tempfile(fileext = ".docx")
flextable::save_as_docx(my_table, path = mypath)

# Publication-ready tables
mtcars.std <- lapply(mtcars, scale)
model <- lm(mpg ~ cyl + wt * hp, mtcars.std)
stats.table <- as.data.frame(summary(model)$coefficients)
CI <- confint(model)
stats.table <- cbind(row.names(stats.table), stats.table, CI)
names(stats.table) <- c("Term", "B", "SE", "t", "p", "CI_lower", "CI_upper")
nice_table(stats.table, highlight = TRUE)

# Test different column names
test <- head(mtcars)
names(test) <- c("dR", "N", "M", "SD", "b", "np2", "ges", "p", "r", "R2", "sr2")
test[, 10:11] <- test[, 10:11] / 10
nice_table(test)

# Custom cell formatting (such as p or r)
nice_table(test[8:11], col.format.p = 2:4, highlight = .001)
nice_table(test[8:11], col.format.r = 1:4)

# Apply custom functions to cells
fun <- function(x) {
  x + 11.1
}
nice_table(test[8:11], col.format.custom = 2:4, format.custom = "fun")

fun <- function(x) {
  paste("x", x)
}
nice_table(test[8:11], col.format.custom = 2:4, format.custom = "fun")

# Separate headers based on periods
header.data <- structure(
  list(
    Variable = c("Sepal.Length", "Sepal.Width", "Petal.Length"),
    setosa.M = c(5.01, 3.43, 1.46),
    setosa.SD = c(0.35, 0.38, 0.17),
    versicolor.M = c(5.94, 2.77, 4.26),
    versicolor.SD = c(0.52, 0.31, 0.47)
  ),
  row.names = c(NA, -3L),
  class = "data.frame"
)
nice_table(header.data,
  separate.header = TRUE,
  italics = 2:4
)
Obtain the variance per group, and check the rule of thumb that no group should have a variance more than four times larger than that of any other group. The variance ratio is calculated as max / min.
nice_var(data, variable, group, criteria = 4)
data |
The data frame |
variable |
The dependent variable to be plotted. |
group |
The group by which to plot the variable. |
criteria |
Desired threshold if one wants something different than four times the variance. |
A dataframe with the variance of the selected variable(s) for each group, the maximum variance ratio (maximum variance divided by the minimum variance), the selected decision criterion, and whether the data are considered heteroscedastic according to that criterion.
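The ratio itself can be verified by hand; for example (a sketch using base R, not the package's internal code):

# Variance of Sepal.Length per Species, and the max/min variance ratio
vars <- tapply(iris$Sepal.Length, iris$Species, var)
max(vars) / min(vars) # compared against the criterion of 4 by nice_var()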
Other functions useful in assumption testing: nice_assumptions, nice_density, nice_normality, nice_qq, nice_varplot. Tutorial: https://rempsyc.remi-theriault.com/articles/assumptions
# Make the basic table
nice_var(
  data = iris,
  variable = "Sepal.Length",
  group = "Species"
)

# Try on multiple variables
nice_var(
  data = iris,
  variable = names(iris[1:4]),
  group = "Species"
)
Attempt to visualize variance per group.
nice_varplot( data, variable, group, colours, groups.labels, grid = TRUE, shapiro = FALSE, ytitle = variable )
data |
The data frame |
variable |
The dependent variable to be plotted. |
group |
The group by which to plot the variable. |
colours |
Desired colours for the plot, if desired. |
groups.labels |
How to label the groups. |
grid |
Logical, whether to keep the default background grid or not. APA style suggests not using a grid in the background, though in this case some may find it useful to more easily estimate the slopes of the different groups. |
shapiro |
Logical, whether to include the p-value from the Shapiro-Wilk test on the plot. |
ytitle |
An optional y-axis label, if desired. |
A scatter plot of class ggplot attempting to display the group variances. Also includes the max variance ratio (maximum variance divided by the minimum variance).
Other functions useful in assumption testing: nice_assumptions, nice_density, nice_normality, nice_qq, nice_var. Tutorial: https://rempsyc.remi-theriault.com/articles/assumptions
# Make the basic plot
nice_varplot(
  data = iris,
  variable = "Sepal.Length",
  group = "Species"
)

# Further customization
nice_varplot(
  data = iris,
  variable = "Sepal.Length",
  group = "Species",
  colours = c("#00BA38", "#619CFF", "#F8766D"),
  ytitle = "Sepal Length",
  groups.labels = c("(a) Setosa", "(b) Versicolor", "(c) Virginica")
)
Make nice violin plots easily with 95% (possibly bootstrapped) confidence intervals.
nice_violin(
  data,
  response,
  group = NULL,
  boot = FALSE,
  bootstraps = 2000,
  colours,
  xlabels = NULL,
  ytitle = response,
  xtitle = NULL,
  has.ylabels = TRUE,
  has.xlabels = TRUE,
  comp1 = 1,
  comp2 = 2,
  signif_annotation = NULL,
  signif_yposition = NULL,
  signif_xmin = NULL,
  signif_xmax = NULL,
  ymin,
  ymax,
  yby = 1,
  CIcap.width = 0.1,
  obs = FALSE,
  alpha = 1,
  border.colour = "black",
  border.size = 2,
  has.d = FALSE,
  d.x = mean(c(comp1, comp2)) * 1.1,
  d.y = mean(data[[response]]) * 1.3,
  groups.order = "none",
  xlabels.angle = 0
)
data |
The data frame. |
response |
The dependent variable to be plotted. |
group |
The group by which to plot the variable. |
boot |
Logical, whether to use bootstrapping for the confidence interval or not. |
bootstraps |
How many bootstraps to use. |
colours |
Desired colours for the plot, if desired. |
xlabels |
The individual group labels on the x-axis. |
ytitle |
An optional y-axis label, if desired. |
xtitle |
An optional x-axis label, if desired. |
has.ylabels |
Logical, whether the y-axis should have labels or not. |
has.xlabels |
Logical, whether the x-axis should have labels or not. |
comp1 |
The first unit of a pairwise comparison, if the goal is to compare two groups. Automatically displays "*", "**", or "***" depending on the significance of the difference. Can take either a numeric value (based on the group number) or the name of the group directly. Must be provided along with argument comp2. |
comp2 |
The second unit of a pairwise comparison, if the goal is to compare two groups. Automatically displays "*", "**", or "***" depending on the significance of the difference. Can take either a numeric value (based on the group number) or the name of the group directly. Must be provided along with argument comp1. |
signif_annotation |
Manually provide the required annotations/numbers of stars (as character strings). Useful if the automatic pairwise comparison annotation does not work as expected, or if one wants more than one pairwise comparison. Must be provided along with arguments signif_yposition, signif_xmin, and signif_xmax. |
signif_yposition |
Manually provide the vertical position of the annotations/stars, based on the y-scale. |
signif_xmin |
Manually provide the first part of the horizontal position of the annotations/stars (start of the left-sided bracket), based on the x-scale. |
signif_xmax |
Manually provide the second part of the horizontal position of the annotations/stars (end of the right-sided bracket), based on the x-scale. |
ymin |
The minimum score on the y-axis scale. |
ymax |
The maximum score on the y-axis scale. |
yby |
How much to increase on each "tick" on the y-axis scale. |
CIcap.width |
The width of the confidence interval cap. |
obs |
Logical, whether to plot individual observations or not. The type of plotting can also be specified, either "dotplot" or "jitter". |
alpha |
The transparency of the plot. |
border.colour |
The colour of the violins' borders. |
border.size |
The size of the violins' borders. |
has.d |
Whether to display the d-value. |
d.x |
The x-axis coordinates for the d-value. |
d.y |
The y-axis coordinates for the d-value. |
groups.order |
How to order the group factor levels on the x-axis. Either "increasing" or "decreasing", to order based on the value of the variable on the y-axis, or "string.length", to order from the shortest to the longest string (useful when working with long string names). Defaults to "none". |
xlabels.angle |
How much to tilt the labels of the x-axis. Useful when working with long string names. Defaults to 0. |
Setting boot = TRUE applies bootstrapping (to the confidence intervals only) with the BCa method, via the rcompanion_groupwiseMean function. For the easystats equivalent, see see::geom_violindot().
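For example (a sketch; a reduced number of bootstraps is used here only to keep the example fast):

# Bootstrapped (BCa) confidence intervals instead of parametric ones
nice_violin(
  data = ToothGrowth,
  group = "dose",
  response = "len",
  boot = TRUE,
  bootstraps = 500
)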
A violin plot of class ggplot, by group.
Visualize group differences via scatter plots:
nice_scatter
. Tutorial:
https://rempsyc.remi-theriault.com/articles/violin
# Make the basic plot
nice_violin(
  data = ToothGrowth,
  response = "len"
)

# Save a high-resolution image file to specified directory
ggplot2::ggsave("niceviolinplothere.pdf",
  width = 7, height = 7,
  unit = "in", dpi = 300
) # change for your own desired path

# Change x- and y-axes labels
nice_violin(
  data = ToothGrowth,
  group = "dose",
  response = "len",
  ytitle = "Length of Tooth",
  xtitle = "Vitamin C Dosage"
)

# See difference between two groups
nice_violin(
  data = ToothGrowth,
  group = "dose",
  response = "len",
  comp1 = "0.5",
  comp2 = "2"
)
nice_violin(
  data = ToothGrowth,
  group = "dose",
  response = "len",
  comp1 = 2,
  comp2 = 3
)

# Compare all three groups
nice_violin(
  data = ToothGrowth,
  group = "dose",
  response = "len",
  signif_annotation = c("*", "**", "***"), # manually enter the number of stars
  signif_yposition = c(30, 35, 40), # What height (y) should the stars appear?
  signif_xmin = c(1, 2, 1), # Where should the left-sided brackets start (x)?
  signif_xmax = c(2, 3, 3) # Where should the right-sided brackets end (x)?
)

# Set the colours manually
nice_violin(
  data = ToothGrowth,
  group = "dose",
  response = "len",
  colours = c("darkseagreen", "cadetblue", "darkslateblue")
)

# Changing the names of the x-axis labels
nice_violin(
  data = ToothGrowth,
  group = "dose",
  response = "len",
  xlabels = c("Low", "Medium", "High")
)

# Removing the x-axis or y-axis titles
nice_violin(
  data = ToothGrowth,
  group = "dose",
  response = "len",
  ytitle = NULL,
  xtitle = NULL
)

# Removing the x-axis or y-axis labels (for whatever purpose)
nice_violin(
  data = ToothGrowth,
  group = "dose",
  response = "len",
  has.ylabels = FALSE,
  has.xlabels = FALSE
)

# Set y-scale manually
nice_violin(
  data = ToothGrowth,
  group = "dose",
  response = "len",
  ymin = 5,
  ymax = 35,
  yby = 5
)

# Plotting individual observations
nice_violin(
  data = ToothGrowth,
  group = "dose",
  response = "len",
  obs = TRUE
)

# Micro-customizations
nice_violin(
  data = ToothGrowth,
  group = "dose",
  response = "len",
  CIcap.width = 0,
  alpha = .70,
  border.size = 1,
  border.colour = "white",
  comp1 = 1,
  comp2 = 2,
  has.d = TRUE
)
Interpolate the Inclusion of Other in the Self Scale (IOS; self-other merging) easily. The user provides the IOS score, from 1 to 7, and the function returns the percentage of actual area overlap between the two circles (i.e., not linear overlap), so it is possible to say, e.g., that experimental group 1 had an average overlap of X% with the other person, whereas experimental group 2 had an average overlap of Y% with the other person.
overlap_circle(response, categories = c("Self", "Other"), scoring = "IOS")
response |
The variable to plot: requires IOS scores ranging from 1 to 7 (when scoring = "IOS", the default). |
categories |
The desired category names of the two overlapping circles for display on the plot. |
scoring |
The scoring method; defaults to "IOS". |
The circles are generated through the VennDiagram::draw.pairwise.venn() function, and the desired percentage overlap is passed to its cross.area argument ("The size of the intersection between the sets"). The percentage overlap values are interpolated from this reference grid: Score of 1 = 0%, 2 = 10%, 3 = 20%, 4 = 30%, 5 = 55%, 6 = 65%, 7 = 85%.
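As an illustration of that interpolation (a sketch with stats::approx(), not necessarily the package's internal code):

ios_scores <- 1:7
overlap_pct <- c(0, 10, 20, 30, 55, 65, 85)
approx(ios_scores, overlap_pct, xout = 3.5)$y # 25, matching the example below
approx(ios_scores, overlap_pct, xout = 6.84)$y # 81.8, matching the example below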
A plot of class gList, displaying overlapping circles relative to the selected score.
Tutorial: https://rempsyc.remi-theriault.com/articles/circles
For a JavaScript web plugin implementing a continuous version of the Inclusion of Other in the Self (IOS) task (rather than the pen-and-paper version), intended for data collection during experiments rather than for data analysis, please see: https://github.com/jspsych/jspsych-contrib/tree/main/packages/plugin-ios
# Score of 1 (0% overlap)
overlap_circle(1)

# Score of 3.5 (25% overlap)
overlap_circle(3.5)

# Score of 6.84 (81.8% overlap)
overlap_circle(6.84)

# Changing labels
overlap_circle(3.12, categories = c("Humans", "Animals"))

# Saving to file (PDF or PNG)
plot <- overlap_circle(3.5)
ggplot2::ggsave(plot,
  file = tempfile(fileext = ".pdf"), width = 7,
  height = 7, unit = "in", dpi = 300
) # Change for your own desired path
Make nice scatter plots over multiple times (T1, T2, T3) easily.
plot_means_over_time(
  data,
  response,
  group,
  groups.order = "none",
  error_bars = TRUE,
  ytitle = NULL,
  legend.title = "",
  significance_stars,
  significance_stars_x,
  significance_stars_y,
  significance_bars_x,
  print_table = FALSE,
  verbose = FALSE
)
data |
The data frame. |
response |
The dependent variables to be plotted over time, one column per time point (e.g., c("T1_var", "T2_var", "T3_var")). |
group |
The group by which to plot the variable |
groups.order |
Specifies the desired display order of the groups in the legend. Either provide the levels directly, or a string: "increasing" or "decreasing", to order based on the average value of the variable on the y-axis, or "string.length", to order from the shortest to the longest string (useful when working with long string names). Defaults to "none". |
error_bars |
Logical, whether to include 95% confidence intervals for means. |
ytitle |
An optional y-axis label, if desired. |
legend.title |
The desired legend title. |
significance_stars |
Vector of significance stars to display on the plot (e.g., c("*", "***")). |
significance_stars_x |
Vector of where on the x-axis significance stars should appear on the plot (e.g., c(3.25, 4.5)). |
significance_stars_y |
Vector of where on the y-axis significance stars should appear on the plot. The logic here is different than for the previous arguments. Rather than providing actual coordinates, provide a list object with structure group 1, group 2, and time of the comparison, e.g., list(c("4", "8", time = 3), c("4", "8", time = 4)). |
significance_bars_x |
Vector of where on the x-axis vertical significance bars should appear on the plot (e.g., c(3.15, 4.15)). |
print_table |
Logical, whether to also print the computed table. |
verbose |
Logical, whether to also print a note regarding the meaning of the error bars. |
Error bars are calculated using the method of Morey (2008) through Rmisc::summarySEwithin(), but raw means are plotted instead of the normed means. For more information, visit: http://www.cookbook-r.com/Graphs/Plotting_means_and_error_bars_(ggplot2).
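A minimal sketch of that computation on a toy long-format data set (assuming the Rmisc package is installed; column names here are made up for illustration):

set.seed(42)
long <- data.frame(
  id = rep(1:10, times = 3),
  time = rep(c("T1", "T2", "T3"), each = 10),
  score = rnorm(30)
)
# Within-subject standard errors and 95% CIs, following Morey (2008)
Rmisc::summarySEwithin(long, measurevar = "score", withinvars = "time", idvar = "id")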
A scatter plot of class ggplot.
Morey, R. D. (2008). Confidence intervals from normalized data: A correction to Cousineau (2005). Tutorials in Quantitative Methods for Psychology, 4(2), 61-64. doi:10.20982/tqmp.04.2.p061
data <- mtcars
names(data)[6:3] <- paste0("T", 1:4, "_var")

plot_means_over_time(
  data = data,
  response = names(data)[6:3],
  group = "cyl",
  groups.order = "decreasing"
)

# Add significance stars/bars
plot_means_over_time(
  data = data,
  response = names(data)[6:3],
  group = "cyl",
  significance_bars_x = c(3.15, 4.15),
  significance_stars = c("*", "***"),
  significance_stars_x = c(3.25, 4.5),
  # significance_stars_y: List with structure: list(c("group1", "group2", time))
  significance_stars_y = list(
    c("4", "8", time = 3),
    c("4", "8", time = 4)
  )
)
Easily and visually check outliers through a dot plot with accompanying reference lines at +/- 3 MAD or SD. When providing a group, data are group-mean centered and standardized (based on MAD or SD); if no group is provided, data are simply standardized.
plot_outliers(
  data,
  group = NULL,
  response,
  method = "mad",
  criteria = 3,
  colours,
  xlabels = NULL,
  ytitle = NULL,
  xtitle = NULL,
  has.ylabels = TRUE,
  has.xlabels = TRUE,
  ymin,
  ymax,
  yby = 1,
  ...
)
data |
The data frame. |
group |
The group by which to plot the variable. |
response |
The dependent variable to be plotted. |
method |
Method used to identify outliers: either median absolute deviations ("mad", the default) or standard deviations ("sd"). |
criteria |
How many MADs (or standard deviations) to use as threshold (default is 3). |
colours |
Desired colours for the plot, if desired. |
xlabels |
The individual group labels on the x-axis. |
ytitle |
An optional y-axis label, if desired. |
xtitle |
An optional x-axis label, if desired. |
has.ylabels |
Logical, whether the y-axis should have labels or not. |
has.xlabels |
Logical, whether the x-axis should have labels or not. |
ymin |
The minimum score on the y-axis scale. |
ymax |
The maximum score on the y-axis scale. |
yby |
How much to increase on each "tick" on the y-axis scale. |
... |
Other arguments passed to ggplot2::geom_dotplot. |
A dot plot of class ggplot, by group.
Other functions useful in assumption testing: Tutorial: https://rempsyc.remi-theriault.com/articles/assumptions
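As a rough sketch of the group-wise standardization described above (here with the "sd" method; not necessarily the exact internal computation):

z <- ave(airquality$Ozone, airquality$Month,
  FUN = function(v) (v - mean(v, na.rm = TRUE)) / sd(v, na.rm = TRUE)
)
table(abs(z) > 3) # observations beyond the +/- 3 reference lines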
# Make the basic plot
plot_outliers(
  airquality,
  group = "Month",
  response = "Ozone"
)
plot_outliers(
  airquality,
  response = "Ozone",
  method = "sd"
)
Scale and center ("standardize") data based on the median absolute deviation (MAD).
scale_mad(x)
x |
The vector to be scaled. |
The function subtracts the median from each observation, and then divides the outcome by the MAD. This is analogous to regular standardization, which subtracts the mean from each observation and then divides the outcome by the standard deviation.
For the easystats equivalent, use: datawizard::standardize(x, robust = TRUE).
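A hand-rolled equivalent (a sketch; stats::mad() applies its usual consistency constant, which the package function may or may not share):

x <- mtcars$mpg
(x - median(x)) / mad(x) # median-centered, MAD-scaled scores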
A numeric vector of standardized data.
Leys, C., Ley, C., Klein, O., Bernard, P., & Licata, L. (2013). Detecting outliers: Do not use standard deviation around the mean, use absolute deviation around the median. Journal of Experimental Social Psychology, 49(4), 764–766. https://doi.org/10.1016/j.jesp.2013.03.013
scale_mad(mtcars$mpg)
Winsorize data (bring extreme observations back within a threshold, usually +/- 3 deviations) based on median absolute deviations instead of standard deviations.
winsorize_mad(x, criteria = 3)
x |
The vector to be winsorized based on the MAD. |
criteria |
How many MADs to use as the threshold (analogous to the number of standard deviations). |
For the easystats equivalent, use: datawizard::winsorize(x, method = "zscore", threshold = 3, robust = TRUE).
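A hand-rolled sketch of the same idea (values beyond median +/- criteria * MAD are brought back to that boundary; the package function may differ in details):

x <- mtcars$qsec
criteria <- 3
bounds <- median(x) + c(-1, 1) * criteria * mad(x)
pmin(pmax(x, bounds[1]), bounds[2])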
A numeric vector of winsorized data.
Leys, C., Ley, C., Klein, O., Bernard, P., & Licata, L. (2013). Detecting outliers: Do not use standard deviation around the mean, use absolute deviation around the median. Journal of Experimental Social Psychology, 49(4), 764–766. https://doi.org/10.1016/j.jesp.2013.03.013
winsorize_mad(mtcars$qsec, criteria = 2)