Help for package saros

Type:

Package

Title:

Semi-Automatic Reporting of Ordinary Surveys

Version:

1.6.0

Maintainer:

Stephan Daus <stephus.daus@gmail.com>

Description:

Offers a systematic way for conditional reporting of figures and tables for many (and bivariate combinations of) variables, typically from survey data. Contains interactive 'ggiraph'-based (https://CRAN.R-project.org/package=ggiraph) plotting functions and data frame-based summary tables (bivariate significance tests, frequencies/proportions, unique open-ended responses, etc.) with many arguments for customization, and extensions possible. Uses a global options() system for neatly reducing redundant code. Also contains tools for immediate saving of objects and returning a hashed link to the object, useful for creating download links to high resolution images upon rendering in 'Quarto'. Suitable for highly customized reports, primarily intended for survey research.

Note:

Free to use for non-Norwegian institutions, otherwise see LICENSE.

License:

MIT + file LICENSE

URL:

https://nifu-no.github.io/saros/, https://github.com/NIFU-NO/saros

BugReports:

https://github.com/NIFU-NO/saros/issues

Depends:

R (≥ 4.2.0)

Imports:

cli, dplyr, forcats, fs, ggiraph, ggplot2, glue, grDevices, lifecycle, mschart, officer, rlang, stringi, stats, tidyr, tidyselect, utils, vctrs

Suggests:

covr, haven, labelled, quarto, knitr, readr, scales, spelling, srvyr, survey, testthat (≥ 3.0.0), tibble, vdiffr, withr, writexl, readxl

Config/testthat/edition:

Encoding:

UTF-8

RoxygenNote:

7.3.3

Language:

en-US

VignetteBuilder:

quarto

Config/Needs/website:

rmarkdown

Config/testthat/parallel:

true

LazyData:

true

NeedsCompilation:

Packaged:

2025-11-10 11:32:55 UTC; py128

Author:

Stephan Daus [aut, cre, cph], Julia Silge [ctb] (Author of internal scale_x_reordered), David Robinson [ctb] (Author of internal scale_x_reordered), Nordic Institute for The Studies of Innovation, Research and Education (NIFU) [fnd], Kristiania University College [fnd]

Repository:

CRAN

Date/Publication:

2025-11-10 12:00:02 UTC

saros: Semi-Automatic Reporting of Ordinary Surveys

Description

Author(s)

Maintainer: Stephan Daus stephus.daus@gmail.com (ORCID) [copyright holder]

Other contributors:

Julia Silge (Author of internal scale_x_reordered) [contributor]
David Robinson (Author of internal scale_x_reordered) [contributor]
Nordic Institute for The Studies of Innovation, Research and Education (NIFU) [funder]
Kristiania University College [funder]

Add response category ordering (only useful for long format cat-cat tables)

Description

Add response category ordering (only useful for long format cat-cat tables)

Usage

add_category_order(data, sort_by = NULL)

Arguments

data

Dataset

sort_by

Sorting method for response categories

Value

Dataset with .category_order column added

Add dependent variable ordering

Description

Add dependent variable ordering

Usage

add_dep_order(data, sort_by, descend = FALSE)

Arguments

data

Dataset

sort_by

Sorting method for dependent variables

descend

Whether to reverse the order

Value

Dataset with .dep_order column added

Add independent variable category ordering

Description

Add independent variable category ordering

Usage

add_indep_order(data, sort_by = ".factor_order", descend = FALSE)

Arguments

data

Dataset

sort_by

Sorting method for independent categories (NULL = no sorting)

descend

Whether to reverse the order

Value

Dataset with .indep_order column added

Create sorting order variables for output dataframe

Description

Applies a comprehensive sorting order to survey data. This module provides centralized sorting functionality to ensure consistent ordering across all output types (tables, plots) by using explicit order columns instead of relying on factor levels, which can be overridden.

Usage

add_sorting_order_vars(
  data,
  sort_dep_by = ".variable_position",
  sort_indep_by = ".factor_order",
  sort_category_by = NULL,
  descend = FALSE,
  descend_indep = FALSE
)

Arguments

data

Dataset with survey results

sort_dep_by

How to sort dependent variables

sort_indep_by

How to sort independent variable categories

sort_category_by

How to sort response categories

descend

Whether to reverse the dependent variable order

descend_indep

Whether to reverse the independent variable category order

Value

Dataset with added order columns: .dep_order, .indep_order, .category_order
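
The idea of explicit order columns can be illustrated with a minimal base R sketch. The data frame and the .proportion column below are hypothetical, and the ranking rule is an assumption chosen for illustration; this is not the internal implementation of add_sorting_order_vars().

```r
# Hypothetical summary data; the .dep_order name mirrors the column described
# above, but the ranking rule here is only an illustrative assumption.
df <- data.frame(
  .variable_name = c("b_3", "b_1", "b_2"),
  .proportion = c(0.40, 0.25, 0.35),
  stringsAsFactors = FALSE
)

# Store the intended order explicitly instead of encoding it in factor levels.
df$.dep_order <- rank(-df$.proportion, ties.method = "first")

# Any later consumer (table or plot) can arrange by the same column.
df_sorted <- df[order(df$.dep_order), ]
df_sorted$.variable_name
```

Because the order lives in its own column, a later step that re-creates or relabels factors cannot silently change it.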

Apply final arrangement based on order columns

Description

Apply final arrangement based on order columns

Usage

apply_final_arrangement(data)

Arguments

data

Dataset with order columns

Value

Arranged dataset

Apply label wrapping based on plot layout

Description

Helper function to consistently wrap variable labels based on whether they appear on facet strips or x-axis, and whether inverse layout is used.

Usage

apply_label_wrapping(
  data,
  indep_length,
  inverse,
  strip_width,
  x_axis_label_width
)

Arguments

data

Data frame containing .variable_label column

indep_length

Number of independent variables (0 or 1)

inverse

Logical, whether inverse layout is used

strip_width

Width for facet strip labels

x_axis_label_width

Width for x-axis labels

Value

Data frame with wrapped .variable_label column

Apply legacy sorting adjustments for special cases

Description

Apply legacy sorting adjustments for special cases

Usage

apply_legacy_sorting_adjustments(
  data,
  indep_names = character(0),
  translations = list()
)

Arguments

data

Dataset

indep_names

Independent variable names

translations

Translation strings

Value

Dataset with legacy adjustments applied

Arrange output data by prespecified orders

Description

Standard data arrangement for table functions

Usage

arrange_table_data(data, col_basis, indep_vars = NULL)

Arguments

data

Data frame to arrange

col_basis

Column to use as primary sort

indep_vars

Independent variable columns

Value

Arranged data frame

Apply sorting with optional descending order

Description

Unified helper to consistently handle ascending/descending sort order across all sorting functions.

Usage

arrange_with_order(data, order_col, descend = FALSE)

Arguments

data

Dataset to arrange

order_col

Symbol/name of the column to sort by

descend

Whether to sort in descending order

Value

Arranged dataset
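
A rough base R sketch of what such a helper might look like is given below. The function name arrange_with_order_sketch is hypothetical, and unlike the documented function it takes the column name as a string rather than a symbol.

```r
# Hypothetical stand-in for the helper described above (base R only).
# Assumption: order_col is passed as a character string.
arrange_with_order_sketch <- function(data, order_col, descend = FALSE) {
  idx <- order(data[[order_col]], decreasing = descend)
  data[idx, , drop = FALSE]
}

df <- data.frame(item = c("a", "b", "c"), .dep_order = c(2, 3, 1))

asc <- arrange_with_order_sketch(df, ".dep_order")$item
desc <- arrange_with_order_sketch(df, ".dep_order", descend = TRUE)$item
```

Routing every sort through one helper keeps the meaning of descend identical wherever it is used.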

Calculate ordering based on a specific category value

Description

Calculate ordering based on a specific category value

Usage

calculate_category_order(data, category_value, descend = FALSE)

Arguments

data

Dataset with .category and .count columns

category_value

The category value to sort by (e.g., "A bit")

descend

Logical indicating if sorting should be descending

Value

Numeric vector of ordering values

Calculate ordering based on a specific column value

Description

Calculate ordering based on a specific column value

Usage

calculate_column_order(data, column_name, descend = FALSE)

Arguments

data

Dataset

column_name

Name of the column to sort by

descend

Logical indicating if sorting should be descending

Value

Numeric vector of ordering values

Calculate independent variable ordering based on a specific category value

Description

Calculate independent variable ordering based on a specific category value

Usage

calculate_indep_category_order(
  data,
  category_value,
  indep_col,
  descend_indep = FALSE
)

Arguments

data

Dataset with independent variable columns

category_value

The category value to sort by (e.g., "Not at all")

indep_col

Name of the independent variable column

descend_indep

Logical indicating if sorting should be descending

Value

Numeric vector of ordering values

Calculate independent variable ordering based on a specific column value

Description

Calculate independent variable ordering based on a specific column value

Usage

calculate_indep_column_order(
  data,
  column_name,
  indep_col,
  descend_indep = FALSE
)

Arguments

data

Dataset

column_name

Name of the column to sort by

indep_col

Name of the independent variable column

descend_indep

Logical indicating if sorting should be descending

Value

Numeric vector of ordering values

Calculate independent variable ordering based on position categories

Description

Calculate independent variable ordering based on position categories

Usage

calculate_indep_proportion_order(
  data,
  method,
  indep_col,
  descend_indep = FALSE
)

Arguments

data

Dataset

method

Either ".upper", ".top", etc.

indep_col

Name of the independent variable column

descend_indep

Logical indicating if sorting should be descending

Value

Numeric vector of ordering values

Calculate independent variable ordering based on multiple category values

Description

Calculate independent variable ordering based on multiple category values

Usage

calculate_indep_sum_value_order(
  data,
  category_values,
  indep_col,
  descend_indep = FALSE
)

Arguments

data

Dataset

category_values

Vector of category values to sum

indep_col

Name of the independent variable column

descend_indep

Logical indicating if sorting should be descending

Value

Numeric vector of ordering values

Calculate ordering based on multiple category values

Description

Calculate ordering based on multiple category values

Usage

calculate_multiple_category_order(data, category_values, descend = FALSE)

Arguments

data

Dataset with .category and .count columns

category_values

Vector of category values to sum (e.g., c("A bit", "A lot"))

descend

Logical indicating if sorting should be descending

Value

Numeric vector of ordering values
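
To make the "sum over several categories" idea concrete, here is a minimal base R sketch. The data are invented, and returning one rank per variable is an assumption; the actual function may return one value per row.

```r
# Hypothetical long-format summary with the .category and .count columns
# mentioned above.
df <- data.frame(
  .variable_name = rep(c("b_1", "b_2"), each = 3),
  .category = rep(c("Not at all", "A bit", "A lot"), times = 2),
  .count = c(50, 30, 20, 20, 30, 50),
  stringsAsFactors = FALSE
)

category_values <- c("A bit", "A lot")

# Sum the counts of the selected categories within each variable.
keep <- df$.category %in% category_values
sums <- vapply(
  split(df$.count[keep], df$.variable_name[keep]),
  sum,
  numeric(1)
)

# Turn the sums into ordering values (1 = smallest combined count).
order_values <- rank(sums, ties.method = "first")
```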

Calculate proportion-based ordering for dependent variables

Description

Calculate proportion-based ordering for dependent variables

Usage

calculate_proportion_order(data, method, descend = FALSE)

Arguments

data

Dataset

method

Either ".upper", ".top", etc.

descend

Logical indicating if sorting should be descending

Value

Numeric vector of ordering values

Calculate ordering based on .sum_value (for category-based sorting)

Description

Calculate ordering based on .sum_value (for category-based sorting)

Usage

calculate_sum_value_order(data, descend = FALSE)

Arguments

data

Dataset with .sum_value column

descend

Logical indicating if sorting should be descending

Value

Numeric vector of ordering values

Convert List of Plots to Quarto Tabset

Description

Creates a Quarto tabset from a named list of ggplot2 objects, typically generated by makeme() with crowd parameter. Each plot becomes a tab with automatic height calculation and optional download links.

Usage

crowd_plots_as_tabset(
  plot_list,
  plot_type = c("cat_plot_html", "int_plot_html", "auto"),
  save = FALSE,
  fig_height = NULL,
  fig_height_int_default = 6
)

Arguments

plot_list

A named list of ggplot2 objects. Names become tab labels. Typically created with makeme(crowd = c("target", "others")).

plot_type

Character. Type of plots in the list. One of:

"cat_plot_html" (default): Categorical horizontal bar charts
"int_plot_html": Interval plots (violin/box plots)
"auto": Auto-detect from first non-NULL plot's data structure

save

Logical. If TRUE, generates download links for plot data and images via get_fig_title_suffix_from_ggplot(). Defaults to FALSE.

fig_height

Numeric or NULL. Manual figure height override in inches. If NULL (default), height is calculated automatically based on plot_type.

fig_height_int_default

Numeric. Default height for interval plots when auto-calculation is not available (default: 6 inches).

Details

This function is designed to be called within a Quarto document code chunk. It generates markdown that creates a tabset where each non-NULL plot in plot_list appears as a separate tab.

Height Calculation:

For "cat_plot_html": Uses fig_height_h_barchart2() which accounts for number of variables, categories, and label lengths
For "int_plot_html": Uses fig_height_int_default (simpler plots need less sophisticated calculation)
For "auto": Detects type by checking for .category column (categorical) vs numeric statistics columns (interval)

Requirements:

Must be run within knitr/Quarto context
Plots should be created with makeme()
Plot list should have meaningful names for tab labels

Value

Invisibly returns NULL. The function's purpose is its side effect of printing Quarto markdown that creates a tabset.
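
For orientation, the kind of markdown a tabset needs can be sketched as follows. tabset_sketch is a hypothetical helper that only prints placeholders; it assumes the chunk is rendered with results: asis in Quarto and does not reproduce the height calculation or download links of crowd_plots_as_tabset().

```r
# Hypothetical helper: emits Quarto tabset markdown for a named list.
# Real plots would be printed inside each tab; here we emit placeholders.
tabset_sketch <- function(plot_list) {
  plot_list <- Filter(Negate(is.null), plot_list)  # skip NULL entries
  cat("::: {.panel-tabset}\n\n")
  for (nm in names(plot_list)) {
    cat("## ", nm, "\n\n", sep = "")
    cat("(plot for ", nm, " would be printed here)\n\n", sep = "")
  }
  cat(":::\n")
  invisible(NULL)
}

# Capture the printed markdown so it can be inspected outside Quarto.
out <- capture.output(
  tabset_sketch(list(target = "p1", others = NULL, all = "p2"))
)
```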

Examples

## Not run: 
# In a Quarto document
plots <- makeme(
  data = ex_survey,
  dep = b_1:b_3,
  crowd = c("target", "others"),
  mesos_var = "f_uni",
  mesos_group = "Uni of A"
)

# Create tabset with the default plot type ("cat_plot_html")
crowd_plots_as_tabset(plots)

# Create tabset for interval plots
int_plots <- makeme(
  data = ex_survey,
  dep = c_1:c_2,
  indep = x1_sex,
  type = "int_plot_html",
  crowd = c("target", "others"),
  mesos_var = "f_uni",
  mesos_group = "Uni of A"
)
crowd_plots_as_tabset(int_plots, plot_type = "int_plot_html")

# Without download links
crowd_plots_as_tabset(plots, save = FALSE)

# With manual height override
crowd_plots_as_tabset(plots, fig_height = 8)

## End(Not run)

Detect Variable Types for Dependent and Independent Variables

Description

Internal helper function that examines the class of variables in the subset data to determine their types (factor, numeric, character, etc.).

Usage

detect_variable_types(subset_data, dep_crwd, indep_crwd)

Arguments

subset_data

Data frame subset containing the relevant variables

dep_crwd

Character vector of dependent variable names for current crowd

indep_crwd

Character vector of independent variable names for current crowd

Value

List with two elements:

dep: Character vector of classes for dependent variables
indep: Character vector of classes for independent variables (empty if none)
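
A minimal sketch of class detection in base R is shown below. detect_types_sketch is a hypothetical name, and taking only the first class of each column (so an ordered factor reports "ordered") is an assumption, not necessarily what the internal helper does.

```r
# Hypothetical stand-in: report the first class of each requested column.
detect_types_sketch <- function(subset_data, dep, indep = character(0)) {
  first_class <- function(cols) {
    vapply(cols, function(nm) class(subset_data[[nm]])[1], character(1))
  }
  list(dep = first_class(dep), indep = first_class(indep))
}

df <- data.frame(
  b_1 = factor(c("A bit", "A lot")),
  c_1 = c(3L, 7L),
  x1_sex = factor(c("M", "F"))
)

types <- detect_types_sketch(df, dep = c("b_1", "c_1"), indep = "x1_sex")
no_indep <- detect_types_sketch(df, dep = "b_1")
```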

Determine variable column basis

Description

Consistent logic for determining whether to use .variable_label or .variable_name

Usage

determine_variable_basis(data_summary)

Arguments

data_summary

Data frame with variable information

Value

String indicating column to use as basis

Embed Interactive Categorical Plot (DEPRECATED!)

Description

This function has been deprecated. Use makeme() instead.

Usage

embed_cat_prop_plot(
  data,
  ...,
  dep = tidyselect::everything(),
  indep = NULL,
  colour_palette = NULL,
  mesos_group = NULL,
  html_interactive = TRUE,
  inverse = FALSE
)

Arguments

data

data.frame, tibble or potentially a srvyr-object.

...

Dynamic dots, arguments forwarded to underlying function(s).

dep

tidyselect-syntax for dependent variable(s).

indep

tidyselect-syntax for an optional independent variable.

colour_palette

Character vector. Avoid using this.

mesos_group

String

html_interactive

Flag, whether to include interactivity.

inverse

Flag, whether to flip plot or table.

Embed Reactable Table (DEPRECATED!)

Description

This function has been deprecated. Use makeme() instead.

Usage

embed_cat_table(
  data,
  ...,
  dep = tidyselect::everything(),
  indep = NULL,
  mesos_group = NULL
)

Arguments

data

data.frame, tibble or potentially a srvyr-object.

...

Dynamic dots, arguments forwarded to underlying function(s).

dep

tidyselect-syntax for dependent variable(s).

indep

tidyselect-syntax for an optional independent variable.

mesos_group

String

Interactive table of text data (DEPRECATED)

Description

This function has been deprecated. Use makeme() instead.

Usage

embed_chr_table_html(data, dep, ..., mesos_group = NULL)

Arguments

data

data.frame, tibble or potentially a srvyr-object.

dep

tidyselect-syntax for dependent variable(s).

...

Dynamic dots, arguments forwarded to underlying function(s).

mesos_group

String

Evaluate Variable Selection

Description

Internal helper function that evaluates tidyselect expressions for dependent and independent variables, returning their column positions in the data frame.

Usage

evaluate_variable_selection(data, dep, indep)

Arguments

data

A data frame containing the variables to be selected

dep

Quosure or tidyselect expression for dependent variables

indep

Quosure or tidyselect expression for independent variables

Value

A list with two named elements:

dep_pos: Named integer vector of column positions for dependent variables
indep_pos: Named integer vector of column positions for independent variables
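
Resolving a tidyselect expression to named column positions can be done with tidyselect::eval_select(). The sketch below uses an invented data frame; whether the internal helper calls eval_select() in exactly this way is an assumption.

```r
library(tidyselect)
library(rlang)

# Hypothetical data with one independent and three dependent variables.
df <- data.frame(x1_sex = c("M", "F"), b_1 = 1:2, b_2 = 3:4, b_3 = 5:6)

# eval_select() returns a named integer vector of column positions.
dep_pos <- eval_select(expr(b_1:b_3), data = df)
indep_pos <- eval_select(expr(x1_sex), data = df)
```

Working with positions rather than names keeps later subsetting unambiguous even if columns are renamed during selection.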

ex_survey: Mockup dataset of a survey.

Description

A dataset containing fake respondents' answers to survey questions. The first two, x1_sex and x2_human, are intended to be independent variables, whereas the remaining are dependent. The underscore _ in variable names separates item groups (prefix) from items (suffix) (i.e. a_1-a_9 => a + 1-9), whereas ' - ' separates the same for labels. The latter corresponds with the default in SurveyXact.

Usage

ex_survey

Format

A data frame with 100 rows and the following 32 variables:

x1_sex: Gender
x2_human: Is respondent human?
x3_nationality: Where is the respondent born?
a_1: Do you consent to the following? - Agreement #1
a_2: Do you consent to the following? - Agreement #2
a_3: Do you consent to the following? - Agreement #3
a_4: Do you consent to the following? - Agreement #4
a_5: Do you consent to the following? - Agreement #5
a_6: Do you consent to the following? - Agreement #6
a_7: Do you consent to the following? - Agreement #7
a_8: Do you consent to the following? - Agreement #8
a_9: Do you consent to the following? - Agreement #9
b_1: How much do you like living in - Beijing
b_2: How much do you like living in - Brussels
b_3: How much do you like living in - Budapest
c_1: How many years of experience do you have in - Company A
c_2: How many years of experience do you have in - Company B
d_1: Rate your degree of confidence doing the following - Driving
d_2: Rate your degree of confidence doing the following - Drinking
d_3: Rate your degree of confidence doing the following - Driving
d_4: Rate your degree of confidence doing the following - Dancing
e_1: How often do you do the following? - Eat
e_2: How often do you do the following? - Eavesdrop
e_3: How often do you do the following? - Exercise
e_4: How often do you do the following? - Encourage someone whom you have only recently met and who struggles with simple tasks that they cannot achieve by themselves
p_1: To what extent do you agree or disagree to the following policies - Red Party
p_2: To what extent do you agree or disagree to the following policies - Green Party
p_3: To what extent do you agree or disagree to the following policies - Yellow Party
p_4: To what extent do you agree or disagree to the following policies - Blue Party
f_uni: Which of the following universities would you prefer to study at?
open_comments: Do you have any comments to the survey?
resp_status: Response status

Estimate figure height for a horizontal bar chart

Description

This function estimates the height of a figure for a horizontal bar chart based on several parameters including the number of dependent and independent variables, number of categories, maximum characters in the labels, and legend properties.

Usage

fig_height_h_barchart(
  n_y,
  n_cats_y,
  max_chars_labels_y = 20,
  max_chars_cats_y = 20,
  n_x = NULL,
  n_cats_x = NULL,
  max_chars_labels_x = NULL,
  max_chars_cats_x = NULL,
  freq = FALSE,
  x_axis_label_width = 20,
  strip_width = 20,
  strip_angle = 0,
  main_font_size = 7,
  legend_location = c("plot", "panel"),
  n_legend_lines = NULL,
  legend_key_chars_equivalence = 5,
  multiplier_per_horizontal_line = 1,
  multiplier_per_vertical_letter = 1,
  multiplier_per_facet = 1,
  multiplier_per_bar = 1,
  multiplier_per_legend_line = 1,
  multiplier_per_plot = 1,
  fixed_constant = 0,
  margin_in_cm = 0,
  figure_width_in_cm = 14,
  max = 12,
  min = 2,
  hide_axis_text_if_single_variable = FALSE,
  add_n_to_dep_label = FALSE,
  add_n_to_indep_label = FALSE,
  showNA = c("ifany", "never", "always")
)

Arguments

n_y, n_x

Integer. Number of dependent/independent variables.

n_cats_y

Integer. Number of categories across the dependent variables.

max_chars_labels_y

Integer. Maximum number of characters across the dependent variables' labels.

max_chars_cats_y

Integer. Maximum number of characters across the dependent variables' response categories (levels).

n_cats_x

Integer or NULL. Number of categories across the independent variables.

max_chars_labels_x

Integer or NULL. Maximum number of characters across the independent variables' labels.

max_chars_cats_x

Integer or NULL. Maximum number of characters across the independent variables' response categories (levels).

freq

Logical. If TRUE, frequency plot with categories next to each other. If FALSE (default), proportion plot with stacked categories.

x_axis_label_width, strip_width

Numeric. Width allocated for x-axis labels and strip labels respectively.

strip_angle

Integer. Angle of the strip text.

main_font_size

Numeric. Font size for the main text.

legend_location

Character. Location of the legend. "plot" (default) or "panel".

n_legend_lines

Integer. Number of lines in the legend.

legend_key_chars_equivalence

Integer. Approximate number of characters the legend key equals.

multiplier_per_horizontal_line

Numeric. Multiplier per horizontal line.

multiplier_per_vertical_letter

Numeric. Multiplier per vertical letter.

multiplier_per_facet

Numeric. Multiplier per facet height.

multiplier_per_bar

Numeric. Multiplier per bar height (thickness).

multiplier_per_legend_line

Numeric. Multiplier per legend line.

multiplier_per_plot

Numeric. Multiplier for entire plot estimates.

fixed_constant

Numeric. Fixed constant to be added to the height.

margin_in_cm

Numeric. Margin in centimeters.

figure_width_in_cm

Numeric. Width of the figure in centimeters.

max

Numeric. Maximum height.

min

Numeric. Minimum height.

hide_axis_text_if_single_variable

Boolean. Whether the label is hidden for single dependent variable plots.

add_n_to_dep_label, add_n_to_indep_label

Boolean. If TRUE, will add 10 characters to the max label lengths. This is primarily useful when obtaining these settings from the global environment, avoiding the need to compute this for each figure chunk.

showNA

String, one of "ifany", "always" or "never". Not yet in use.

Value

Numeric value representing the estimated height of the figure.

Examples

fig_height_h_barchart(
  n_y = 5,
  n_cats_y = 3,
  max_chars_labels_y = 20,
  max_chars_cats_y = 8,
  n_x = 1,
  n_cats_x = 4,
  max_chars_labels_x = 12,
  freq = FALSE,
  x_axis_label_width = 20,
  strip_angle = 0,
  main_font_size = 8,
  legend_location = "panel",
  n_legend_lines = 2,
  legend_key_chars_equivalence = 5,
  multiplier_per_horizontal_line = 1,
  multiplier_per_vertical_letter = .15,
  multiplier_per_facet = .95,
  multiplier_per_legend_line = 1.5,
  figure_width_in_cm = 16
)

Estimate figure height for a horizontal bar chart

Description

Taking an object from makeme(), this function estimates the height of a figure for a horizontal bar chart.

Usage

fig_height_h_barchart2(
  ggobj,
  main_font_size = 7,
  strip_angle = 0,
  freq = FALSE,
  x_axis_label_width = 20,
  strip_width = 20,
  legend_location = c("plot", "panel"),
  n_legend_lines = NULL,
  showNA = c("ifany", "never", "always"),
  legend_key_chars_equivalence = 5,
  multiplier_per_horizontal_line = NULL,
  multiplier_per_vertical_letter = 1,
  multiplier_per_facet = 1,
  multiplier_per_legend_line = 1,
  fixed_constant = 0,
  figure_width_in_cm = 14,
  margin_in_cm = 0,
  max = 8,
  min = 1
)

Arguments

ggobj

ggplot2-object

main_font_size

Numeric. Font size for the main text.

strip_angle

Integer. Angle of the strip text.

freq

Logical. If TRUE, frequency plot with categories next to each other. If FALSE (default), proportion plot with stacked categories.

x_axis_label_width, strip_width

Numeric. Width allocated for x-axis labels and strip labels respectively.

legend_location

Character. Location of the legend. "plot" (default) or "panel".

n_legend_lines

Integer. Number of lines in the legend.

showNA

String, one of "ifany", "always" or "never". Not yet in use.

legend_key_chars_equivalence

Integer. Approximate number of characters the legend key equals.

multiplier_per_horizontal_line

Numeric. Multiplier per horizontal line.

multiplier_per_vertical_letter

Numeric. Multiplier per vertical letter.

multiplier_per_facet

Numeric. Multiplier per facet height.

multiplier_per_legend_line

Numeric. Multiplier per legend line.

fixed_constant

Numeric. Fixed constant to be added to the height.

figure_width_in_cm

Numeric. Width of the figure in centimeters.

margin_in_cm

Numeric. Margin in centimeters.

max

Numeric. Maximum height.

min

Numeric. Minimum height.

Value

Numeric value representing the estimated height of the figure.

Examples

fig_height_h_barchart2(makeme(data = ex_survey, dep = b_1:b_2, indep = x1_sex))

Filter and Prepare Data for a Specific Crowd

Description

Internal helper function that filters data for a specific crowd identifier, applying variable exclusions and category filtering as needed.

Usage

filter_crowd_data(data, args, crwd, omitted_cols_list, kept_indep_cats_list)

Arguments

data

Data frame being analyzed

args

List of makeme function arguments

crwd

Character string identifying the current crowd

omitted_cols_list

Named list of omitted variables for each crowd

kept_indep_cats_list

Named list of kept independent categories for each crowd

Details

Applies the following filtering steps:

Removes omitted variables based on hiding criteria
Filters rows to match crowd membership
Applies independent category filtering if enabled
Returns NULL and warns if no data remains after filtering

Value

List with subset data and variables for the crowd, or NULL if no data remains:

subset_data: Filtered data frame for the crowd
dep_crwd: Character vector of dependent variables for this crowd
indep_crwd: Character vector of independent variables for this crowd

Generate Output for a Specific Crowd

Description

Internal helper function that generates the final output object for a crowd by processing data summary and calling the appropriate make_content function.

Usage

generate_crowd_output(args, subset_data, dep_crwd, indep_crwd)

Arguments

args

List of makeme function arguments

subset_data

Data frame subset for the current crowd

dep_crwd

Character vector of dependent variable names for current crowd

indep_crwd

Character vector of independent variable names for current crowd

Details

Processing steps:

Summarizes data using summarize_data_by_type()
Sets main question from variable labels
Post-processes data summary for most types
Calls make_content() to generate final output

Value

Output object (type depends on makeme type):

Could be plot, table, or other analysis object
Generated by make_content() with crowd-specific arguments

Generate Appropriate Data Summary Based on Variable Types

Description

Internal helper function that routes to the appropriate data summarization function based on the detected variable types (categorical vs continuous).

Usage

generate_data_summary(
  variable_types,
  subset_data,
  dep_crwd,
  indep_crwd,
  args,
  ...
)

Arguments

variable_types

List with dep and indep variable type information

subset_data

Data frame subset for the current crowd

dep_crwd

Character vector of dependent variable names for current crowd

indep_crwd

Character vector of independent variable names for current crowd

args

List of makeme function arguments

...

Additional arguments passed to summarization functions

Value

Data summary object (type depends on variable types):

For integer/numeric dep + factor/character indep: calls summarize_int_cat_data()
For factor/character dep: calls summarize_cat_cat_data()
For mixed types: throws error

Provide A Colour Set for A Number of Requested Colours

Description

Uses colour_palette_nominal if it is supplied. If it does not contain enough colours, falls back to a set palette from RColorBrewer.

Usage

get_colour_palette(
  data,
  col_pos,
  colour_palette_nominal = NULL,
  colour_palette_ordinal = NULL,
  colour_na = NULL,
  categories_treated_as_na = NULL,
  call = rlang::caller_env()
)

Arguments

data

Your data.frame/tibble or srvyr-object (experimental)

data.frame // required

The data to be used for plotting.

col_pos

Character vector of column names for which colours will be found.

colour_palette_nominal, colour_palette_ordinal

User specified colour set

⁠vector<character>⁠ // default: NULL (optional)

User-supplied default palette, excluding colour_na.

colour_na

Colour for NA category

⁠scalar<character>⁠ // default: NULL (optional)

Colour as a single string for NA values, if showNA is "ifany" or "always".

categories_treated_as_na

NA categories

⁠vector<character>⁠ // default: NULL (optional)

Categories that should be treated as NA.

call

Internal call

obj:<call> // default: rlang::caller_env() (optional)

The calling environment, used when reporting errors.

Value

A colour set as character vector, where NA has the colour_na, and the rest are taken from colour_palette_nominal if available.

Provide A Colour Set for A Number of Requested Colours

Description

Uses colour_palette_nominal if it is supplied. If it does not contain enough colours, falls back to a set palette from RColorBrewer.

Usage

get_colour_set(
  x,
  common_data_type = "factor",
  colour_palette_nominal = NULL,
  colour_palette_ordinal = NULL,
  colour_na = NULL,
  colour_2nd_binary_cat = NULL,
  ordinal = FALSE,
  categories_treated_as_na = NULL,
  call = rlang::caller_env()
)

Arguments

x

Vector for which colours will be found.

common_data_type

factor or ordered data type

⁠scalar<character>⁠ // default: factor (optional)

Currently only supports factor and ordered.

colour_palette_nominal, colour_palette_ordinal

User specified colour set

⁠vector<character>⁠ // default: NULL (optional)

User-supplied default palette, excluding colour_na.

colour_na

Colour for NA category

⁠scalar<character>⁠ // default: NULL (optional)

Colour as a single string for NA values, if showNA is "ifany" or "always".

colour_2nd_binary_cat

Colour for second binary category

scalar<character> // default: NULL (optional)

Colour for the second category in binary variables. Often useful to hide this.

ordinal

⁠scalar<logical>⁠ // default: FALSE (optional)

Is palette ordinal?

categories_treated_as_na

NA categories

⁠vector<character>⁠ // default: NULL (optional)

Categories that should be treated as NA.

call

Internal call

obj:<call> // default: rlang::caller_env() (optional)

The calling environment, used when reporting errors.

Value

A colour set as character vector, where NA has the colour_na, and the rest are taken from colour_palette_nominal if available.

Determine display column based on data availability

Description

Checks if .variable_label column exists and has non-NA values to determine whether to use .variable_label or .variable_name for display.

Usage

get_data_display_column(data)

Arguments

data

Data frame containing variable information

Value

Character string indicating which column to use
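
The check described above can be sketched in a few lines of base R. display_column_sketch is a hypothetical name, and treating "has non-NA values" as "at least one non-NA value" is an assumption.

```r
# Hypothetical stand-in: prefer labels when at least one is present.
display_column_sketch <- function(data) {
  has_labels <- ".variable_label" %in% names(data) &&
    any(!is.na(data[[".variable_label"]]))
  if (has_labels) ".variable_label" else ".variable_name"
}

with_labels <- data.frame(
  .variable_name = c("b_1", "b_2"),
  .variable_label = c("Beijing", NA),
  stringsAsFactors = FALSE
)
without_labels <- data.frame(
  .variable_name = c("b_1", "b_2"),
  .variable_label = NA_character_,
  stringsAsFactors = FALSE
)
names_only <- data.frame(.variable_name = "b_1", stringsAsFactors = FALSE)

res <- c(
  display_column_sketch(with_labels),
  display_column_sketch(without_labels),
  display_column_sketch(names_only)
)
```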

Get Valid Data Labels for Figures and Tables

Description

Get Valid Data Labels for Figures and Tables

Usage

get_data_label_opts()

Value

Character vector

Determine display column for dependent variables in int_plot_html

Description

Checks if the number of dep variables matches the number of labels to determine whether to use .variable_label or .variable_name for display.

Usage

get_dep_display_column(dep_count, dep_labels)

Arguments

dep_count

Number of dependent variables

dep_labels

Vector of dependent variable labels

Value

Character string indicating which column to use

Generate Figure Title Suffix with N Range and Optional Download Links

Description

Creates a formatted suffix for figure titles that includes the sample size (N) range from a ggplot object. Optionally generates markdown download links for both the plot data and the plot image.

Usage

get_fig_title_suffix_from_ggplot(
  plot,
  save = FALSE,
  n_equals_string = "N = ",
  file_suffixes = c(".csv", ".png"),
  link_prefixes = c("[CSV](", "[PNG]("),
  save_fns = list(utils::write.csv, saros::ggsaver),
  sep = ", "
)

Arguments

plot

A ggplot2 object, typically created by makeme().

save

Logical flag. If TRUE, generates download links for the plot data (CSV) and plot image (PNG). If FALSE (default), only returns the N range text.

n_equals_string

String. Prefix text for the sample size display (default: "N = ").

file_suffixes

Character vector. File extensions for the saved plot data and plot image (default: c(".csv", ".png")). Each should include the dot.

link_prefixes

Character vector. Markdown link text prefixes for the data and image download links (default: c("[CSV](", "[PNG](")).

save_fns

List of functions. Functions to save the plot data and images.

sep

String. Separator between N range text and download links (default: ", ").

Details

This function is particularly useful for adding informative captions to plots in reports. The N range is calculated using n_range2(), which extracts the sample size from the plot data. When save = TRUE, the function creates downloadable files using make_link():

Plot data as CSV (via utils::write.csv)
Plot image as PNG (via ggsaver())

The function returns an AsIs object to prevent automatic character escaping in markdown/HTML contexts.

Value

An AsIs object (using I()) containing a character string with:

Sample size range formatted as "{n_equals_string}{range}"
If save = TRUE: additional download links for plot data and image, separated by sep
Empty string if plot is not a valid ggplot object or has no data
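
How such a suffix could be assembled is sketched below in base R. build_suffix_sketch() is a hypothetical helper, and both the "120-150" range format and the ready-made link strings are assumptions made for illustration; the packaged function derives them from the plot itself.

```r
# Hypothetical helper; illustrates the shape of the return value only.
build_suffix_sketch <- function(n_range, links = character(),
                                n_equals_string = "N = ", sep = ", ") {
  # No usable N range: return an empty AsIs string.
  if (length(n_range) == 0 || all(is.na(n_range))) return(I(""))
  n_text <- paste0(n_equals_string, n_range)
  # I() marks the string as AsIs so that it is not escaped later on.
  I(paste(c(n_text, links), collapse = sep))
}

suffix <- build_suffix_sketch(
  "120-150",
  links = c("[CSV](data.csv)", "[PNG](plot.png)")
)
empty_suffix <- build_suffix_sketch(character())
```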

Examples

# Create a sample plot
plot <- makeme(data = ex_survey, dep = b_1:b_3)

# Get just the N range text
get_fig_title_suffix_from_ggplot(plot)

# Custom N prefix
get_fig_title_suffix_from_ggplot(plot, n_equals_string = "Sample size: ")

## Not run: 
# Generate with download links (saves files to disk)
get_fig_title_suffix_from_ggplot(plot, save = TRUE)

# Custom separator and link prefixes (one per saved file: data, then image)
get_fig_title_suffix_from_ggplot(
  plot,
  save = TRUE,
  sep = " | ",
  link_prefixes = c("[Download CSV](", "[Download PNG](")
)

## End(Not run)

Get the name of the independent variable column

Description

Get the name of the independent variable column

Usage

get_indep_col_name(data)

Arguments

data

Dataset

Value

Character string with column name, or NULL if not found

Get independent variable labels

Description

Process independent variable labels with consistent logic across table functions

Usage

get_indep_labels(dots)

Arguments

dots

List from rlang::list2(...)

Value

Character vector of processed labels

Get all registered options for the type-argument in the `makeme`-function

Description

The makeme() function takes, for the argument type, one of several strings indicating content type and output type. This function collects all registered alternatives. Extensions are possible, see further below.

Built-in types:

Whereas the names of the types can be arbitrary, the built-in types follow a pattern. The prefix indicates which dependent data type the type is intended for:

"cat": Categorical (ordinal and nominal) data.
"chr": Open ended responses and other character data.
"int": Integer and numeric data.

The suffix indicates the output:

"html": Interactive HTML, usually what you want for Quarto, as Quarto can usually convert to other formats when needed.
"docx": Quarto's and Pandoc's docx support is currently still limited; for instance, vector graphics are converted to raster graphics for docx output. Hence, saros offers some types that output MS Chart vector graphics. Note that this is experimental and not actively developed.
"pdf": Basically just a shortcut for "html" with interactive=FALSE.

Usage

get_makeme_types()

Value

Character vector

Further details about some of the built-in types:

"cat_plot_": A Likert style plot for groups of categorical variables sharing the same categories.
"cat_table_": A Likert style table.
"chr_table_": A single-column table listing unique open ended responses.
"sigtest_table_": See below

sigtest_table_*: Make Table with All Combinations of Univariate/Bivariate Significance Tests Based on Variable Types

Although there are hundreds of significance tests for associations between two variables, depending upon the distributions, variable types and assumptions, most fall into a smaller set of popular tests. This function runs a suitable test (though not the only possible one) for every combination of dependent and independent variables in data. It also supports univariate tests, where the null hypothesis is a mean of zero for continuous variables, or equal proportions across categories for binary/categorical variables.

This function does not allow any adjustments. Use the underlying functions directly for that (chisq.test, t.test, etc.).
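
For readers who need adjustments, the two univariate cases mentioned above can be run directly with the stats functions. The simulated vectors below are made up for illustration, and which test saros selects for a given variable type is not shown here.

```r
set.seed(1)
x_num <- rnorm(50, mean = 0.5)                                # continuous variable
x_cat <- factor(sample(c("a", "b", "c"), 90, replace = TRUE)) # categorical variable

# Continuous: one-sample t-test against a mean of zero.
t_res <- stats::t.test(x_num, mu = 0)

# Categorical: chi-squared goodness-of-fit test against equal proportions
# (the default when chisq.test() is given a one-dimensional table).
chi_res <- stats::chisq.test(table(x_cat))

t_res$p.value
chi_res$p.value
```

Calling the functions directly gives access to their own arguments, such as alternative or conf.level in t.test().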

Expanding with custom types

makeme() calls the generic make_content(), which uses the S3-method system to dispatch to the relevant method (i.e., paste0("make_content.", type)). makeme forwards all its arguments to make_content, with the following exceptions:

dep and indep are converted from dplyr::dplyr_tidy_select()-syntax to simple character vectors, for simplifying building your own functions.
data_summary is attached, which contains many useful pieces of info for many (categorical) displays.
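
The dispatch mechanism can be illustrated without touching the package by using a stand-in generic. Everything named below (make_content_sketch, the type "n_unique_table_html") is hypothetical; a real extension would instead define a method make_content.<your_type> for the saros generic.

```r
# Hypothetical generic standing in for make_content(); dispatches on class(type).
make_content_sketch <- function(type, ...) UseMethod("make_content_sketch", type)

# Hypothetical custom type: counts distinct values per dependent variable.
# dep arrives as a plain character vector, as described above.
make_content_sketch.n_unique_table_html <- function(type, ..., data, dep) {
  data.frame(
    variable = dep,
    n_unique = vapply(dep, function(v) length(unique(data[[v]])),
                      integer(1), USE.NAMES = FALSE)
  )
}

# The type string carries a class named after itself, which drives dispatch.
type <- structure("n_unique_table_html", class = "n_unique_table_html")
result <- make_content_sketch(type, data = mtcars, dep = c("cyl", "gear"))
result
```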

Examples

get_makeme_types()

Helper function to extract raw variable labels from the data

Description

Helper function to extract raw variable labels from the data

Usage

get_raw_labels(data, col_pos = NULL, return_as_list = FALSE)

Arguments

data

Dataset

col_pos

Optional, character vector of column names or integer vector of positions

return_as_list

Flag, whether to return as list or character vector

Value

List or character vector

Get standard column renaming function

Description

Standardized column renaming logic for table functions

Usage

get_standard_column_renamer(
  main_question = "",
  use_header = FALSE,
  column_mappings = NULL
)

Arguments

main_question

Main question for header

use_header

Whether to use main question as header

column_mappings

Named list of additional column mappings

Value

Function for renaming columns

Get target categories for positional sorting

Description

Uses subset_vector to determine which categories to include based on positional methods like .top, .bottom, .upper, .lower, etc.

Usage

get_target_categories(data, method)

Arguments

data

Dataset with .category column

method

Positional method (.top, .bottom, .upper, .lower, etc.)

Value

Character vector of target category names

Wrapper Function for `ggplot2::ggsave()`

Description

This only exists to make it easy to use it in make_link()

Usage

ggsaver(plot, filename, ...)

Arguments

plot

Plot

filename

Path of the file to save the plot to.

...

Arguments forwarded to ggplot2::ggsave()

Value

No return value, called for side effects
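
A wrapper of this kind only needs to swap the first two arguments, because ggplot2::ggsave() takes the file name first and the plot second. The function below is a hypothetical re-implementation for illustration; the packaged ggsaver() may differ in detail.

```r
# Hypothetical sketch; requires ggplot2 only when it is actually called.
ggsaver_sketch <- function(plot, filename, ...) {
  # ggsave() expects (filename, plot, ...), so the order is swapped here.
  ggplot2::ggsave(filename = filename, plot = plot, ...)
}
```

With the object first and the path second, the wrapper matches the argument order that make_link() expects of a save_fn.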

Examples

library(ggplot2)
my_plot <- ggplot(data=mtcars, aes(x=hp, y=mpg)) + geom_point()
make_link(my_plot, folder=tempdir(), file_suffix = ".png",
          save_fn = ggsaver, width = 16, height = 16, units = "cm")

Pull global plotting settings before displaying plot

Description

This function extends ggiraph::girafe by allowing colour palettes to be globally specified.

Usage

girafe(
  ggobj,
  ...,
  char_limit = 200,
  label_wrap_width = 80,
  interactive = TRUE,
  palette_codes = NULL,
  priority_palette_codes = NULL,
  ncol = NULL,
  byrow = TRUE,
  colour_2nd_binary_cat = NULL,
  checked = NULL,
  not_checked = NULL,
  width_svg = NULL,
  height_svg = NULL,
  pointsize = 12
)

Arguments

ggobj

ggplot2-object.

...

Dots forwarded to ggiraph::girafe()

char_limit

Integer. Number of characters to fit on a line of plot (legend-space). Will be replaced in the future with a function that guesses this.

label_wrap_width

Integer. Number of characters that fit on the axis text space before wrapping.

interactive

Boolean. Whether to produce a ggiraph-plot with interactivity (defaults to TRUE) or a static ggplot2-plot.

palette_codes

Optional list of named character vectors with names being categories and values being colours. The final character vector of the list is taken as a final resort. Defaults to NULL.

priority_palette_codes

Optional named character vector of categories (as names) with corresponding colours (as values) which are used first, whereupon the remaining unspecified categories are pulled from the last vector of palette_codes. Defaults to NULL.
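
The precedence between the two arguments can be sketched as follows. resolve_palette_sketch() is a hypothetical helper that only covers the fallback to the last vector of palette_codes; how earlier vectors in the list are matched is not modelled.

```r
# Hypothetical helper; the real lookup inside girafe() may differ.
resolve_palette_sketch <- function(categories,
                                   palette_codes = NULL,
                                   priority_palette_codes = NULL) {
  out <- stats::setNames(rep(NA_character_, length(categories)), categories)

  # 1. Priority colours win for the categories they name.
  hit <- intersect(categories, names(priority_palette_codes))
  if (length(hit) > 0) out[hit] <- priority_palette_codes[hit]

  # 2. Remaining categories fall back to the last vector in palette_codes.
  if (length(palette_codes) > 0) {
    fallback <- palette_codes[[length(palette_codes)]]
    rest <- intersect(names(out)[is.na(out)], names(fallback))
    if (length(rest) > 0) out[rest] <- fallback[rest]
  }
  out
}

resolved <- resolve_palette_sketch(
  categories = c("Yes", "No", "Don't know"),
  palette_codes = list(c(A = "red"), c(Yes = "#1b9e77", No = "#d95f02")),
  priority_palette_codes = c("Don't know" = "grey")
)
resolved
```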

ncol

Optional integer or NULL.

byrow

Whether to display legend keys by row or by column.

colour_2nd_binary_cat

Optional string. Color for the second category in binary checkbox plots. When set together with checked and not_checked, reverses the category order so that not_checked appears second and receives this color. Ignored if checkbox criteria are not met.

checked, not_checked

Optional string. If specified and the fill categories of the plot match these, a special plot is returned where not_checked is hidden. This is useful in plots intended for checkbox responses, where leaving a box unchecked is not always a conscious choice.

pointsize, height_svg, width_svg

See ggiraph::girafe().

Value

If interactive, only side-effect of generating ggiraph-plot. If interactive=FALSE, returns modified ggobj.

Examples

plot <- makeme(data = ex_survey, dep = b_1)
girafe(plot)

Get Global Options for saros-functions

Description

Get Global Options for saros-functions

Usage

global_settings_get(fn_name = "makeme")

Arguments

fn_name

String, one of "make_link", "fig_height_h_barchart" and "makeme".

Value

List with options in R

Examples

global_settings_get()

Reset Global Options for saros-functions

Description

Reset Global Options for saros-functions

Usage

global_settings_reset(fn_name = "makeme")

Arguments

fn_name

String, one of "make_link", "fig_height_h_barchart" and "makeme".

Value

Invisibly returned list of old and new values.

Examples

global_settings_reset()

Set Global Options for saros-functions

Description

Set Global Options for saros-functions

Usage

global_settings_set(
  new,
  fn_name = "makeme",
  quiet = FALSE,
  null_deletes = FALSE
)

Arguments

new

List of arguments (see ?make_link(), ?makeme(), fig_height_h_barchart())

fn_name

String, one of "make_link", "fig_height_h_barchart" and "makeme".

quiet

Flag. If FALSE (default), informs about what has been set.

null_deletes

Flag. If FALSE (default), NULL elements in new become NULL elements in the option. Otherwise, the corresponding element, if present, is deleted from the option.
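
The two behaviours correspond to the keep.null switch of utils::modifyList(), which is used below purely as an analogy; whether saros relies on modifyList() internally is an assumption, and the option names are only examples.

```r
old <- list(digits = 0, colour_na = "grey")
new <- list(digits = 2, colour_na = NULL)

# Like null_deletes = FALSE: the NULL is stored as the element's value.
kept <- utils::modifyList(old, new, keep.null = TRUE)

# Like null_deletes = TRUE: the element is removed altogether.
dropped <- utils::modifyList(old, new, keep.null = FALSE)

names(kept)
names(dropped)
```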

Value

Invisibly returned list of old and new values.

Examples

global_settings_set(new=list(digits=2))

Handle Kept and Omitted Columns for Crowds

Description

Internal helper function that processes the kept and omitted column information for crowd-based filtering and applies global hiding logic.

Usage

handle_crowd_columns(
  args,
  kept_cols_list,
  omitted_cols_list,
  kept_indep_cats_list
)

Arguments

args

List of makeme function arguments

kept_cols_list

Named list of kept column information for each crowd

omitted_cols_list

Named list of omitted variables for each crowd

kept_indep_cats_list

Named list of kept independent categories for each crowd

Value

List containing processed crowd column information with global hiding logic applied based on hide_for_all_crowds_if_hidden_for_crowd settings

Identify Suitable Font Given Background Hex Colour

Description

Code is taken from XXX.

Usage

hex_bw(hex_code, na_colour = "#ffffff")

Arguments

hex_code

Colour in hex-format.

Value

Colours in hex-format, either black or white.
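
One common way to choose between black and white text is to threshold the perceived luminance of the background. The sketch below makes several assumptions: the luminance weights and the 0.5 cut-off are a conventional choice rather than the package's documented formula, and NA inputs are assumed to be treated as na_colour.

```r
# Hypothetical sketch; hex_bw() itself may use a different formula.
hex_bw_sketch <- function(hex_code, na_colour = "#ffffff") {
  # Assumption: NA backgrounds are treated as na_colour.
  hex_code[is.na(hex_code)] <- na_colour
  rgb <- grDevices::col2rgb(hex_code)
  # Weighted sum of the RGB channels, scaled to the range 0 to 1.
  luminance <- (0.299 * rgb["red", ] +
                0.587 * rgb["green", ] +
                0.114 * rgb["blue", ]) / 255
  # Light backgrounds get black text, dark backgrounds get white text.
  unname(ifelse(luminance > 0.5, "#000000", "#ffffff"))
}

font_colours <- hex_bw_sketch(c("#ffffff", "#000000", "#1b9e77"))
font_colours
```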

Validate and Initialize Arguments

Description

Internal helper function that finalizes the arguments list by adding resolved variable names and normalizing multi-value arguments.

Usage

initialize_arguments(data, dep_pos, indep_pos, args)

Arguments

data

Data frame being analyzed

dep_pos

Named integer vector of dependent variable positions

indep_pos

Named integer vector of independent variable positions

args

List of makeme function arguments

Value

Modified args list with additional elements:

data: The input data frame
dep: Character vector of dependent variable names (from dep_pos)
indep: Character vector of independent variable names (from indep_pos)
Normalized single-value arguments (showNA, data_label, type)

Initialize Crowd-Based Filtering Data Structures

Description

Internal helper function that sets up the data structures needed for crowd-based filtering and processing of variables and categories.

Usage

initialize_crowd_filtering(crowd, args)

Arguments

crowd

Character vector of crowd identifiers

args

List of makeme function arguments

Details

For each crowd, this function calls keep_cols() and keep_indep_cats() to determine which variables and categories should be retained based on the various hiding criteria (NA values, sample sizes, etc.).

Value

List with three named elements:

kept_cols_list: Named list of kept column information for each crowd
omitted_cols_list: Named list of omitted variables for each crowd
kept_indep_cats_list: Named list of kept independent categories for each crowd

Are All Colours in Vector Valid Colours

Description

As title says. From: (https://stackoverflow.com/a/13290832/3315962)

Usage

is_colour(x)

Arguments

x

Character vector of colours in hex-format.

Value

Logical, or error.
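
A check in the spirit of the linked answer lets grDevices::col2rgb() decide and catches its error. all_colours_sketch() is a hypothetical name, and unlike the documented "Logical, or error" return value, this sketch always returns a logical.

```r
# Hypothetical sketch; a string is a valid colour if col2rgb() can convert it.
all_colours_sketch <- function(x) {
  ok <- vapply(x, function(col) {
    tryCatch(is.matrix(grDevices::col2rgb(col)), error = function(e) FALSE)
  }, logical(1), USE.NAMES = FALSE)
  all(ok)
}

valid <- all_colours_sketch(c("#ff0000", "#1b9e77"))
invalid <- all_colours_sketch(c("#ff0000", "notacolour"))
```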

Is x A String?

Description

Returns TRUE if the object is a character vector of length 1.

Usage

is_string(x)

Arguments

x

Object

Value

Logical value.

Method for Creating Saros Contents

Description

Takes the same arguments as makeme, except that dep and indep in make_content are character vectors, for ease of user-customized function programming.

Usage

make_content(type, ...)

Arguments

type

Method name

⁠scalar<character>⁠ with a class named by itself.

Optional string indicating the specific method. Occasionally useful for error messages, etc.

...

Dots

Arguments provided by makeme

Value

The returned object class depends on the type. type="*_table_html" always returns a tibble. type="*_plot_html" always returns a ggplot. type="*_docx" always returns a rdocx object if path=NULL, or has side-effect of writing docx file to disk if path is set.

Save data to a file and return a Markdown link

Description

The file is automatically named by a hash of the object, removing the need to come up with unique file names inside a Quarto report. This has the added benefit of reducing storage needs if the objects needing linking to are identical, and all are stored in the same folder. It also allows the user to download multiple files without worrying about accidentally overwriting them.
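
The hash-based naming idea can be sketched in base R. make_link_sketch() is a hypothetical helper, and hashing the serialized object with MD5 is an assumption chosen because it needs no extra packages; the hashing method used by make_link() is not stated here.

```r
# Hypothetical sketch of hash-named saving; not the package's implementation.
make_link_sketch <- function(data, folder = tempdir(), file_suffix = ".csv",
                             save_fn = utils::write.csv,
                             link_prefix = "[download figure data](",
                             link_suffix = ")") {
  # Hash the serialized object so identical objects get identical file names.
  tmp <- tempfile()
  on.exit(unlink(tmp))
  writeBin(serialize(data, NULL), tmp)
  hash <- unname(tools::md5sum(tmp))

  path <- file.path(folder, paste0(hash, file_suffix))
  # Identical content maps to the same path, so it is written only once.
  if (!file.exists(path)) save_fn(data, path)
  paste0(link_prefix, path, link_suffix)
}

link_1 <- make_link_sketch(mtcars)
link_2 <- make_link_sketch(mtcars)
link_3 <- make_link_sketch(head(mtcars))
```

Here link_1 and link_2 point to the same file, while link_3 points to a different one because its content differs.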

Usage

make_link(
  data,
  folder = NULL,
  file_prefix = NULL,
  file_suffix = ".csv",
  save_fn = utils::write.csv,
  link_prefix = "[download figure data](",
  link_suffix = ")",
  ...
)

Arguments

data

Data or object

⁠<data.frame|tbl|obj>⁠

Data frame if using a tabular data save_fn, or possibly any R object, if a serializing save_fn is provided (e.g. saveRDS()).

folder

Where to store file

⁠scalar<character>⁠ // default: "." (optional)

Defaults to same folder.

file_prefix, file_suffix

File prefix/suffix

⁠scalar<character>⁠ // default: "" and ".csv" (optional)

file_suffix should include the dot before the extension.

save_fn

Saving function

function // default: utils::write.csv

Can be any saving/writing function. However, first argument must be the object to be saved, and the second must be the path. Hence, ggplot2::ggsave() must be wrapped in another function with filename and object swapped. See ggsaver() for an example of such a wrapper function.

link_prefix, link_suffix

Link prefix/suffix

scalar<character> // default: "[download figure data](" and ")"

Text placed before and after the file path to form the returned Markdown link.

...

Dynamic dots

<dynamic-dots>

Arguments forwarded to the corresponding functions that create the elements.

Value

String.

Examples

make_link(mtcars, folder = tempdir())

Save data to a file and return a Markdown link

Description

Usage

## Default S3 method:
make_link(
  data,
  ...,
  folder = NULL,
  file_prefix = NULL,
  file_suffix = ".csv",
  save_fn = utils::write.csv,
  link_prefix = "[download figure data](",
  link_suffix = ")"
)

Arguments

data

Data or object

⁠<data.frame|tbl|obj>⁠

Data frame if using a tabular data save_fn, or possibly any R object, if a serializing save_fn is provided (e.g. saveRDS()).

...

Dynamic dots

<dynamic-dots>

Arguments forwarded to the corresponding functions that create the elements.

folder

Where to store file

⁠scalar<character>⁠ // default: "." (optional)

Defaults to same folder.

file_prefix, file_suffix

File prefix/suffix

⁠scalar<character>⁠ // default: "" and ".csv" (optional)

file_suffix should include the dot before the extension.

save_fn

Saving function

function // default: utils::write.csv

link_prefix, link_suffix

Link prefix/suffix

scalar<character> // default: "[download figure data](" and ")"

Text placed before and after the file path to form the returned Markdown link.

Value

String.

Examples

make_link(mtcars, folder = tempdir())

Save data to a file and return a Markdown link

Description

Usage

## S3 method for class 'list'
make_link(
  data,
  ...,
  folder = NULL,
  file_prefix = NULL,
  file_suffix = ".csv",
  save_fn = utils::write.csv,
  link_prefix = "[download figure data](",
  link_suffix = ")",
  separator_list_items = ". "
)

Arguments

data

Data or object

⁠<data.frame|tbl|obj>⁠

Data frame if using a tabular data save_fn, or possibly any R object, if a serializing save_fn is provided (e.g. saveRDS()).

...

Dynamic dots

<dynamic-dots>

Arguments forwarded to the corresponding functions that create the elements.

folder

Where to store file

⁠scalar<character>⁠ // default: "." (optional)

Defaults to same folder.

file_prefix, file_suffix

File prefix/suffix

⁠scalar<character>⁠ // default: "" and ".csv" (optional)

file_suffix should include the dot before the extension.

save_fn

Saving function

function // default: utils::write.csv

link_prefix, link_suffix

Link prefix/suffix

scalar<character> // default: "[download figure data](" and ")"

Text placed before and after the file path to form the returned Markdown link.

separator_list_items

Separator string between multiple list items

⁠scalar<character>⁠ // default: ". " (optional)

Embed Interactive Plot of Various Kinds Using Tidyselect Syntax

Description

This function allows embedding of interactive or static plots based on various types of data using tidyselect syntax for variable selection.

Usage

makeme(
  data,
  dep = tidyselect::everything(),
  indep = NULL,
  type = c("cat_plot_html", "int_plot_html", "cat_table_html", "int_table_html",
    "sigtest_table_html", "cat_prop_plot_docx", "cat_freq_plot_docx", "int_plot_docx"),
  ...,
  require_common_categories = TRUE,
  crowd = c("all"),
  mesos_var = NULL,
  mesos_group = NULL,
  simplify_output = TRUE,
  hide_for_crowd_if_all_na = TRUE,
  hide_for_crowd_if_valid_n_below = 0,
  hide_for_crowd_if_category_k_below = 2,
  hide_for_crowd_if_category_n_below = 0,
  hide_for_crowd_if_cell_n_below = 0,
  hide_for_all_crowds_if_hidden_for_crowd = NULL,
  hide_indep_cat_for_all_crowds_if_hidden_for_crowd = FALSE,
  add_n_to_dep_label = FALSE,
  add_n_to_indep_label = FALSE,
  add_n_to_label = FALSE,
  add_n_to_category = FALSE,
  totals = FALSE,
  categories_treated_as_na = NULL,
  label_separator = " - ",
  error_on_duplicates = TRUE,
  showNA = c("ifany", "always", "never"),
  data_label = c("percentage_bare", "percentage", "proportion", "count", "mean",
    "median"),
  data_label_position = c("center", "bottom", "top", "above"),
  html_interactive = TRUE,
  hide_axis_text_if_single_variable = TRUE,
  hide_label_if_prop_below = 0.01,
  inverse = FALSE,
  vertical = FALSE,
  digits = 0,
  data_label_decimal_symbol = ".",
  x_axis_label_width = 25,
  strip_width = 25,
  sort_dep_by = ".variable_position",
  sort_indep_by = ".factor_order",
  sort_by = NULL,
  descend = TRUE,
  descend_indep = FALSE,
  labels_always_at_top = NULL,
  labels_always_at_bottom = NULL,
  table_wide = TRUE,
  table_main_question_as_header = FALSE,
  n_categories_limit = 12,
  translations = list(last_sep = " and ", table_heading_N = "Total (N)",
    table_heading_data_label = "%", add_n_to_dep_label_prefix = " (N = ",
    add_n_to_dep_label_suffix = ")", add_n_to_indep_label_prefix = " (N = ",
    add_n_to_indep_label_suffix = ")", add_n_to_label_prefix = " (N = ",
    add_n_to_label_suffix = ")", add_n_to_category_prefix = " (N = [",
    add_n_to_category_infix = ",", add_n_to_category_suffix = "])", by_total =
    "Everyone", sigtest_variable_header_1 = "Var 1", sigtest_variable_header_2 = "Var 2",
    crowd_all = "All", 
     crowd_target = "Target", crowd_others = "Others"),
  plot_height = 15,
  colour_palette = NULL,
  colour_2nd_binary_cat = "#ffffff",
  colour_na = "grey",
  label_font_size = 6,
  main_font_size = 6,
  strip_font_size = 6,
  legend_font_size = 6,
  font_family = "sans",
  path = NULL,
  docx_template = NULL
)

Arguments

data

Your data.frame/tibble or srvyr-object (experimental)

data.frame // required

The data to be used for plotting.

dep, indep

Variable selections

<tidyselect> // Default: NULL, meaning everything for dep, nothing for indep.

Columns in data. dep is compulsory.

type

Kind of output

⁠scalar<character>⁠ // default: "cat_plot_html" (optional)

For a list of registered types in your session, use get_makeme_types().

...

Dynamic dots

<dynamic-dots>

Arguments forwarded to the corresponding functions that create the elements.

require_common_categories

Check common categories

⁠scalar<logical>⁠ // default: TRUE (optional)

Whether to check if all items share common categories.

crowd

Which group(s) to display results for

vector<character> // default: c("all") (optional)

Choose whether to produce results for target (mesos) group, others, all, or combinations of these.

mesos_var

Variable in data indicating groups to tailor reports for

⁠scalar<character>⁠ // default: NULL (optional)

Column name in data indicating the groups for which mesos reports will be produced.

mesos_group

⁠scalar<character>⁠ // default: NULL (optional)

String, target group.

simplify_output

⁠scalar<logical>⁠ // default: TRUE

If TRUE, a list output with a single output element will return the element itself, whereas list with multiple elements will return the list.

hide_for_crowd_if_all_na

Hide variable from output if containing all NA

⁠scalar<boolean>⁠ // default: TRUE

Whether to hide a variable for a crowd (particularly useful for mesos reports) if all of its values are NA.

hide_for_crowd_if_valid_n_below

Hide variable if variable has < n observations

⁠scalar<integer>⁠ // default: 0

Whether to hide a variable for a crowd if variable contains fewer than n observations (always ignoring NA).

hide_for_crowd_if_category_k_below

Hide variable if < k categories

⁠scalar<integer>⁠ // default: 2

Whether to hide a variable for a crowd if variable contains fewer than k used categories (always ignoring NA). Defaults to 2 because a unitary plot/table is rarely informative.

hide_for_crowd_if_category_n_below

Hide variable if having a category with < n observations

⁠scalar<integer>⁠ // default: 0

Whether to hide a variable for a crowd if the variable contains a category with fewer than n observations (ignoring NA). Cells with a count of 0 are not considered, as these are usually not a problem for anonymity.
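
The three single-variable criteria above can be read as one combined rule, sketched below in base R. should_hide_sketch() is a hypothetical helper with shortened argument names, and the strict "fewer than" comparisons are assumptions drawn from the descriptions.

```r
# Hypothetical sketch of the hiding rule for a single variable within one crowd.
should_hide_sketch <- function(x, valid_n_below = 0, category_k_below = 2,
                               category_n_below = 0) {
  counts <- table(x, useNA = "no")   # NA is always ignored
  used <- counts[counts > 0]         # categories with a count of 0 are not considered
  sum(counts) < valid_n_below ||     # too few valid observations
    length(used) < category_k_below || # too few categories in use
    any(used < category_n_below)     # some used category is too small
}

x <- factor(c("a", "a", "b", NA), levels = c("a", "b", "c"))
should_hide_sketch(x)                        # defaults: shown
should_hide_sketch(x, category_n_below = 2)  # "b" has one observation: hidden
should_hide_sketch(factor(c("a", "a")))      # only one category in use: hidden
```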

hide_for_crowd_if_cell_n_below

Hide variable if having a cell with < n

⁠scalar<integer>⁠ // default: 0

Whether to hide a variable for a crowd if the combination of dep and indep results in a cell with fewer than n observations (ignoring NA). Cells with a count of 0 are not considered, as these are usually not a problem for anonymity.

hide_for_all_crowds_if_hidden_for_crowd

Conditional hiding

⁠scalar<character>⁠ // default: NULL (optional)

Select one of the crowd output groups. If selected, a variable is hidden across all crowd outputs if it is, for whatever reason, not displayed for the crowd named in hide_for_all_crowds_if_hidden_for_crowd. For instance,

crowd = c("target", "others"), hide_for_crowd_if_all_na = TRUE, hide_for_all_crowds_if_hidden_for_crowd = "target"

will hide variables from both the target and others outputs if all values are NA in the target group.

hide_indep_cat_for_all_crowds_if_hidden_for_crowd

Conditionally hide independent categories

⁠scalar<logical>⁠ // default: FALSE

If hide_for_all_crowds_if_hidden_for_crowd is specified, should categories of the indep variable(s) be hidden for a crowd if it does not exist for the crowds specified in hide_for_all_crowds_if_hidden_for_crowd? This is useful when e.g. indep is academic disciplines, mesos_var is institutions, and a specific institution is not interested in seeing academic disciplines they do not offer themselves.

add_n_to_dep_label, add_n_to_indep_label

Add N= to the variable label

⁠scalar<logical>⁠ // default: FALSE (optional)

For some plots and tables it is useful to attach the "N=" to the end of the label of the dependent and/or independent variable. Whether it is N or N_valid depends on your showNA-setting. See also translations$add_n_to_dep_label_prefix, translations$add_n_to_dep_label_suffix, translations$add_n_to_indep_label_prefix, translations$add_n_to_indep_label_suffix.

add_n_to_label

Add N= to the variable label of both dep and indep

⁠scalar<logical>⁠ // default: FALSE (optional)

For some plots and tables it is useful to attach the "N=" to the end of the label. Whether it is N or N_valid depends on your showNA-setting. See also translations$add_n_to_label_prefix and translations$add_n_to_label_suffix.

add_n_to_category

Add N= to the category

⁠scalar<logical>⁠ // default: FALSE (optional)

For some plots and tables it is useful to attach the "N=" to the end of the category. This will likely produce a range across the variables, hence an infix (comma) between the minimum and maximum can be specified. Whether it is N or N_valid depends on your showNA-setting. See also translations$add_n_to_category_prefix, translations$add_n_to_category_infix, and translations$add_n_to_category_suffix.
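
With the default translations shown in Usage, the category label would be assembled roughly as below. The per-variable N values and the category name "Agree" are made up for illustration, and the exact internal construction is an assumption.

```r
# Default affixes copied from the translations argument in Usage.
translations <- list(
  add_n_to_category_prefix = " (N = [",
  add_n_to_category_infix = ",",
  add_n_to_category_suffix = "])"
)

# Hypothetical counts for the category "Agree" across three variables.
n_per_variable <- c(b_1 = 25, b_2 = 10, b_3 = 18)

category_label <- paste0(
  "Agree",
  translations$add_n_to_category_prefix,
  min(n_per_variable),
  translations$add_n_to_category_infix,
  max(n_per_variable),
  translations$add_n_to_category_suffix
)
category_label
```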

totals

Include totals

⁠scalar<logical>⁠ // default: FALSE (optional)

Whether to include totals in the output.

categories_treated_as_na

NA categories

⁠vector<character>⁠ // default: NULL (optional)

Categories that should be treated as NA.

label_separator

How to separate main question from sub-question

scalar<character> // default: " - " (optional)

Separator for main question from sub-question.

error_on_duplicates

Error or warn on duplicate labels

⁠scalar<logical>⁠ // default: TRUE (optional)

Whether to abort (TRUE) or warn (FALSE) if the same label (suffix) is used across multiple variables.

showNA

Show NA categories

⁠vector<character>⁠ // default: c("ifany", "always", "never") (optional)

Choose whether to show NA categories in the results.

data_label

Data label

scalar<character> // default: "percentage_bare" (optional)

One of "proportion", "percentage", "percentage_bare", "count", "mean", or "median".

data_label_position

Data label position

⁠scalar<character>⁠ // default: "center" (optional)

Position of data labels on bars. One of "center" (middle of bar), "bottom" (bottom but inside bar), "top" (top but inside bar), or "above" (above bar outside).

html_interactive

Toggle interactive plot

⁠scalar<logical>⁠ // default: TRUE (optional)

Whether the plot is to be interactive (ggiraph) or static (ggplot2).

hide_axis_text_if_single_variable

Hide y-axis text if just a single variable

scalar<boolean> // default: TRUE (optional)

Whether to hide text on the y-axis label if just a single variable.

hide_label_if_prop_below

Hide label threshold

scalar<numeric> // default: 0.01 (optional)

Data labels are hidden for proportions below this value.

inverse

Flag to swap x-axis and faceting

⁠scalar<logical>⁠ // default: FALSE (optional)

If TRUE, swaps x-axis and faceting.

vertical

Display plot vertically

⁠scalar<logical>⁠ // default: FALSE (optional)

If TRUE, display plot vertically.

digits

Decimal places

⁠scalar<integer>⁠ // default: 0L (optional)

Number of decimal places.

data_label_decimal_symbol

Decimal symbol

⁠scalar<character>⁠ // default: "." (optional)

Decimal marker, some might prefer a comma ',' or something else entirely.

x_axis_label_width, strip_width

Label width of x-axis and strip texts in plots

scalar<integer> // default: 25 (optional)

Width of the labels used for the categorical column names in x-axis texts and strip texts.

sort_dep_by

What to sort dependent variables by

⁠vector<character>⁠ // default: ".variable_position" (optional)

Sort dependent variables in output. When using indep-argument, sorting differs between ordered factors and unordered factors: Ordering of ordered factors is always respected in output (their levels define the base order). Unordered factors will be reordered by sort_dep_by.

NULL or ".variable_position": Sort by variable position in the supplied data frame (default).
".variable_label": Sort by the variable labels.
".variable_name": Sort by the variable names.
".top": The proportion for the highest category available in the variable.
".upper": The sum of the proportions for the categories above the middle category.
".mid_upper": The sum of the proportions for the categories including and above the middle category.
".mid_lower": The sum of the proportions for the categories including and below the middle category.
".lower": The sum of the proportions for the categories below the middle category.
".bottom": The proportions for the lowest category available in the variable.
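
The positional methods can be read as sums over parts of an ordered vector of category proportions. The sketch below is a hypothetical helper that assumes the last element is the highest category and that, with an even number of categories, no category counts as the middle one.

```r
# Hypothetical sketch; props are proportions ordered from lowest to highest category.
sort_score_sketch <- function(props, method) {
  k <- length(props)
  idx <- seq_len(k)
  mid <- (k + 1) / 2   # position of the middle category (fractional if k is even)
  keep <- switch(method,
    .top = idx == k,
    .upper = idx > mid,
    .mid_upper = idx >= mid,
    .mid_lower = idx <= mid,
    .lower = idx < mid,
    .bottom = idx == 1,
    stop("Unknown method: ", method)
  )
  sum(props[keep])
}

props <- c(0.1, 0.2, 0.4, 0.2, 0.1)
sort_score_sketch(props, ".upper")      # two highest categories
sort_score_sketch(props, ".mid_upper")  # middle category and above
```

Variables would then be ordered by this score, reversed or not depending on descend.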

sort_indep_by

What to sort independent variable categories by

⁠vector<character>⁠ // default: ".factor_order" (optional)

Sort independent variable categories in output. When ".factor_order", preserves the original factor level order for the independent variable. Passing NULL is accepted and treated as ".factor_order".

NULL or ".factor_order": No sorting; preserves the original factor level order (default).
".top": The proportion for the highest category available.
".upper": The sum of the proportions for the categories above the middle category.
".mid_upper": The sum of the proportions for the categories including and above the middle category.
".mid_lower": The sum of the proportions for the categories including and below the middle category.
".lower": The sum of the proportions for the categories below the middle category.
".bottom": The proportions for the lowest category available.
character(): Character vector of category labels to sum together.

sort_by

What to sort output by (legacy)

⁠vector<character>⁠ // default: NULL (optional)

DEPRECATED: Use sort_dep_by and sort_indep_by instead for clearer control. When specified, this parameter will be used for both dependent and independent sorting. If NULL (default), dependent variables will be sorted by .variable_position.

NULL: Uses .variable_position for dependent variables, no sorting for independent.
".top": The proportion for the highest category available in the variable.
".upper": The sum of the proportions for the categories above the middle category.
".mid_upper": The sum of the proportions for the categories including and above the middle category.
".mid_lower": The sum of the proportions for the categories including and below the middle category.
".lower": The sum of the proportions for the categories below the middle category.
".bottom": The proportions for the lowest category available in the variable.
".variable_label": Sort by the variable labels.
".variable_name": Sort by the variable names.
".variable_position": Sort by the variable position in the supplied data frame.
".by_group": The groups of the by argument.
character(): Character vector of category labels to sum together.

descend

Sorting order

scalar<logical> // default: TRUE (optional)

Reverse sorting of sort_by in figures and tables. Works with both ordered and unordered factors - for ordered factors, it reverses the display order while preserving the inherent level ordering. See arrange_section_by for sorting of report sections.

descend_indep

Sorting order for independent variables

⁠scalar<logical>⁠ // default: FALSE (optional)

Reverse sorting of sort_indep_by in figures and tables. Works with both ordered and unordered factors - for ordered factors, it reverses the display order while preserving the inherent level ordering. See arrange_section_by for sorting of report sections.

labels_always_at_top, labels_always_at_bottom

Top/bottom variables

⁠vector<character>⁠ // default: NULL (optional)

Column names in data that should always be placed at the top or bottom of figures/tables.

table_wide

Pivot table wider

⁠scalar<logical>⁠ // default: FALSE (optional)

Whether to pivot table wider.

table_main_question_as_header

Table main question as header

⁠scalar<logical>⁠ // default: FALSE (optional)

Whether to include the main question as a header in the table.

n_categories_limit

Limit for cat_table_* wide format

⁠scalar<integer>⁠ // default: 12 (optional)

If there are more than this number of categories in the categorical variable, cat_table_* will have a long format instead of wide format.

translations

Localize your output

⁠list<character>⁠

A list of translations where the name is the code and the value is the translation. See the examples.

plot_height

DOCX-setting

⁠scalar<numeric>⁠ // default: 12 (optional)

DOCX plots need a height, which currently cannot be set easily with a Quarto chunk option.

colour_palette

Colour palette

⁠vector<character>⁠ // default: NULL (optional)

Must contain at least as many colours as there are unique values (including missing) in the data set.

colour_2nd_binary_cat

Colour for second binary category

⁠scalar<character>⁠ // default: "#ffffff" (optional)

Colour for the second category in binary variables. Often useful to hide this.

colour_na

Colour for NA category

⁠scalar<character>⁠ // default: NULL (optional)

Colour as a single string for NA values, if showNA is "ifany" or "always".

main_font_size, label_font_size, strip_font_size, legend_font_size

Font sizes

⁠scalar<integer>⁠ // default: 6 (optional)

ONLY FOR DOCX-OUTPUT. Other output is adjusted using e.g. ggplot2::theme() or set with a global theme (ggplot2::set_theme()). Font sizes for general text (6), data label text (3), strip text (6) and legend text (6).

font_family

Font family

⁠scalar<character>⁠ // default: "sans" (optional)

Word font family. See officer::fp_text.

path

Output path for DOCX

⁠scalar<character>⁠ // default: NULL (optional)

Path to save docx-output.

docx_template

Filename or rdocx object

⁠scalar<character>|<rdocx>-object⁠ // default: NULL (optional)

Can be either a valid character path to a reference Word file, or an existing rdocx-object in memory.

Value

ggplot-object, optionally an extended ggplot object with ggiraph features.

Examples

makeme(
  data = ex_survey,
  dep = b_1:b_2
)
makeme(
  data = ex_survey,
  dep = b_1:b_3, indep = c(x1_sex, x2_human),
  type = "sigtest_table_html"
)
makeme(
  data = ex_survey,
  dep = p_1:p_4, indep = x2_human,
  type = "cat_table_html"
)
makeme(
  data = ex_survey,
  dep = c_1:c_2, indep = x1_sex,
  type = "int_table_html"
)
makeme(
  data = ex_survey,
  dep = b_1:b_2,
  crowd = c("target", "others"),
  mesos_var = "f_uni",
  mesos_group = "Uni of A"
)

Provides a range (or single value) for N in data, given dep and indep

Description

Provides a range (or single value) for N in data, given dep and indep

Usage

n_range(
  data,
  dep,
  indep = NULL,
  mesos_var = NULL,
  mesos_group = NULL,
  glue_template_1 = "{n}",
  glue_template_2 = "[{n[1]}-{n[2]}]"
)

Arguments

data

Dataset

dep, indep

Tidyselect syntax

mesos_var

Optional, NULL or string specifying name of variable used to split dataset.

mesos_group

Optional, NULL or string specifying value in mesos_var indicating the target group.

glue_template_1, glue_template_2

Glue template strings. glue_template_1 formats the case where N is a single value; glue_template_2 formats a range, where n[1] is the minimum and n[2] the maximum.

Value

String.

Examples

n_range(data = ex_survey, dep = b_1:b_3, indep = x1_sex)
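The way the two templates divide the work can be sketched in base R. This is only an illustration under stated assumptions: format_n() is a hypothetical helper, not part of saros, and it hard-codes the two default templates instead of evaluating glue strings.

```r
# Hypothetical helper (not part of saros): mimics the default templates
# "{n}" and "[{n[1]}-{n[2]}]" without depending on glue.
format_n <- function(n) {
  n <- range(n, na.rm = TRUE)          # c(minimum, maximum)
  if (n[1] == n[2]) {
    as.character(n[1])                 # single value, as in "{n}"
  } else {
    paste0("[", n[1], "-", n[2], "]")  # range, as in "[{n[1]}-{n[2]}]"
  }
}

single_n <- format_n(c(250, 250, 250))  # all variables share the same N
range_n <- format_n(c(231, 250, 244))   # N differs across variables
```

A single value is returned when every variable has the same N; otherwise the minimum and maximum are combined into one string.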

Provides a range (or single value) for N in a `ggplot2`-object from `makeme()`

Description

Provides a range (or single value) for N in a ggplot2-object from makeme()

Usage

n_range2(ggobj, glue_template_1 = "{n}", glue_template_2 = "[{n[1]}-{n[2]}]")

Arguments

ggobj

A ggplot2-object.

glue_template_1, glue_template_2

Glue template strings. glue_template_1 formats the case where N is a single value; glue_template_2 formats a range, where n[1] is the minimum and n[2] the maximum.

Value

String.

Examples

n_range2(makeme(data = ex_survey, dep = b_1:b_3))

Obtain range of N for a given data set and other settings.

Description

Obtain range of N for a given data set and other settings.

Usage

n_rng(
  data,
  dep,
  indep = NULL,
  crowd = "all",
  mesos_var = NULL,
  mesos_group = NULL,
  glue_template_1 = "{n}",
  glue_template_2 = "[{n[1]}-{n[2]}]"
)

Arguments

data

Dataset

dep, indep

Character vector, names of (in)dependent variables

crowd

String, one of "all", "target" or "others".

mesos_var

Optional, NULL or string specifying name of variable used to split dataset.

mesos_group

Optional, NULL or string specifying value in mesos_var indicating the target group.

glue_template_1, glue_template_2

Glue template strings. glue_template_1 formats the case where N is a single value; glue_template_2 formats a range, where n[1] is the minimum and n[2] the maximum.

Value

Always a string.

Obtain range of N for a given `ggobj`.

Description

Obtain range of N for a given ggobj.

Usage

n_rng2(ggobj, glue_template_1 = "{n}", glue_template_2 = "[{n[1]}-{n[2]}]")

Arguments

ggobj

A ggplot2-object.

glue_template_1, glue_template_2

Glue template strings. glue_template_1 formats the case where N is a single value; glue_template_2 formats a range, where n[1] is the minimum and n[2] the maximum.

Value

Always a string.

Normalize Multi-Choice Arguments to Single Values

Description

Internal helper function that ensures makeme arguments that might be vectors are normalized to single values by taking the first element.

Usage

normalize_makeme_arguments(args)

Arguments

args

List of makeme function arguments

Value

Modified args list with normalized single-value arguments:

showNA: First element of showNA vector
data_label: First element of data_label vector
data_label_position: First element of data_label_position vector
type: First element of evaluated type expression
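The "take the first element" rule can be sketched as follows. normalize_first() is a hypothetical stand-in written for illustration; it skips the type field, which the real helper evaluates as an expression.

```r
# Hypothetical sketch (not the saros implementation): reduce selected
# multi-choice arguments to their first element, leave the rest untouched.
normalize_first <- function(args,
                            fields = c("showNA", "data_label",
                                       "data_label_position")) {
  for (f in fields) {
    if (!is.null(args[[f]])) {
      args[[f]] <- args[[f]][[1]]
    }
  }
  args
}

args <- list(
  showNA = c("ifany", "always", "never"),
  data_label = c("percentage_bare", "count"),
  digits = 0
)
normalized <- normalize_first(args)
```

This mirrors the common R idiom of listing all valid choices as the default vector and treating the first one as the effective default.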

Post-process Makeme Data (Legacy)

Description

Legacy function that combines factor level processing and binary category color processing. For new code, use process_indep_factor_levels() and process_binary_category_colors() directly.

Usage

post_process_makeme_data(
  data,
  indep = NULL,
  showNA = "never",
  colour_2nd_binary_cat = NULL
)

Arguments

data

Data frame containing the data

indep

Character string naming the independent variable (or NULL)

showNA

Character indicating how to handle NA values

colour_2nd_binary_cat

Color specification for second binary category

Value

Modified data frame

Process All Crowds and Generate Output

Description

Internal helper function that iterates through all crowd identifiers and generates the appropriate output for each crowd.

Usage

process_all_crowds(
  args,
  omitted_cols_list,
  kept_indep_cats_list,
  data,
  mesos_var,
  mesos_group,
  ...
)

Arguments

args

Validated list of makeme function arguments

omitted_cols_list

Named list of omitted variables for each crowd

kept_indep_cats_list

Named list of kept independent categories for each crowd

data

Data frame being analyzed

mesos_var

Mesos-level grouping variable

mesos_group

Specific mesos group identifier

...

Additional arguments passed to process_crowd_data

Value

Named list of crowd outputs:

Each element corresponds to one crowd identifier
Content depends on the specific makeme type requested
May contain plots, tables, or other analysis objects

Process Binary Category Colors

Description

Reverses the .category variable for binary categories when a special color condition is met. This is specific to categorical plot functionality.

Usage

process_binary_category_colors(
  data,
  showNA = "never",
  colour_2nd_binary_cat = NULL
)

Arguments

data

Data frame containing the data with .category column

showNA

Character indicating how to handle NA values

colour_2nd_binary_cat

Color specification for second binary category

Value

Modified data frame with potentially reversed .category levels

Process categorical data for showNA settings

Description

Handle NA categories based on showNA parameter for categorical tables

Usage

process_categorical_na(data, dots)

Arguments

data

Data frame with .category column

dots

List with showNA and indep settings

Value

Processed data frame

Process Data for a Single Crowd

Description

Internal helper function that handles the complete processing pipeline for a single crowd, from data filtering to final output generation.

Usage

process_crowd_data(
  crwd,
  args,
  omitted_cols_list,
  kept_indep_cats_list,
  data,
  mesos_var,
  mesos_group,
  ...
)

Arguments

crwd

Character string identifying the current crowd

args

List of makeme function arguments

omitted_cols_list

Named list of omitted variables for each crowd

kept_indep_cats_list

Named list of kept independent categories for each crowd

data

Data frame being analyzed

mesos_var

Mesos-level grouping variable

mesos_group

Specific mesos group identifier

...

Additional arguments passed to data summarization functions

Details

Complete processing pipeline:

Calculates omitted variables for the crowd
Filters data by crowd membership and variable exclusions
Applies independent category filtering if enabled
Detects variable types and generates data summary
Performs validation and post-processing
Generates final output via make_content()

Value

Final output object for the crowd, or NULL if no data remains:

Plot, table, or other analysis object depending on type
NULL if crowd has no valid data after filtering

Process Crowd Settings

Description

Internal helper function that reorders the crowd array to ensure priority crowds (specified in hide_for_all_crowds_if_hidden_for_crowd) are processed first.

Usage

process_crowd_settings(args)

Arguments

args

List of makeme function arguments

Value

Modified args list with reordered crowd vector:

Priority crowds (in hide_for_all_crowds_if_hidden_for_crowd) first
Remaining crowds after
This ensures global hiding logic is applied correctly

Process Independent Categories for Global Hiding Logic

Description

Internal helper function that applies global hiding logic to independent variable categories based on the hide_for_all_crowds_if_hidden_for_crowd setting.

Usage

process_global_indep_categories(
  kept_indep_cats_list,
  hide_for_all_crowds_if_hidden_for_crowd
)

Arguments

kept_indep_cats_list

Named list of kept independent categories for each crowd

hide_for_all_crowds_if_hidden_for_crowd

Character vector of crowd identifiers that determine global category exclusions

Value

Modified kept_indep_cats_list with global hiding logic applied:

For crowds not in hide_for_all_crowds_if_hidden_for_crowd: only categories that were kept in the priority crowds are retained
For priority crowds: original category lists are preserved
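A minimal sketch of this hiding rule, under two assumptions that the text above does not confirm: each crowd maps to a flat character vector of kept categories, and a category must be kept in every priority crowd to survive elsewhere. apply_global_hiding() is a hypothetical name.

```r
# Hypothetical sketch (not the saros implementation).
# kept: named list, one character vector of kept categories per crowd.
# priority: crowds whose hidden categories are hidden everywhere.
apply_global_hiding <- function(kept, priority) {
  priority <- intersect(names(kept), priority)
  if (length(priority) == 0) {
    return(kept)
  }
  # Assumption: a category survives only if all priority crowds kept it
  allowed <- Reduce(intersect, kept[priority])
  for (crwd in setdiff(names(kept), priority)) {
    kept[[crwd]] <- intersect(kept[[crwd]], allowed)
  }
  kept
}

kept <- list(
  target = c("Male", "Female"),
  others = c("Male", "Female", "Other")
)
result <- apply_global_hiding(kept, priority = "target")
```

Here "Other" is dropped from the others crowd because the target crowd did not keep it, while the target crowd's own list is unchanged.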

Process Independent Variable Factor Levels

Description

Reverses factor levels for independent variables, but only for unordered factors. Preserves the natural ordering of ordered factors.

Usage

process_indep_factor_levels(data, indep = NULL)

Arguments

data

Data frame containing the data

indep

Character string naming the independent variable (or NULL)

Value

Modified data frame with reversed factor levels for unordered factors
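The distinction between ordered and unordered factors can be sketched in base R. reverse_unordered_levels() is a hypothetical illustration, not the packaged function.

```r
# Hypothetical sketch: reverse levels only when the factor is unordered.
reverse_unordered_levels <- function(data, indep = NULL) {
  if (is.null(indep)) {
    return(data)
  }
  x <- data[[indep]]
  if (is.factor(x) && !is.ordered(x)) {
    data[[indep]] <- factor(x, levels = rev(levels(x)))
  }
  data
}

df <- data.frame(
  sex = factor(c("Male", "Female", "Male"), levels = c("Male", "Female")),
  edu = factor(c("Low", "High", "Low"), levels = c("Low", "High"),
               ordered = TRUE)
)
sex_levels <- levels(reverse_unordered_levels(df, "sex")$sex)
edu_levels <- levels(reverse_unordered_levels(df, "edu")$edu)
```

The unordered sex factor has its levels reversed, whereas the ordered edu factor keeps its inherent ordering.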

Process main question and extract suffixes

Description

Handle label separation and suffix extraction for table functions

Usage

process_main_question_and_suffixes(data, dots, col_basis)

Arguments

data

Data frame to process

dots

List from rlang::list2(...)

col_basis

Current column basis (.variable_label or .variable_name)

Value

List with processed data, main_question, and updated col_basis

Process Output Results

Description

Internal helper function that performs final processing of makeme output, including crowd renaming, NULL removal, and output simplification.

Usage

process_output_results(out, args)

Arguments

out

Named list of crowd outputs from process_all_crowds

args

List of makeme function arguments (for translations and simplify_output)

Value

Processed output in final form:

Crowds renamed according to translations if provided
NULL results removed
Single element extracted if simplify_output=TRUE and length=1
Empty data.frame returned if no valid results
Otherwise returns the full named list
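The NULL removal and simplification steps listed above might look roughly like this. finalize_output() is a hypothetical helper for illustration and leaves out the crowd renaming step.

```r
# Hypothetical sketch (not the saros implementation) of the final steps.
finalize_output <- function(out, simplify_output = TRUE) {
  out <- Filter(Negate(is.null), out)        # drop crowds with no result
  if (length(out) == 0) {
    return(data.frame())                     # nothing valid remains
  }
  if (isTRUE(simplify_output) && length(out) == 1) {
    return(out[[1]])                         # unwrap the single element
  }
  out                                        # otherwise the full named list
}

one_left <- finalize_output(list(all = NULL, target = "a plot"))
none_left <- finalize_output(list(all = NULL))
both_left <- finalize_output(list(all = "plot A", target = "plot B"))
```

Returning an empty data frame instead of NULL lets downstream code call the result without first checking for NULL.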

Process data with standard table operations

Description

Apply column selection, renaming, and independent variable handling

Usage

process_table_data(
  data,
  col_basis,
  indep_vars = NULL,
  indep_label = character(),
  main_question = "",
  use_header = FALSE,
  stat_columns = NULL,
  column_mappings = NULL
)

Arguments

data

Data frame to process

col_basis

Column basis for variables

indep_vars

Independent variable columns

indep_label

Independent variable labels

main_question

Main question for headers

use_header

Whether to use main question as header

stat_columns

Statistical columns to include

column_mappings

Additional column mappings

Value

Processed data frame

Rename Crowd Outputs

Description

Internal helper function that renames crowd identifiers in the output based on provided translations.

Usage

rename_crowd_outputs(out, translations)

Arguments

out

Named list of crowd outputs

translations

Named list of translation mappings for crowd identifiers

Value

Modified out list with crowd names translated:

Names changed according to translations with crowd prefix pattern
Only string translations are applied
Untranslated crowds retain original names
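As a rough sketch of the renaming rule, assume the "crowd prefix pattern" means translation keys of the form crowd_<identifier> (for example crowd_target). Both that key format and the helper name rename_crowds_sketch() are assumptions made here for illustration.

```r
# Hypothetical sketch. Assumes translation keys look like "crowd_<id>".
rename_crowds_sketch <- function(out, translations) {
  for (crwd in names(out)) {
    new_name <- translations[[paste0("crowd_", crwd)]]
    # Only single strings count as translations; anything else is ignored
    if (is.character(new_name) && length(new_name) == 1) {
      names(out)[names(out) == crwd] <- new_name
    }
  }
  out
}

out <- list(all = 1, target = 2)
translations <- list(crowd_target = "Your group", crowd_all = 5)
renamed <- rename_crowds_sketch(out, translations)
```

The target crowd is renamed, while all keeps its original name because its translation is not a string.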

Reorder Crowd Array Based on Hide Settings

Description

Internal helper function that reorders the crowd array to prioritize crowds specified in hide_for_all_crowds_if_hidden_for_crowd, ensuring they are processed first to determine variable exclusions early.

Usage

reorder_crowd_array(crowd, hide_for_all_crowds_if_hidden_for_crowd)

Arguments

crowd

Character vector of crowd identifiers

hide_for_all_crowds_if_hidden_for_crowd

Character vector of crowd identifiers that should be processed first to determine global exclusions

Value

Character vector with reordered crowd identifiers:

Priority crowds first (those in hide_for_all_crowds_if_hidden_for_crowd)
Remaining crowds after
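The reordering amounts to placing the priority crowds in front of the rest, which can be sketched with base R set operations. reorder_crowd_sketch() is a hypothetical name used only for this illustration.

```r
# Hypothetical sketch: priority crowds first, remaining crowds after.
reorder_crowd_sketch <- function(crowd, priority = NULL) {
  c(intersect(crowd, priority), setdiff(crowd, priority))
}

reordered <- reorder_crowd_sketch(c("all", "target", "others"), "target")
unchanged <- reorder_crowd_sketch(c("all", "target", "others"), NULL)
```

Processing the priority crowds first means their exclusions are known before any other crowd is handled.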

Code-snippets copied and modified from tidytext-package https://github.com/juliasilge/tidytext/blob/main/R/reorder_within.R

Description

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

Usage

reorder_within(x, by, within, fun = mean, sep = "___", ...)

Arguments

x

Vector

by

Vector

within

Vector (factor)

fun

Function, defaults to the mean

sep

String, separator

...

Dots

Details

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Source

"Original: Ordering categories within ggplot2 Facets" by Tyler Rinker: https://trinkerrstuff.wordpress.com/2016/12/23/ordering-categories-within-ggplot2-facets/ Based on https://opensource.org/licenses/MIT Copyright (c) 2017, Julia Silge and David Robinson
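The core idea behind the technique in the linked sources is to paste the group onto each value so that every value and group pair gets its own factor level, and then reorder those levels by a summary of by. A minimal sketch, assuming a single within vector; reorder_within_sketch() is a hypothetical name and the packaged function may differ in detail.

```r
# Hypothetical sketch of the reorder-within-group idea.
reorder_within_sketch <- function(x, by, within, fun = mean, sep = "___") {
  new_x <- paste(x, within, sep = sep)      # e.g. "a___g1"
  stats::reorder(new_x, by, FUN = fun)      # levels sorted by fun(by)
}

x <- c("a", "b", "a", "b")
within <- c("g1", "g1", "g2", "g2")
by <- c(3, 1, 2, 4)
ordered_levels <- levels(reorder_within_sketch(x, by, within))
```

The separator is what a companion axis scale would later strip from the labels, so that only the original value is displayed within each facet.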

Resolve Variable Overlaps Between Dependent and Independent Variables

Description

Internal helper function that handles cases where variables are selected for both dependent and independent roles. Automatically removes overlapping variables from the dependent list and provides user feedback.

Usage

resolve_variable_overlaps(dep, indep)

Arguments

dep

Character vector of dependent variable names

indep

Character vector of independent variable names

Details

If overlapping variables are found:

Informs user about the overlap via cli::cli_inform()
Removes overlapping variables from dep vector
Throws error if no dependent variables remain after removal

Value

Character vector of dependent variable names with overlaps removed
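The three steps above can be sketched in base R. resolve_overlaps_sketch() is a hypothetical illustration that uses message() and stop() in place of the cli functions.

```r
# Hypothetical sketch: base R messaging instead of cli::cli_inform().
resolve_overlaps_sketch <- function(dep, indep) {
  overlap <- intersect(dep, indep)
  if (length(overlap) > 0) {
    message("Removing from dep (also in indep): ",
            paste(overlap, collapse = ", "))
    dep <- setdiff(dep, overlap)
  }
  if (length(dep) == 0) {
    stop("No dependent variables remain after removing overlaps.")
  }
  dep
}

remaining <- resolve_overlaps_sketch(c("b_1", "b_2", "x1_sex"), "x1_sex")
```

Removing the overlap from dep rather than indep keeps the grouping variable intact, which matches the behaviour described above.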

Round numeric statistics

Description

Apply rounding to numeric statistical columns

Usage

round_numeric_stats(data, digits)

Arguments

data

Data frame to process

digits

Number of decimal places

Value

Data frame with rounded numeric columns
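A minimal sketch of the rounding step, assuming that every numeric column counts as a statistical column. round_numeric_sketch() is hypothetical; the packaged function may target specific columns only.

```r
# Hypothetical sketch: round all numeric columns, leave others untouched.
round_numeric_sketch <- function(data, digits) {
  is_num <- vapply(data, is.numeric, logical(1))
  data[is_num] <- lapply(data[is_num], round, digits = digits)
  data
}

stats <- data.frame(
  .variable_name = c("c_1", "c_2"),
  .mean = c(3.14159, 2.71828)
)
rounded <- round_numeric_sketch(stats, digits = 2)
```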

Set factor levels based on order columns (for backward compatibility)

Description

Set factor levels based on order columns (for backward compatibility)

Usage

set_factor_levels_from_order(data)

Arguments

data

Dataset with order columns

Value

Dataset with factor levels set according to order

Setup and Validate Makeme Arguments

Description

Internal helper function that performs final argument setup and validation before processing. Consolidates variable resolution, normalization, and validation.

Usage

setup_and_validate_makeme_args(args, data, dep_pos, indep_pos, indep)

Arguments

args

List of makeme function arguments

data

Data frame being analyzed

dep_pos

Named integer vector of dependent variable positions

indep_pos

Named integer vector of independent variable positions

indep

Independent variable selection (for validation)

Value

Modified and validated args list ready for processing:

Variable names resolved from positions
Overlaps between dep and indep resolved
Multi-choice arguments normalized
All validation checks passed
Crowd array reordered for optimal processing

Setup table data from dots

Description

Common setup logic for table functions including data extraction and early return

Usage

setup_table_data(dots)

Arguments

dots

List from rlang::list2(...)

Value

List with data and should_return flag

Shift labels_always_at

Description

Shift labels_always_at

Usage

shift_labels_always_at(data, labels_always_at = NULL, after = Inf)

Arguments

data

Dataset

labels_always_at

Labels to move to bottom or top

after

Position to move labels to (0 = top, Inf = bottom)

Value

Dataset with data$.variable_label adjusted

Apply string wrapping to variables (character or factor)

Description

A utility function that applies string wrapping to both character and factor variables, preserving factor structure while wrapping the labels.

Usage

strip_wrap_var(x, width = Inf)

Arguments

x

Variable to wrap (character or factor)

width

Maximum width for wrapping

Value

Modified variable with wrapped text
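Preserving factor structure while wrapping means rewriting the level labels rather than converting the factor to character. A sketch using base strwrap(), with the hypothetical name wrap_var_sketch() and a finite default width chosen for the illustration:

```r
# Hypothetical sketch: wrap labels, keep factors as factors.
wrap_var_sketch <- function(x, width = 20) {
  wrap_one <- function(s) paste(strwrap(s, width = width), collapse = "\n")
  if (is.factor(x)) {
    # Rewrite the labels only; level order and codes are preserved
    levels(x) <- vapply(levels(x), wrap_one, character(1), USE.NAMES = FALSE)
    x
  } else {
    vapply(x, wrap_one, character(1), USE.NAMES = FALSE)
  }
}

f <- factor(c("one two three", "short"),
            levels = c("short", "one two three"))
wrapped_chr <- wrap_var_sketch("one two three", width = 10)
wrapped_fct <- wrap_var_sketch(f, width = 10)
```

Because only the labels change, the display order of a factor in a plot or table stays as it was before wrapping.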

Given Ordered Integer Vector, Return Requested Set.

Description

Useful for identifying which categories are to be collected.

Usage

subset_vector(
  vec,
  set = c(".top", ".upper", ".mid_upper", ".lower", ".mid_lower", ".bottom", ".spread"),
  spread_n = NULL,
  sort = FALSE
)

Arguments

vec

A vector of any type.

set

A character string, one of c(".top", ".upper", ".mid_upper", ".lower", ".mid_lower", ".bottom", ".spread").

spread_n

The number of values to extract when set is ".spread".

sort

Whether to sort the output, defaults to FALSE.

Value

Selected set of vector.
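How the named sets relate to the middle category can be sketched as follows. subset_vector_sketch() is a hypothetical illustration that omits ".spread" and assumes that a vector of even length has no middle category, so that ".upper" and ".mid_upper" coincide in that case.

```r
# Hypothetical sketch (not the saros implementation).
subset_vector_sketch <- function(vec, set) {
  n <- length(vec)
  half <- floor(n / 2)        # categories strictly on one side of the middle
  with_mid <- ceiling(n / 2)  # the same side plus the middle, if there is one
  switch(set,
    .top = vec[n],
    .upper = utils::tail(vec, half),
    .mid_upper = utils::tail(vec, with_mid),
    .lower = utils::head(vec, half),
    .mid_lower = utils::head(vec, with_mid),
    .bottom = vec[1],
    stop("Set not covered by this sketch: ", set)
  )
}

upper <- subset_vector_sketch(1:5, ".upper")
mid_lower <- subset_vector_sketch(1:5, ".mid_lower")
top <- subset_vector_sketch(1:5, ".top")
```

For a five-point scale, ".upper" picks the two highest categories and ".mid_lower" picks the three lowest, including the middle one.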

Summarize a survey dataset for use in tables and graphs

Description

Summarize a survey dataset for use in tables and graphs

Usage

summarize_cat_cat_data(
  data,
  dep = colnames(data),
  indep = NULL,
  ...,
  showNA = c("ifany", "always", "never"),
  totals = FALSE,
  sort_by = ".upper",
  sort_dep_by = NULL,
  sort_indep_by = ".factor_order",
  data_label = c("percentage_bare", "percentage", "proportion", "count", "mean",
    "median"),
  digits = 0,
  add_n_to_dep_label = FALSE,
  add_n_to_indep_label = FALSE,
  add_n_to_label = FALSE,
  add_n_to_category = FALSE,
  hide_label_if_prop_below = 0.01,
  data_label_decimal_symbol = ".",
  categories_treated_as_na = NULL,
  label_separator = NULL,
  descend = FALSE,
  descend_indep = FALSE,
  labels_always_at_bottom = NULL,
  labels_always_at_top = NULL,
  translations = list(),
  call = rlang::caller_env()
)

Arguments

data

Your data.frame/tibble or srvyr-object (experimental)

data.frame // required

The data to be used for plotting.

dep, indep

Variable selections

<tidyselect> // Default: NULL, meaning everything for dep, nothing for indep.

Columns in data. dep is compulsory.

...

Dynamic dots

<dynamic-dots>

Arguments forwarded to the corresponding functions that create the elements.

showNA

Show NA categories

⁠vector<character>⁠ // default: c("ifany", "always", "never") (optional)

Choose whether to show NA categories in the results.

totals

Include totals

⁠scalar<logical>⁠ // default: FALSE (optional)

Whether to include totals in the output.

sort_by

What to sort output by (legacy)

⁠vector<character>⁠ // default: ".upper" (optional)

NULL: Uses .variable_position for dependent variables, no sorting for independent.
".top": The proportion for the highest category available in the variable.
".upper": The sum of the proportions for the categories above the middle category.
".mid_upper": The sum of the proportions for the categories including and above the middle category.
".mid_lower": The sum of the proportions for the categories including and below the middle category.
".lower": The sum of the proportions for the categories below the middle category.
".bottom": The proportion for the lowest category available in the variable.
".variable_label": Sort by the variable labels.
".variable_name": Sort by the variable names.
".variable_position": Sort by the variable position in the supplied data frame.
".by_group": The groups of the by argument.
character(): Character vector of category labels to sum together.

sort_dep_by

What to sort dependent variables by

⁠vector<character>⁠ // default: ".variable_position" (optional)

NULL or ".variable_position": Sort by variable position in the supplied data frame (default).
".variable_label": Sort by the variable labels.
".variable_name": Sort by the variable names.
".top": The proportion for the highest category available in the variable.
".upper": The sum of the proportions for the categories above the middle category.
".mid_upper": The sum of the proportions for the categories including and above the middle category.
".mid_lower": The sum of the proportions for the categories including and below the middle category.
".lower": The sum of the proportions for the categories below the middle category.
".bottom": The proportion for the lowest category available in the variable.

sort_indep_by

What to sort independent variable categories by

⁠vector<character>⁠ // default: ".factor_order" (optional)

NULL: No sorting - preserves original factor level order (default).
".top": The proportion for the highest category available.
".upper": The sum of the proportions for the categories above the middle category.
".mid_upper": The sum of the proportions for the categories including and above the middle category.
".mid_lower": The sum of the proportions for the categories including and below the middle category.
".lower": The sum of the proportions for the categories below the middle category.
".bottom": The proportion for the lowest category available.
character(): Character vector of category labels to sum together.

data_label

Data label

⁠scalar<character>⁠ // default: "percentage_bare" (optional)

One of "proportion", "percentage", "percentage_bare", "count", "mean", or "median".

digits

Decimal places

⁠scalar<integer>⁠ // default: 0L (optional)

Number of decimal places.

add_n_to_dep_label, add_n_to_indep_label

Add N= to the variable label

⁠scalar<logical>⁠ // default: FALSE (optional)

add_n_to_label

Add N= to the variable label of both dep and indep

⁠scalar<logical>⁠ // default: FALSE (optional)

add_n_to_category

Add N= to the category

⁠scalar<logical>⁠ // default: FALSE (optional)

hide_label_if_prop_below

Hide label threshold

⁠scalar<numeric>⁠ // default: 0.01 (optional)

Whether to hide label if below this value.

data_label_decimal_symbol

Decimal symbol

⁠scalar<character>⁠ // default: "." (optional)

Decimal marker, some might prefer a comma ',' or something else entirely.

categories_treated_as_na

NA categories

⁠vector<character>⁠ // default: NULL (optional)

Categories that should be treated as NA.

label_separator

How to separate main question from sub-question

⁠scalar<character>⁠ // default: NULL (optional)

Separator for main question from sub-question.

descend

Sorting order

⁠scalar<logical>⁠ // default: FALSE (optional)

descend_indep

Sorting order for independent variables

⁠scalar<logical>⁠ // default: FALSE (optional)

labels_always_at_top, labels_always_at_bottom

Top/bottom variables

⁠vector<character>⁠ // default: NULL (optional)

Column names in data that should always be placed at the top or bottom of figures/tables.

translations

Localize your output

⁠list<character>⁠

A list of translations where the name is the code and the value is the translation. See the examples.

call

Internal call

⁠obj:<call>⁠ // Default: rlang::caller_env() (optional)

The calling environment, used to attribute errors and messages to the user-facing function.

Value

Dataset with the columns: .variable_name, .variable_label, .category, .count, .count_se, .count_per_dep, .count_per_indep_group, .proportion, .proportion_se, .mean, .mean_se, .median, indep-variable(s), .data_label, .comb_categories, .sum_value, .variable_label_prefix

Summarize Data Based on Variable Types

Description

Internal helper function that determines the appropriate data summarization approach based on variable types and calls the corresponding function.

Usage

summarize_data_by_type(args, subset_data, dep_crwd, indep_crwd, ...)

Arguments

args

List of makeme function arguments

subset_data

Data frame subset for the current crowd

dep_crwd

Character vector of dependent variable names for current crowd

indep_crwd

Character vector of independent variable names for current crowd

...

Additional arguments passed to summarization functions

Value

Modified args list with data_summary element added:

For integer/numeric variables: calls summarize_int_cat_data()
For factor/ordered variables: calls summarize_cat_cat_data() with full argument set

Read tabular data from various formats

Description

A wrapper function to read data from different file formats

Usage

tabular_read(path, format, ...)

Arguments

path

Character string specifying the file path

format

Character string specifying the format: "delim", "xlsx", "csv", "csv2", "tsv", "sav", "dta"

...

Additional arguments passed to the underlying read functions

Value

A data frame containing the loaded data
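A wrapper of this kind typically dispatches on format. The sketch below covers only the formats that base R can read and is not the packaged implementation, which presumably relies on the reader packages listed under Suggests; tabular_read_sketch() is a hypothetical name.

```r
# Hypothetical sketch: dispatch on format using base R readers only.
tabular_read_sketch <- function(path, format, ...) {
  switch(format,
    csv = utils::read.csv(path, ...),
    csv2 = utils::read.csv2(path, ...),
    tsv = utils::read.delim(path, ...),
    stop("Format not covered by this sketch: ", format)
  )
}

# Round trip through a temporary CSV file
tmp <- tempfile(fileext = ".csv")
utils::write.csv(data.frame(x = 1:3, y = letters[1:3]), tmp,
                 row.names = FALSE)
df <- tabular_read_sketch(tmp, format = "csv")
```

Keeping the format explicit, rather than guessing it from the file extension, avoids surprises with files such as semicolon-separated data saved as .csv.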

Write tabular data to various formats

Description

A wrapper function to write data frames to different file formats

Usage

tabular_write(object, path, format)

Arguments

object

A data frame to write

path

Character string specifying the output file path

format

Character string specifying the format: "delim", "xlsx", "csv", "csv2", "tsv", "sav", "dta"

Value

Invisibly returns TRUE on success, used for side effects

Examples

data <- data.frame(x = 1:3, y = letters[1:3])

# Write as CSV
tabular_write(data, tempfile(fileext = ".csv"), format = "csv")

# Write as Excel
tabular_write(data, tempfile(fileext = ".xlsx"), format = "xlsx")

# Write as SPSS
tabular_write(data, tempfile(fileext = ".sav"), format = "sav")

Extract Text Summary from Categorical Mesos Plots

Description

Generates text summaries comparing two groups from categorical mesos plot data. The function identifies meaningful differences between groups based on proportions of respondents selecting specific categories and produces narrative text descriptions.

Usage

txt_from_cat_mesos_plots(
  plots,
  min_prop_diff = 0.1,
  n_highest_categories = 1,
  flip_to_lowest_categories = FALSE,
  digits = 2,
  selected_categories_last_split = " or ",
  fallback_string = character(),
  glue_str_pos =
    c(paste0("For {var}, the target group has a higher proportion of respondents ",
    "({group_1}) than all others ({group_2}) who answered {selected_categories}."),
    paste0("More respondents answered {selected_categories} for {var} in the ",
    "target group ({group_1}) than in other groups ({group_2})."),
    paste0("The statement {var} shows {selected_categories} responses are more ",
    "common in the target group ({group_1}) compared to others ({group_2}).")),
  glue_str_neg =
    c(paste0("For {var}, the target group has a lower proportion of respondents ",
    "({group_1}) than all others ({group_2}) who answered {selected_categories}."),
    paste0("Fewer respondents answered {selected_categories} for {var} in the ",
    "target group ({group_1}) than in other groups ({group_2})."),
    paste0("The statement {var} shows {selected_categories} responses are less ",
    "common in the target group ({group_1}) compared to others ({group_2})."))
)

Arguments

plots

A list of two plot objects (or data frames with plot data) to compare. Each must contain columns: .variable_label, .category, .category_order, .proportion.

min_prop_diff

Numeric. Minimum proportion difference (default 0.10) required between groups to generate text. Differences below this threshold are ignored.

n_highest_categories

Integer. Number of top categories to include in the comparison (default 1). Categories are selected based on .category_order.

flip_to_lowest_categories

Logical. If TRUE, compare lowest categories instead of highest (default FALSE).

digits

Integer. Number of decimal places for rounding proportions (default 2).

selected_categories_last_split

Character. Separator for the last item when listing multiple categories (default " or ").

fallback_string

Character. String to return when validation fails (default character()).

glue_str_pos

Character vector. Templates for positive differences (group_1 > group_2). Available placeholders: {var}, {group_1}, {group_2}, {selected_categories}.

glue_str_neg

Character vector. Templates for negative differences (group_2 > group_1). Same placeholders as glue_str_pos.

Details

The function compares proportions between two groups for each variable in the plot data. One template is randomly selected from the provided vectors for variety in output text.

Value

A character vector of text summaries, one per variable with meaningful differences. Returns empty character vector if no plots provided or no meaningful differences found.
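The comparison logic can be sketched for the simplest case: one variable, the single highest category, and one fixed template. compare_top_category() is a hypothetical helper, not the packaged function, and it skips the random template selection.

```r
# Hypothetical sketch: compare the highest category between two groups.
compare_top_category <- function(d1, d2, min_prop_diff = 0.1, digits = 2) {
  top1 <- d1[which.max(d1$.category_order), ]
  top2 <- d2[which.max(d2$.category_order), ]
  prop_diff <- top1$.proportion - top2$.proportion
  if (abs(prop_diff) < min_prop_diff) {
    return(character())            # difference too small to report
  }
  direction <- if (prop_diff > 0) "higher" else "lower"
  sprintf(
    paste0("For %s, the target group has a %s proportion of respondents ",
           "(%s) than all others (%s) who answered %s."),
    top1$.variable_label, direction,
    round(top1$.proportion, digits), round(top2$.proportion, digits),
    as.character(top1$.category)
  )
}

target <- data.frame(
  .variable_label = "Job satisfaction",
  .category = c("Low", "Medium", "High"),
  .category_order = 1:3,
  .proportion = c(0.2, 0.3, 0.5)
)
others <- target
others$.proportion <- c(0.3, 0.4, 0.3)

summary_txt <- compare_top_category(target, others)
no_txt <- compare_top_category(target, target)
```

Differences below min_prop_diff yield an empty character vector, so the result can be concatenated into a report without further checks.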

Examples

## Not run: 
# Create sample plot data
plot_data_1 <- data.frame(
  .variable_label = rep("Job satisfaction", 3),
  .category = factor(c("Low", "Medium", "High"), levels = c("Low", "Medium", "High")),
  .category_order = 1:3,
  .proportion = c(0.2, 0.3, 0.5)
)

plot_data_2 <- data.frame(
  .variable_label = rep("Job satisfaction", 3),
  .category = factor(c("Low", "Medium", "High"), levels = c("Low", "Medium", "High")),
  .category_order = 1:3,
  .proportion = c(0.3, 0.4, 0.3)
)

plots <- list(
  list(data = plot_data_1),
  list(data = plot_data_2)
)

# Generate text summaries
txt_from_cat_mesos_plots(plots, min_prop_diff = 0.10)

# Compare lowest categories instead
txt_from_cat_mesos_plots(
  plots,
  flip_to_lowest_categories = TRUE,
  min_prop_diff = 0.05
)

## End(Not run)

Validate single dependent variable requirement

Description

Common validation pattern for functions that require exactly one dependent variable.

Usage

validate_single_dep_var(dep, function_name)

Arguments

dep

Vector of dependent variables

function_name

Name of the function requiring validation (for error message)

Value

Returns nothing if valid; throws an error if invalid.
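A minimal sketch of this validation pattern in base R follows. The function check_single_dep_var() is a hypothetical stand-in written for illustration; the wording of its error message and its use of stop() are assumptions, and the actual validate_single_dep_var() may differ.

```r
# Hypothetical stand-in for the pattern described above; not the saros code.
check_single_dep_var <- function(dep, function_name) {
  if (length(dep) != 1) {
    stop(
      sprintf(
        "`%s` requires exactly one dependent variable, but %d were supplied.",
        function_name, length(dep)
      ),
      call. = FALSE
    )
  }
  invisible(NULL)
}

# Valid: a single dependent variable passes silently.
check_single_dep_var("b_1", function_name = "chr_table_html")

# Invalid: two dependent variables raise an error, captured here as text.
msg <- tryCatch(
  check_single_dep_var(c("b_1", "b_2"), function_name = "chr_table_html"),
  error = function(e) conditionMessage(e)
)
print(msg)
```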

Perform Type-Specific Validation Checks

Description

Internal helper function that validates arguments based on the specific output type requested. Different types have different constraints.

Usage

validate_type_specific_constraints(args, data, indep, dep_pos)

Arguments

args

List of makeme function arguments

data

Data frame being analyzed

indep

Character vector of independent variable names

dep_pos

Named integer vector of dependent variable positions

Details

Current type-specific validations:

chr_table_html: Requires exactly one dependent variable

Value

NULL. The function is called for its side effect of raising validation errors.
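The type-specific check listed under Details might look roughly like the sketch below. It is an assumption-laden illustration: check_type_constraints() is a hypothetical name, the requested type is assumed to be stored in args$type, and the error text is invented for the example.

```r
# Hypothetical sketch of a type-specific check; not the saros code.
# Assumes the requested output type is available as `args$type`.
check_type_constraints <- function(args, data, indep, dep_pos) {
  if (identical(args$type, "chr_table_html") && length(dep_pos) != 1) {
    stop(
      "`chr_table_html` requires exactly one dependent variable.",
      call. = FALSE
    )
  }
  invisible(NULL)
}

survey <- data.frame(comment = c("Good", "Fine"), rating = c(4, 5))

# One dependent variable: passes silently and returns NULL.
ok <- check_type_constraints(
  args = list(type = "chr_table_html"),
  data = survey,
  indep = character(),
  dep_pos = c(comment = 1L)
)

# Two dependent variables: the check raises an error.
err <- tryCatch(
  check_type_constraints(
    args = list(type = "chr_table_html"),
    data = survey,
    indep = character(),
    dep_pos = c(comment = 1L, rating = 2L)
  ),
  error = function(e) conditionMessage(e)
)
print(err)
```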

saros: Semi-Automatic Reporting of Ordinary Surveys

Description

Author(s)

See Also

Add response category ordering (only useful for long format cat-cat tables)

Description

Usage

Arguments

Value

Add dependent variable ordering

Description

Usage

Arguments

Value

Add independent variable category ordering

Description

Usage

Arguments

Value

Create sorting order variables for output dataframe

Description

Usage

Arguments

Value

Apply final arrangement based on order columns

Description

Usage

Arguments

Value

Apply label wrapping based on plot layout

Description

Usage

Arguments

Value

Apply legacy sorting adjustments for special cases

Description

Usage

Arguments

Value

Arrange output data by prespecified orders

Description

Usage

Arguments

Value

Apply sorting with optional descending order

Description

Usage

Arguments

Value

Calculate ordering based on a specific category value

Description

Usage

Arguments

Value

Calculate ordering based on a specific column value

Description

Usage

Arguments

Value

Calculate independent variable ordering based on a specific category value

Description

Usage

Arguments

Value

Calculate independent variable ordering based on a specific column value

Description

Usage

Arguments

Value

Calculate independent variable ordering based on position categories

Description

Usage

Arguments

Value

Calculate independent variable ordering based on multiple category values

Description

Usage

Arguments

Value

Calculate ordering based on multiple category values