Type: Package
Title: Semi-Automatic Reporting of Ordinary Surveys
Version: 1.6.0
Maintainer: Stephan Daus <stephus.daus@gmail.com>
Description: Offers a systematic way for conditional reporting of figures and tables for many (and bivariate combinations of) variables, typically from survey data. Contains interactive 'ggiraph'-based (https://CRAN.R-project.org/package=ggiraph) plotting functions and data frame-based summary tables (bivariate significance tests, frequencies/proportions, unique open-ended responses, etc.) with many arguments for customization, and extensions possible. Uses a global options() system for neatly reducing redundant code. Also contains tools for immediate saving of objects and returning a hashed link to the object, useful for creating download links to high-resolution images upon rendering in 'Quarto'. Suitable for highly customized reports, primarily intended for survey research.
Note: Free to use for non-Norwegian institutions, otherwise see LICENSE.
License: MIT + file LICENSE
URL: https://nifu-no.github.io/saros/, https://github.com/NIFU-NO/saros
BugReports: https://github.com/NIFU-NO/saros/issues
Depends: R (≥ 4.2.0)
Imports: cli, dplyr, forcats, fs, ggiraph, ggplot2, glue, grDevices, lifecycle, mschart, officer, rlang, stringi, stats, tidyr, tidyselect, utils, vctrs
Suggests: covr, haven, labelled, quarto, knitr, readr, scales, spelling, srvyr, survey, testthat (≥ 3.0.0), tibble, vdiffr, withr, writexl, readxl
Config/testthat/edition: 3
Encoding: UTF-8
RoxygenNote: 7.3.3
Language: en-US
VignetteBuilder: quarto
Config/Needs/website: rmarkdown
Config/testthat/parallel: true
LazyData: true
NeedsCompilation: no
Packaged: 2025-11-10 11:32:55 UTC; py128
Author: Stephan Daus [aut, cre, cph], Julia Silge [ctb] (Author of internal scale_x_reordered), David Robinson [ctb] (Author of internal scale_x_reordered), Nordic Institute for The Studies of Innovation, Research and Education (NIFU) [fnd], Kristiania University College [fnd]
Repository: CRAN
Date/Publication: 2025-11-10 12:00:02 UTC

saros: Semi-Automatic Reporting of Ordinary Surveys

Description

Offers a systematic way for conditional reporting of figures and tables for many (and bivariate combinations of) variables, typically from survey data. Contains interactive 'ggiraph'-based (https://CRAN.R-project.org/package=ggiraph) plotting functions and data frame-based summary tables (bivariate significance tests, frequencies/proportions, unique open-ended responses, etc.) with many arguments for customization, and extensions possible. Uses a global options() system for neatly reducing redundant code. Also contains tools for immediate saving of objects and returning a hashed link to the object, useful for creating download links to high-resolution images upon rendering in 'Quarto'. Suitable for highly customized reports, primarily intended for survey research.

Author(s)

Maintainer: Stephan Daus <stephus.daus@gmail.com> [copyright holder]

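Other contributors:

  • Julia Silge (Author of internal scale_x_reordered) [contributor]

  • David Robinson (Author of internal scale_x_reordered) [contributor]

  • Nordic Institute for The Studies of Innovation, Research and Education (NIFU) [funder]

  • Kristiania University College [funder]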

See Also

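Useful links:

  • https://nifu-no.github.io/saros/

  • https://github.com/NIFU-NO/saros

  • Report bugs at https://github.com/NIFU-NO/saros/issues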


Add response category ordering (only useful for long format cat-cat tables)

Description

Add response category ordering (only useful for long format cat-cat tables)

Usage

add_category_order(data, sort_by = NULL)

Arguments

data

Dataset

sort_by

Sorting method for response categories

Value

Dataset with .category_order column added


Add dependent variable ordering

Description

Add dependent variable ordering

Usage

add_dep_order(data, sort_by, descend = FALSE)

Arguments

data

Dataset

sort_by

Sorting method for dependent variables

descend

Whether to reverse the order

Value

Dataset with .dep_order column added


Add independent variable category ordering

Description

Add independent variable category ordering

Usage

add_indep_order(data, sort_by = ".factor_order", descend = FALSE)

Arguments

data

Dataset

sort_by

Sorting method for independent categories (NULL = no sorting)

descend

Whether to reverse the order

Value

Dataset with .indep_order column added


Create sorting order variables for output dataframe

Description

Applies a comprehensive sorting order to survey data. This module centralizes sorting so that all output types (tables, plots) are ordered consistently, using explicit order columns instead of relying on factor levels that can be overridden.

Usage

add_sorting_order_vars(
  data,
  sort_dep_by = ".variable_position",
  sort_indep_by = ".factor_order",
  sort_category_by = NULL,
  descend = FALSE,
  descend_indep = FALSE
)

Arguments

data

Dataset with survey results

sort_dep_by

How to sort dependent variables

sort_indep_by

How to sort independent variable categories

sort_category_by

How to sort response categories

descend

Whether to reverse the dependent variable order

descend_indep

Whether to reverse the independent variable category order

Value

Dataset with added order columns: .dep_order, .indep_order, .category_order
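
The idea of explicit order columns can be illustrated with a minimal base R sketch. This is not the package's implementation: the toy data frame, its values, and the way the orders are derived are assumptions made for illustration; only the column names .dep_order and .indep_order come from the documentation above.

```r
# Toy summary data; variable names and counts are hypothetical.
toy <- data.frame(
  .variable_name = c("b_2", "b_1", "b_2", "b_1"),
  indep = factor(c("Male", "Male", "Female", "Female"),
                 levels = c("Female", "Male")),
  .count = c(10, 20, 30, 40)
)

# Store each ordering as a plain integer column instead of relying on
# factor levels, which later processing steps could reset.
toy$.dep_order <- match(toy$.variable_name, sort(unique(toy$.variable_name)))
toy$.indep_order <- as.integer(toy$indep)

# Any output type (table or plot) can now arrange rows the same way.
toy_sorted <- toy[order(toy$.dep_order, toy$.indep_order), ]
print(toy_sorted)
```

Because the order lives in ordinary columns, it survives operations such as converting a factor to character, which would otherwise discard the level order.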


Apply final arrangement based on order columns

Description

Apply final arrangement based on order columns

Usage

apply_final_arrangement(data)

Arguments

data

Dataset with order columns

Value

Arranged dataset


Apply label wrapping based on plot layout

Description

Helper function to consistently wrap variable labels based on whether they appear on facet strips or x-axis, and whether inverse layout is used.

Usage

apply_label_wrapping(
  data,
  indep_length,
  inverse,
  strip_width,
  x_axis_label_width
)

Arguments

data

Data frame containing .variable_label column

indep_length

Number of independent variables (0 or 1)

inverse

Logical, whether inverse layout is used

strip_width

Width for facet strip labels

x_axis_label_width

Width for x-axis labels

Value

Data frame with wrapped .variable_label column
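
A rough sketch of layout-dependent label wrapping in base R follows. The helper name wrap_variable_labels() is hypothetical, and the rule for deciding whether labels sit on facet strips or on the x-axis is an assumption, not taken from the package source.

```r
# Hypothetical helper; the rule for choosing a width is an assumption.
wrap_variable_labels <- function(data, indep_length, inverse,
                                 strip_width, x_axis_label_width) {
  # Assumed rule: with one independent variable and a non-inverse layout,
  # dependent labels sit on facet strips; otherwise they sit on the x-axis.
  on_strip <- indep_length == 1 && !inverse
  width <- if (on_strip) strip_width else x_axis_label_width
  # Wrap each label and join the resulting lines with newlines.
  data$.variable_label <- vapply(
    as.character(data$.variable_label),
    function(label) paste(strwrap(label, width = width), collapse = "\n"),
    character(1),
    USE.NAMES = FALSE
  )
  data
}

toy <- data.frame(
  .variable_label = c("Beijing", "How much do you like living in Brussels")
)
wrapped <- wrap_variable_labels(
  toy, indep_length = 0, inverse = FALSE,
  strip_width = 20, x_axis_label_width = 20
)
cat(wrapped$.variable_label, sep = "\n---\n")
```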


Apply legacy sorting adjustments for special cases

Description

Apply legacy sorting adjustments for special cases

Usage

apply_legacy_sorting_adjustments(
  data,
  indep_names = character(0),
  translations = list()
)

Arguments

data

Dataset

indep_names

Independent variable names

translations

Translation strings

Value

Dataset with legacy adjustments applied


Arrange output data by prespecified orders

Description

Standard data arrangement for table functions

Usage

arrange_table_data(data, col_basis, indep_vars = NULL)

Arguments

data

Data frame to arrange

col_basis

Column to use as primary sort

indep_vars

Independent variable columns

Value

Arranged data frame


Apply sorting with optional descending order

Description

Unified helper to consistently handle ascending/descending sort order across all sorting functions.

Usage

arrange_with_order(data, order_col, descend = FALSE)

Arguments

data

Dataset to arrange

order_col

Symbol/name of the column to sort by

descend

Whether to sort in descending order

Value

Arranged dataset
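
As an illustration only, a single helper that handles both sort directions might look like the base R sketch below. The name arrange_by() is hypothetical, and it takes the column name as a string, whereas the documented function accepts a symbol/name.

```r
# Hypothetical helper: sort a data frame by one column, optionally reversed.
arrange_by <- function(data, order_col, descend = FALSE) {
  idx <- order(data[[order_col]], decreasing = descend)
  data[idx, , drop = FALSE]
}

# Toy data; values are made up.
toy <- data.frame(item = c("a", "b", "c"), .dep_order = c(2, 3, 1))
asc <- arrange_by(toy, ".dep_order")
desc <- arrange_by(toy, ".dep_order", descend = TRUE)
print(asc$item)
print(desc$item)
```

Routing every sort through one helper keeps the meaning of descend identical across dependent, independent, and category ordering.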


Calculate ordering based on a specific category value

Description

Calculate ordering based on a specific category value

Usage

calculate_category_order(data, category_value, descend = FALSE)

Arguments

data

Dataset with .category and .count columns

category_value

The category value to sort by (e.g., "A bit")

descend

Logical indicating if sorting should be descending

Value

Numeric vector of ordering values
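
A possible reading of category-based ordering, sketched in base R: rank the variables by the count observed for one chosen category, then give every row its variable's rank. The helper order_by_category() and the .variable_name grouping column are assumptions for illustration; the documentation only states that .category and .count are required.

```r
# Toy data; counts are made up.
toy <- data.frame(
  .variable_name = rep(c("b_1", "b_2", "b_3"), each = 2),
  .category = rep(c("A bit", "A lot"), times = 3),
  .count = c(5, 15, 12, 8, 9, 11)
)

# Hypothetical helper, not the package's implementation.
order_by_category <- function(data, category_value, descend = FALSE) {
  hits <- data[data$.category == category_value, ]
  # One count per variable for the chosen category.
  counts <- vapply(split(hits$.count, hits$.variable_name), sum, numeric(1))
  ranks <- rank(if (descend) -counts else counts, ties.method = "first")
  # Map each variable's rank back to every row of that variable.
  unname(ranks[data$.variable_name])
}

toy$.dep_order <- order_by_category(toy, "A bit")
dep_order_desc <- order_by_category(toy, "A bit", descend = TRUE)
print(toy)
```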


Calculate ordering based on a specific column value

Description

Calculate ordering based on a specific column value

Usage

calculate_column_order(data, column_name, descend = FALSE)

Arguments

data

Dataset

column_name

Name of the column to sort by

descend

Logical indicating if sorting should be descending

Value

Numeric vector of ordering values


Calculate independent variable ordering based on a specific category value

Description

Calculate independent variable ordering based on a specific category value

Usage

calculate_indep_category_order(
  data,
  category_value,
  indep_col,
  descend_indep = FALSE
)

Arguments

data

Dataset with independent variable columns

category_value

The category value to sort by (e.g., "Not at all")

indep_col

Name of the independent variable column

descend_indep

Logical indicating if sorting should be descending

Value

Numeric vector of ordering values


Calculate independent variable ordering based on a specific column value

Description

Calculate independent variable ordering based on a specific column value

Usage

calculate_indep_column_order(
  data,
  column_name,
  indep_col,
  descend_indep = FALSE
)

Arguments

data

Dataset

column_name

Name of the column to sort by

indep_col

Name of the independent variable column

descend_indep

Logical indicating if sorting should be descending

Value

Numeric vector of ordering values


Calculate independent variable ordering based on position categories

Description

Calculate independent variable ordering based on position categories

Usage

calculate_indep_proportion_order(
  data,
  method,
  indep_col,
  descend_indep = FALSE
)

Arguments

data

Dataset

method

Either ".upper", ".top", etc.

indep_col

Name of the independent variable column

descend_indep

Logical indicating if sorting should be descending

Value

Numeric vector of ordering values


Calculate independent variable ordering based on multiple category values

Description

Calculate independent variable ordering based on multiple category values

Usage

calculate_indep_sum_value_order(
  data,
  category_values,
  indep_col,
  descend_indep = FALSE
)

Arguments

data

Dataset

category_values

Vector of category values to sum

indep_col

Name of the independent variable column

descend_indep

Logical indicating if sorting should be descending

Value

Numeric vector of ordering values


Calculate ordering based on multiple category values

Description

Calculate ordering based on multiple category values

Usage

calculate_multiple_category_order(data, category_values, descend = FALSE)

Arguments

data

Dataset with .category and .count columns

category_values

Vector of category values to sum (e.g., c("A bit", "A lot"))

descend

Logical indicating if sorting should be descending

Value

Numeric vector of ordering values
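
Sorting by several categories at once can be sketched as summing their counts per variable before ranking. The base R example below is illustrative only; the .variable_name grouping column and all values are assumptions.

```r
# Toy data; counts are made up.
toy <- data.frame(
  .variable_name = rep(c("b_1", "b_2"), each = 3),
  .category = rep(c("Not at all", "A bit", "A lot"), times = 2),
  .count = c(10, 5, 15, 2, 12, 6)
)
category_values <- c("A bit", "A lot")

# Keep only the requested categories and sum their counts per variable.
hits <- toy[toy$.category %in% category_values, ]
summed <- vapply(split(hits$.count, hits$.variable_name), sum, numeric(1))
print(summed)

# Rank the sums and give every row the rank of its variable.
ordering <- unname(rank(summed, ties.method = "first")[toy$.variable_name])
print(ordering)
```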


Calculate proportion-based ordering for dependent variables

Description

Calculate proportion-based ordering for dependent variables

Usage

calculate_proportion_order(data, method, descend = FALSE)

Arguments

data

Dataset

method

Either ".upper", ".top", etc.

descend

Logical indicating if sorting should be descending

Value

Numeric vector of ordering values


Calculate ordering based on .sum_value (for category-based sorting)

Description

Calculate ordering based on .sum_value (for category-based sorting)

Usage

calculate_sum_value_order(data, descend = FALSE)

Arguments

data

Dataset with .sum_value column

descend

Logical indicating if sorting should be descending

Value

Numeric vector of ordering values


Convert List of Plots to Quarto Tabset

Description

Creates a Quarto tabset from a named list of ggplot2 objects, typically generated by makeme() with crowd parameter. Each plot becomes a tab with automatic height calculation and optional download links.

Usage

crowd_plots_as_tabset(
  plot_list,
  plot_type = c("cat_plot_html", "int_plot_html", "auto"),
  save = FALSE,
  fig_height = NULL,
  fig_height_int_default = 6
)

Arguments

plot_list

A named list of ggplot2 objects. Names become tab labels. Typically created with makeme(crowd = c("target", "others")).

plot_type

Character. Type of plots in the list. One of:

  • "cat_plot_html" (default): Categorical horizontal bar charts

  • "int_plot_html": Interval plots (violin/box plots)

  • "auto": Auto-detect from first non-NULL plot's data structure

save

Logical. If TRUE, generates download links for plot data and images via get_fig_title_suffix_from_ggplot(). Defaults to FALSE.

fig_height

Numeric or NULL. Manual figure height override in inches. If NULL (default), height is calculated automatically based on plot_type.

fig_height_int_default

Numeric. Default height for interval plots when auto-calculation is not available (default: 6 inches).

Details

This function is designed to be called within a Quarto document code chunk. It generates markdown that creates a tabset where each non-NULL plot in plot_list appears as a separate tab.

Height Calculation:

Requirements:

Value

Invisibly returns NULL. The function's purpose is its side effect of printing Quarto markdown that creates a tabset.

See Also

Examples

## Not run: 
# In a Quarto document
plots <- makeme(
  data = ex_survey,
  dep = b_1:b_3,
  crowd = c("target", "others"),
  mesos_var = "f_uni",
  mesos_group = "Uni of A"
)

# Create tabset with the default plot_type ("cat_plot_html")
crowd_plots_as_tabset(plots)

# Create tabset for interval plots
int_plots <- makeme(
  data = ex_survey,
  dep = c_1:c_2,
  indep = x1_sex,
  type = "int_plot_html",
  crowd = c("target", "others"),
  mesos_var = "f_uni",
  mesos_group = "Uni of A"
)
crowd_plots_as_tabset(int_plots, plot_type = "int_plot_html")

# Without download links
crowd_plots_as_tabset(plots, save = FALSE)

# With manual height override
crowd_plots_as_tabset(plots, fig_height = 8)

## End(Not run)

Detect Variable Types for Dependent and Independent Variables

Description

Internal helper function that examines the class of variables in the subset data to determine their types (factor, numeric, character, etc.).

Usage

detect_variable_types(subset_data, dep_crwd, indep_crwd)

Arguments

subset_data

Data frame subset containing the relevant variables

dep_crwd

Character vector of dependent variable names for current crowd

indep_crwd

Character vector of independent variable names for current crowd

Value

List with two elements:


Determine variable column basis

Description

Consistent logic for determining whether to use .variable_label or .variable_name

Usage

determine_variable_basis(data_summary)

Arguments

data_summary

Data frame with variable information

Value

String indicating column to use as basis


Embed Interactive Categorical Plot (DEPRECATED!)

Description

This function has been deprecated. Use makeme() instead.

Usage

embed_cat_prop_plot(
  data,
  ...,
  dep = tidyselect::everything(),
  indep = NULL,
  colour_palette = NULL,
  mesos_group = NULL,
  html_interactive = TRUE,
  inverse = FALSE
)

Arguments

data

data.frame, tibble or potentially a srvyr-object.

...

Dynamic dots, arguments forwarded to underlying function(s).

dep

tidyselect-syntax for dependent variable(s).

indep

tidyselect-syntax for an optional independent variable.

colour_palette

Character vector. Avoid using this.

mesos_group

String

html_interactive

Flag, whether to include interactivity.

inverse

Flag, whether to flip plot or table.


Embed Reactable Table (DEPRECATED!)

Description

This function has been deprecated. Use makeme() instead.

Usage

embed_cat_table(
  data,
  ...,
  dep = tidyselect::everything(),
  indep = NULL,
  mesos_group = NULL
)

Arguments

data

data.frame, tibble or potentially a srvyr-object.

...

Dynamic dots, arguments forwarded to underlying function(s).

dep

tidyselect-syntax for dependent variable(s).

indep

tidyselect-syntax for an optional independent variable.

mesos_group

String


Interactive table of text data (DEPRECATED)

Description

This function has been deprecated. Use makeme() instead.

Usage

embed_chr_table_html(data, dep, ..., mesos_group = NULL)

Arguments

data

data.frame, tibble or potentially a srvyr-object.

dep

tidyselect-syntax for dependent variable(s).

...

Dynamic dots, arguments forwarded to underlying function(s).

mesos_group

String


Evaluate Variable Selection

Description

Internal helper function that evaluates tidyselect expressions for dependent and independent variables, returning their column positions in the data frame.

Usage

evaluate_variable_selection(data, dep, indep)

Arguments

data

A data frame containing the variables to be selected

dep

Quosure or tidyselect expression for dependent variables

indep

Quosure or tidyselect expression for independent variables

Value

A list with two named elements:


ex_survey: Mockup dataset of a survey.

Description

A dataset containing fake respondents' answers to survey questions. The first two, x1_sex and x2_human, are intended to be independent variables, whereas the remaining variables are dependent. The underscore _ in variable names separates item groups (prefix) from items (suffix) (i.e. a_1-a_9 => a + 1-9), whereas ' - ' separates the same for labels. The latter corresponds to the default in SurveyXact.

Usage

ex_survey

Format

A data frame with 100 rows and 32 variables:

x1_sex

Gender

x2_human

Is respondent human?

x3_nationality

Where is the respondent born?

a_1

Do you consent to the following? - Agreement #1

a_2

Do you consent to the following? - Agreement #2

a_3

Do you consent to the following? - Agreement #3

a_4

Do you consent to the following? - Agreement #4

a_5

Do you consent to the following? - Agreement #5

a_6

Do you consent to the following? - Agreement #6

a_7

Do you consent to the following? - Agreement #7

a_8

Do you consent to the following? - Agreement #8

a_9

Do you consent to the following? - Agreement #9

b_1

How much do you like living in - Beijing

b_2

How much do you like living in - Brussels

b_3

How much do you like living in - Budapest

c_1

How many years of experience do you have in - Company A

c_2

How many years of experience do you have in - Company B

d_1

Rate your degree of confidence doing the following - Driving

d_2

Rate your degree of confidence doing the following - Drinking

d_3

Rate your degree of confidence doing the following - Driving

d_4

Rate your degree of confidence doing the following - Dancing

e_1

How often do you do the following? - Eat

e_2

How often do you do the following? - Eavesdrop

e_3

How often do you do the following? - Exercise

e_4

How often do you do the following? - Encourage someone whom you have only recently met and who struggles with simple tasks that they cannot achieve by themselves

p_1

To what extent do you agree or disagree to the following policies - Red Party

p_2

To what extent do you agree or disagree to the following policies - Green Party

p_3

To what extent do you agree or disagree to the following policies - Yellow Party

p_4

To what extent do you agree or disagree to the following policies - Blue Party

f_uni

Which of the following universities would you prefer to study at?

open_comments

Do you have any comments to the survey?

resp_status

Response status


Estimate figure height for a horizontal bar chart

Description

This function estimates the height of a figure for a horizontal bar chart based on several parameters including the number of dependent and independent variables, number of categories, maximum characters in the labels, and legend properties.

Usage

fig_height_h_barchart(
  n_y,
  n_cats_y,
  max_chars_labels_y = 20,
  max_chars_cats_y = 20,
  n_x = NULL,
  n_cats_x = NULL,
  max_chars_labels_x = NULL,
  max_chars_cats_x = NULL,
  freq = FALSE,
  x_axis_label_width = 20,
  strip_width = 20,
  strip_angle = 0,
  main_font_size = 7,
  legend_location = c("plot", "panel"),
  n_legend_lines = NULL,
  legend_key_chars_equivalence = 5,
  multiplier_per_horizontal_line = 1,
  multiplier_per_vertical_letter = 1,
  multiplier_per_facet = 1,
  multiplier_per_bar = 1,
  multiplier_per_legend_line = 1,
  multiplier_per_plot = 1,
  fixed_constant = 0,
  margin_in_cm = 0,
  figure_width_in_cm = 14,
  max = 12,
  min = 2,
  hide_axis_text_if_single_variable = FALSE,
  add_n_to_dep_label = FALSE,
  add_n_to_indep_label = FALSE,
  showNA = c("ifany", "never", "always")
)

Arguments

n_y, n_x

Integer. Number of dependent/independent variables.

n_cats_y

Integer. Number of categories across the dependent variables.

max_chars_labels_y

Integer. Maximum number of characters across the dependent variables' labels.

max_chars_cats_y

Integer. Maximum number of characters across the dependent variables' response categories (levels).

n_cats_x

Integer or NULL. Number of categories across the independent variables.

max_chars_labels_x

Integer or NULL. Maximum number of characters across the independent variables' labels.

max_chars_cats_x

Integer or NULL. Maximum number of characters across the independent variables' response categories (levels).

freq

Logical. If TRUE, frequency plot with categories next to each other. If FALSE (default), proportion plot with stacked categories.

x_axis_label_width, strip_width

Numeric. Width allocated for x-axis labels and strip labels respectively.

strip_angle

Integer. Angle of the strip text.

main_font_size

Numeric. Font size for the main text.

legend_location

Character. Location of the legend. "plot" (default) or "panel".

n_legend_lines

Integer. Number of lines in the legend.

legend_key_chars_equivalence

Integer. Approximate number of characters the legend key equals.

multiplier_per_horizontal_line

Numeric. Multiplier per horizontal line.

multiplier_per_vertical_letter

Numeric. Multiplier per vertical letter.

multiplier_per_facet

Numeric. Multiplier per facet height.

multiplier_per_bar

Numeric. Multiplier per bar height (thickness).

multiplier_per_legend_line

Numeric. Multiplier per legend line.

multiplier_per_plot

Numeric. Multiplier for entire plot estimates.

fixed_constant

Numeric. Fixed constant to be added to the height.

margin_in_cm

Numeric. Margin in centimeters.

figure_width_in_cm

Numeric. Width of the figure in centimeters.

max

Numeric. Maximum height.

min

Numeric. Minimum height.

hide_axis_text_if_single_variable

Boolean. Whether the label is hidden for single dependent variable plots.

add_n_to_dep_label, add_n_to_indep_label

Boolean. If TRUE, will add 10 characters to the max label lengths. This is primarily useful when obtaining these settings from the global environment, avoiding the need to compute this for each figure chunk.

showNA

String, one of "ifany", "always" or "never". Not yet in use.

Value

Numeric value representing the estimated height of the figure.

Examples

fig_height_h_barchart(
  n_y = 5,
  n_cats_y = 3,
  max_chars_labels_y = 20,
  max_chars_cats_y = 8,
  n_x = 1,
  n_cats_x = 4,
  max_chars_labels_x = 12,
  freq = FALSE,
  x_axis_label_width = 20,
  strip_angle = 0,
  main_font_size = 8,
  legend_location = "panel",
  n_legend_lines = 2,
  legend_key_chars_equivalence = 5,
  multiplier_per_horizontal_line = 1,
  multiplier_per_vertical_letter = .15,
  multiplier_per_facet = .95,
  multiplier_per_legend_line = 1.5,
  figure_width_in_cm = 16
)

Estimate figure height for a horizontal bar chart

Description

Taking an object from makeme(), this function estimates the height of a figure for a horizontal bar chart.

Usage

fig_height_h_barchart2(
  ggobj,
  main_font_size = 7,
  strip_angle = 0,
  freq = FALSE,
  x_axis_label_width = 20,
  strip_width = 20,
  legend_location = c("plot", "panel"),
  n_legend_lines = NULL,
  showNA = c("ifany", "never", "always"),
  legend_key_chars_equivalence = 5,
  multiplier_per_horizontal_line = NULL,
  multiplier_per_vertical_letter = 1,
  multiplier_per_facet = 1,
  multiplier_per_legend_line = 1,
  fixed_constant = 0,
  figure_width_in_cm = 14,
  margin_in_cm = 0,
  max = 8,
  min = 1
)

Arguments

ggobj

ggplot2-object

main_font_size

Numeric. Font size for the main text.

strip_angle

Integer. Angle of the strip text.

freq

Logical. If TRUE, frequency plot with categories next to each other. If FALSE (default), proportion plot with stacked categories.

x_axis_label_width, strip_width

Numeric. Width allocated for x-axis labels and strip labels respectively.

legend_location

Character. Location of the legend. "plot" (default) or "panel".

n_legend_lines

Integer. Number of lines in the legend.

showNA

String, one of "ifany", "always" or "never". Not yet in use.

legend_key_chars_equivalence

Integer. Approximate number of characters the legend key equals.

multiplier_per_horizontal_line

Numeric. Multiplier per horizontal line.

multiplier_per_vertical_letter

Numeric. Multiplier per vertical letter.

multiplier_per_facet

Numeric. Multiplier per facet height.

multiplier_per_legend_line

Numeric. Multiplier per legend line.

fixed_constant

Numeric. Fixed constant to be added to the height.

figure_width_in_cm

Numeric. Width of the figure in centimeters.

margin_in_cm

Numeric. Margin in centimeters.

max

Numeric. Maximum height.

min

Numeric. Minimum height.

Value

Numeric value representing the estimated height of the figure.

Examples

fig_height_h_barchart2(makeme(data = ex_survey, dep = b_1:b_2, indep = x1_sex))

Filter and Prepare Data for a Specific Crowd

Description

Internal helper function that filters data for a specific crowd identifier, applying variable exclusions and category filtering as needed.

Usage

filter_crowd_data(data, args, crwd, omitted_cols_list, kept_indep_cats_list)

Arguments

data

Data frame being analyzed

args

List of makeme function arguments

crwd

Character string identifying the current crowd

omitted_cols_list

Named list of omitted variables for each crowd

kept_indep_cats_list

Named list of kept independent categories for each crowd

Details

Applies the following filtering steps:

Value

List with subset data and variables for the crowd, or NULL if no data remains:


Generate Output for a Specific Crowd

Description

Internal helper function that generates the final output object for a crowd by processing data summary and calling the appropriate make_content function.

Usage

generate_crowd_output(args, subset_data, dep_crwd, indep_crwd)

Arguments

args

List of makeme function arguments

subset_data

Data frame subset for the current crowd

dep_crwd

Character vector of dependent variable names for current crowd

indep_crwd

Character vector of independent variable names for current crowd

Details

Processing steps:

Value

Output object (type depends on makeme type):


Generate Appropriate Data Summary Based on Variable Types

Description

Internal helper function that routes to the appropriate data summarization function based on the detected variable types (categorical vs continuous).

Usage

generate_data_summary(
  variable_types,
  subset_data,
  dep_crwd,
  indep_crwd,
  args,
  ...
)

Arguments

variable_types

List with dep and indep variable type information

subset_data

Data frame subset for the current crowd

dep_crwd

Character vector of dependent variable names for current crowd

indep_crwd

Character vector of independent variable names for current crowd

args

List of makeme function arguments

...

Additional arguments passed to summarization functions

Value

Data summary object (type depends on variable types):


Provide A Colour Set for A Number of Requested Colours

Description

Uses colour_palette_nominal if available. If it does not contain enough colours, falls back to a set palette from RColorBrewer.

Usage

get_colour_palette(
  data,
  col_pos,
  colour_palette_nominal = NULL,
  colour_palette_ordinal = NULL,
  colour_na = NULL,
  categories_treated_as_na = NULL,
  call = rlang::caller_env()
)

Arguments

data

Your data.frame/tibble or srvyr-object (experimental)

data.frame // required

The data to be used for plotting.

col_pos

Character vector of column names for which colours will be found.

colour_palette_nominal, colour_palette_ordinal

User specified colour set

⁠vector<character>⁠ // default: NULL (optional)

User-supplied default palette, excluding colour_na.

colour_na

Colour for NA category

⁠scalar<character>⁠ // default: NULL (optional)

Colour as a single string for NA values, if showNA is "ifany" or "always".

categories_treated_as_na

NA categories

⁠vector<character>⁠ // default: NULL (optional)

Categories that should be treated as NA.

call

Internal call

⁠obj:<call>⁠ // Default: rlang::caller_env() (optional)

The execution environment of the calling function, used when reporting errors.

Value

A colour set as character vector, where NA has the colour_na, and the rest are taken from colour_palette_nominal if available.


Provide A Colour Set for A Number of Requested Colours

Description

Uses colour_palette_nominal if available. If it does not contain enough colours, falls back to a set palette from RColorBrewer.

Usage

get_colour_set(
  x,
  common_data_type = "factor",
  colour_palette_nominal = NULL,
  colour_palette_ordinal = NULL,
  colour_na = NULL,
  colour_2nd_binary_cat = NULL,
  ordinal = FALSE,
  categories_treated_as_na = NULL,
  call = rlang::caller_env()
)

Arguments

x

Vector for which colours will be found.

common_data_type

factor or ordered data type

⁠scalar<character>⁠ // default: factor (optional)

Currently only supports factor and ordered.

colour_palette_nominal, colour_palette_ordinal

User specified colour set

⁠vector<character>⁠ // default: NULL (optional)

User-supplied default palette, excluding colour_na.

colour_na

Colour for NA category

⁠scalar<character>⁠ // default: NULL (optional)

Colour as a single string for NA values, if showNA is "ifany" or "always".

colour_2nd_binary_cat

Colour for second binary category

scalar<character> // default: NULL (optional)

Colour for the second category in binary variables. Often useful to hide this.

ordinal

⁠scalar<logical>⁠ // default: FALSE (optional)

Is palette ordinal?

categories_treated_as_na

NA categories

⁠vector<character>⁠ // default: NULL (optional)

Categories that should be treated as NA.

call

Internal call

⁠obj:<call>⁠ // Default: rlang::caller_env() (optional)

The execution environment of the calling function, used when reporting errors.

Value

A colour set as character vector, where NA has the colour_na, and the rest are taken from colour_palette_nominal if available.
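
The fallback behaviour described above can be sketched as follows. This is a simplified illustration, not the package's code: pick_colours() and its arguments are hypothetical, and grDevices::hcl.colors() stands in for the RColorBrewer palette so that the sketch needs no extra packages.

```r
# Hypothetical helper illustrating "use the user palette if large enough,
# otherwise fall back to a built-in palette".
pick_colours <- function(n, user_palette = NULL,
                         colour_na = "grey80", add_na = FALSE) {
  if (!is.null(user_palette) && length(user_palette) >= n) {
    cols <- user_palette[seq_len(n)]
  } else {
    # Fallback. The documentation mentions RColorBrewer; hcl.colors() is
    # swapped in here to keep the sketch dependency-free.
    cols <- grDevices::hcl.colors(n, palette = "Set 2")
  }
  # Optionally append the colour reserved for the NA category.
  if (add_na) c(cols, colour_na) else cols
}

enough <- pick_colours(2, user_palette = c("red", "blue", "green"))
too_few <- pick_colours(5, user_palette = c("red", "blue"))
with_na <- pick_colours(2, user_palette = c("red", "blue"), add_na = TRUE)
print(enough)
print(too_few)
print(with_na)
```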


Determine display column based on data availability

Description

Checks if .variable_label column exists and has non-NA values to determine whether to use .variable_label or .variable_name for display.

Usage

get_data_display_column(data)

Arguments

data

Data frame containing variable information

Value

Character string indicating which column to use
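
A minimal sketch of the check described above, in base R. The helper name pick_display_column() is hypothetical, and "has non-NA values" is interpreted here as "at least one non-NA value", which is an assumption.

```r
# Hypothetical re-implementation of the described check.
pick_display_column <- function(data) {
  has_labels <- ".variable_label" %in% names(data) &&
    any(!is.na(data[[".variable_label"]]))
  if (has_labels) ".variable_label" else ".variable_name"
}

# Toy inputs; values are made up.
with_labels <- data.frame(.variable_name = "b_1", .variable_label = "Beijing")
without_labels <- data.frame(.variable_name = "b_1",
                             .variable_label = NA_character_)
no_label_column <- data.frame(.variable_name = "b_1")

res_with <- pick_display_column(with_labels)
res_without <- pick_display_column(without_labels)
res_missing <- pick_display_column(no_label_column)
print(c(res_with, res_without, res_missing))
```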


Get Valid Data Labels for Figures and Tables

Description

Get Valid Data Labels for Figures and Tables

Usage

get_data_label_opts()

Value

Character vector


Determine display column for dependent variables in int_plot_html

Description

Checks if the number of dep variables matches the number of labels to determine whether to use .variable_label or .variable_name for display.

Usage

get_dep_display_column(dep_count, dep_labels)

Arguments

dep_count

Number of dependent variables

dep_labels

Vector of dependent variable labels

Value

Character string indicating which column to use
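
One way to read the count comparison is sketched below. The helper pick_dep_display_column() is hypothetical, and counting only non-NA labels is an assumption rather than documented behaviour.

```r
# Hypothetical helper: use labels only when every dependent variable has one.
pick_dep_display_column <- function(dep_count, dep_labels) {
  n_labels <- sum(!is.na(dep_labels))
  if (n_labels == dep_count) ".variable_label" else ".variable_name"
}

all_labelled <- pick_dep_display_column(2, c("Company A", "Company B"))
one_missing <- pick_dep_display_column(2, c("Company A", NA))
print(c(all_labelled, one_missing))
```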


Generate Figure Title Suffix with N Range and Optional Download Links

Description

Creates a formatted suffix for figure titles that includes the sample size (N) range from a ggplot object. Optionally generates markdown download links for both the plot data and the plot image.

Usage

get_fig_title_suffix_from_ggplot(
  plot,
  save = FALSE,
  n_equals_string = "N = ",
  file_suffixes = c(".csv", ".png"),
  link_prefixes = c("[CSV](", "[PNG]("),
  save_fns = list(utils::write.csv, saros::ggsaver),
  sep = ", "
)

Arguments

plot

A ggplot2 object, typically created by makeme().

save

Logical flag. If TRUE, generates download links for the plot data (CSV) and plot image (PNG). If FALSE (default), only returns the N range text.

n_equals_string

String. Prefix text for the sample size display (default: "N = ").

file_suffixes

Character vector. File extensions for the saved plot data and plot image files (default: c(".csv", ".png")). Should include the dot.

link_prefixes

Character vector. Markdown link text prefixes for the download links (default: c("[CSV](", "[PNG](")).

save_fns

List of functions. Functions to save the plot data and images.

sep

String. Separator between N range text and download links (default: ", ").

Details

This function is particularly useful for adding informative captions to plots in reports. The N range is calculated using n_range2(), which extracts the sample size from the plot data. When save = TRUE, the function creates downloadable files using make_link():

The function returns an AsIs object to prevent automatic character escaping in markdown/HTML contexts.
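
As a rough illustration of how such a suffix could be assembled, here is a minimal base R sketch. It is not the saros implementation: build_suffix() is a hypothetical helper, the N values are passed in directly rather than extracted by n_range2(), and the "min-max" range format is an assumption.

```r
# Hypothetical helper: joins an N range with optional Markdown links.
build_suffix <- function(n, n_equals_string = "N = ",
                         links = character(), sep = ", ") {
  rng <- range(n)
  # Assumed format: a single number if all N are equal, otherwise "min-max".
  n_txt <- if (rng[1] == rng[2]) {
    as.character(rng[1])
  } else {
    paste0(rng[1], "-", rng[2])
  }
  # I() marks the string as AsIs so that it is not escaped when rendered.
  I(paste(c(paste0(n_equals_string, n_txt), links), collapse = sep))
}

suffix <- build_suffix(c(120, 98, 120), links = "[PNG](plot.png)")
print(suffix)
```

Wrapping the result in I() is what keeps the Markdown link syntax intact when the string is later inserted into a caption.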

Value

An AsIs object (using I()) containing a character string with the N range text and, if save = TRUE, the download links, separated by sep.

Examples

# Create a sample plot
plot <- makeme(data = ex_survey, dep = b_1:b_3)

# Get just the N range text
get_fig_title_suffix_from_ggplot(plot)

# Custom N prefix
get_fig_title_suffix_from_ggplot(plot, n_equals_string = "Sample size: ")

## Not run: 
# Generate with download links (saves files to disk)
get_fig_title_suffix_from_ggplot(plot, save = TRUE)

# Custom separator and link prefixes
get_fig_title_suffix_from_ggplot(
  plot,
  save = TRUE,
  sep = " | ",
  link_prefixes = c("[Download CSV](", "[Download PNG](")
)

## End(Not run)

Get the name of the independent variable column

Description

Get the name of the independent variable column

Usage

get_indep_col_name(data)

Arguments

data

Dataset

Value

Character string with column name, or NULL if not found


Get independent variable labels

Description

Process independent variable labels with consistent logic across table functions

Usage

get_indep_labels(dots)

Arguments

dots

List from rlang::list2(...)

Value

Character vector of processed labels


Get all registered options for the type-argument in the makeme-function

Description

The makeme() function takes, for the argument type, one of several strings indicating content type and output type. This function collects all registered alternatives. Extensions are possible; see "Expanding with custom types" further below.

Built-in types:

Although the names of the types can be arbitrary, the built-in types follow a pattern. The prefix indicates which type of dependent data the type is intended for:

"cat"

Categorical (ordinal and nominal) data.

"chr"

Open ended responses and other character data.

"int"

Integer and numeric data.

The suffix indicates the output format:

"html"

Interactive HTML. Usually what you want for Quarto, as Quarto can usually convert to other formats when needed.

"docx"

Quarto's and Pandoc's docx support is currently still limited, for instance because vector graphics are converted to raster graphics for docx output. Hence, saros offers some types that output MS Chart vector graphics. Note that this is experimental and not actively developed.

"pdf"

This is basically just a shortcut for "html" with interactive=FALSE

Usage

get_makeme_types()

Value

Character vector

Further details about some of the built-in types:

"cat_plot_"

A Likert style plot for groups of categorical variables sharing the same categories.

"cat_table_"

A Likert style table.

"chr_table_"

A single-column table listing unique open ended responses.

"sigtest_table_"

See below

sigtest_table_*: Make Table with All Combinations of Univariate/Bivariate Significance Tests Based on Variable Types

Although there are hundreds of significance tests for associations between two variables, depending on distributions, variable types and assumptions, most analyses rely on a smaller set of popular tests. This function runs, for every combination of dependent and independent variables in data, a test that is suitable for that combination (though not the only possible one). It also supports univariate tests, where the null hypothesis is a mean of zero for continuous variables, or equal proportions across all categories for binary/categorical variables.

This function does not allow any adjustments; use the original underlying functions for that (chisq.test, t.test, etc.).
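
To make the idea of picking a test by variable type concrete, here is a minimal base R sketch. It is an assumption-laden illustration, not the selection logic used by saros: pick_test() is a hypothetical helper that only covers two combinations, using the two underlying functions named above.

```r
# Hypothetical helper: choose a test from the types of y (dep) and x (indep).
pick_test <- function(y, x) {
  if (is.numeric(y) && is.factor(x) && nlevels(x) == 2) {
    # Continuous outcome across two groups: Welch two-sample t-test.
    stats::t.test(y ~ x)
  } else if (is.factor(y) && is.factor(x)) {
    # Two categorical variables: chi-squared test on the contingency table.
    stats::chisq.test(table(y, x))
  } else {
    stop("No test chosen for this combination in this sketch.")
  }
}

res_num <- pick_test(mtcars$mpg, factor(mtcars$am))
res_cat <- pick_test(factor(mtcars$vs), factor(mtcars$am))
c(numeric_by_binary = res_num$p.value, binary_by_binary = res_cat$p.value)
```

Calling stats::t.test() or stats::chisq.test() directly, as here, is also the route to take when adjustments are needed.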

Expanding with custom types

makeme() calls the generic make_content(), which uses the S3-method system to dispatch to the relevant method (i.e., paste0("make_content.", type)). makeme forwards all its arguments to make_content, with the following exceptions:

  1. dep and indep are converted from dplyr::dplyr_tidy_select() syntax to simple character vectors, to simplify building your own functions.

  2. data_summary is attached, which contains many useful pieces of info for many (categorical) displays.
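
The dispatch mechanism described above can be sketched in base R as follows. Everything here is hypothetical and for illustration only: toy_content() stands in for make_content(), and "toy_count_table" is a made-up type. A real extension would presumably define a method named paste0("make_content.", type) instead; any further registration steps saros may require are not covered here.

```r
# Hypothetical generic that dispatches on a type string classed by itself.
toy_content <- function(type, ...) {
  UseMethod("toy_content", type)
}

# Fallback when no method exists for the requested type.
toy_content.default <- function(type, ...) {
  stop("No method registered for type: ", unclass(type))
}

# A made-up custom type returning a simple frequency table.
# dep arrives as a plain character vector, as described in point 1 above.
toy_content.toy_count_table <- function(type, ..., data, dep) {
  counts <- table(data[[dep[1]]])
  data.frame(category = names(counts), n = as.integer(counts))
}

# Give the type string a class named by itself, then dispatch on it.
type <- "toy_count_table"
class(type) <- type

df <- data.frame(x = c("a", "b", "a"))
res <- toy_content(type, data = df, dep = "x")
print(res)
```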

Examples

get_makeme_types()

Helper function to extract raw variable labels from the data

Description

Helper function to extract raw variable labels from the data

Usage

get_raw_labels(data, col_pos = NULL, return_as_list = FALSE)

Arguments

data

Dataset

col_pos

Optional, character vector of column names or integer vector of positions

return_as_list

Flag, whether to return as list or character vector

Value

List or character vector


Get standard column renaming function

Description

Standardized column renaming logic for table functions

Usage

get_standard_column_renamer(
  main_question = "",
  use_header = FALSE,
  column_mappings = NULL
)

Arguments

main_question

Main question for header

use_header

Whether to use main question as header

column_mappings

Named list of additional column mappings

Value

Function for renaming columns


Get target categories for positional sorting

Description

Uses subset_vector to determine which categories to include based on positional methods like .top, .bottom, .upper, .lower, etc.

Usage

get_target_categories(data, method)

Arguments

data

Dataset with .category column

method

Positional method (.top, .bottom, .upper, .lower, etc.)

Value

Character vector of target category names
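
As an illustration of what the positional methods select, here is a minimal base R sketch. It is not the saros implementation: positional_categories() is a hypothetical helper that takes a plain vector of category names instead of a dataset with a .category column, and it assumes the categories are ordered from lowest to highest.

```r
# Hypothetical helper: pick categories by position relative to the middle.
positional_categories <- function(categories, method) {
  k <- length(categories)
  idx <- seq_len(k)
  # For an even number of categories the middle falls between two
  # categories, so ".upper" and ".mid_upper" then select the same set.
  mid <- (k + 1) / 2
  keep <- switch(method,
    ".top" = idx == k,
    ".upper" = idx > mid,
    ".mid_upper" = idx >= mid,
    ".mid_lower" = idx <= mid,
    ".lower" = idx < mid,
    ".bottom" = idx == 1,
    stop("Unknown method: ", method)
  )
  categories[keep]
}

likert <- c("1", "2", "3", "4", "5")
positional_categories(likert, ".upper")
positional_categories(likert, ".mid_upper")
```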


Wrapper Function for ggplot2::ggsave()

Description

This only exists to make it easy to use it in make_link()

Usage

ggsaver(plot, filename, ...)

Arguments

plot

Plot

filename

File name (path) to save the plot to

...

Arguments forwarded to ggplot2::ggsave()

Value

No return value, called for side effects

Examples

library(ggplot2)
my_plot <- ggplot(data=mtcars, aes(x=hp, y=mpg)) + geom_point()
make_link(my_plot, folder=tempdir(), file_suffix = ".png",
          save_fn = ggsaver, width = 16, height = 16, units = "cm")
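
For readers who need a similar wrapper around another saving function, the pattern can be sketched as below. my_ggsaver() is a hypothetical name, and the body is an assumption about what such a wrapper needs to do rather than a copy of the ggsaver() source: it accepts the object first and the path second, then hands them to ggplot2::ggsave() by name.

```r
# Hypothetical wrapper: object first, path second, as make_link() expects.
my_ggsaver <- function(plot, filename, ...) {
  ggplot2::ggsave(filename = filename, plot = plot, ...)
  invisible(NULL)
}

# Only run the demonstration if ggplot2 is installed.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(mtcars, ggplot2::aes(x = hp, y = mpg)) +
    ggplot2::geom_point()
  out_file <- file.path(tempdir(), "my_plot.png")
  my_ggsaver(p, out_file, width = 16, height = 16, units = "cm")
  file.exists(out_file)
}
```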

Pull global plotting settings before displaying plot

Description

This function extends ggiraph::girafe by allowing colour palettes to be globally specified.

Usage

girafe(
  ggobj,
  ...,
  char_limit = 200,
  label_wrap_width = 80,
  interactive = TRUE,
  palette_codes = NULL,
  priority_palette_codes = NULL,
  ncol = NULL,
  byrow = TRUE,
  colour_2nd_binary_cat = NULL,
  checked = NULL,
  not_checked = NULL,
  width_svg = NULL,
  height_svg = NULL,
  pointsize = 12
)

Arguments

ggobj

ggplot2-object.

...

Dots forwarded to ggiraph::girafe()

char_limit

Integer. Number of characters to fit on a line of plot (legend-space). Will be replaced in the future with a function that guesses this.

label_wrap_width

Integer. Number of characters to fit on the axis text space before wrapping.

interactive

Boolean. Whether to produce a ggiraph-plot with interactivity (defaults to TRUE) or a static ggplot2-plot.

palette_codes

Optional list of named character vectors with names being categories and values being colours. The final character vector of the list is taken as a final resort. Defaults to NULL.

priority_palette_codes

Optional named character vector of categories (as names) with corresponding colours (as values), which are used first, whereupon the remaining unspecified categories are pulled from the last vector of palette_codes. Defaults to NULL.

ncol

Optional integer or NULL.

byrow

Whether to display legend keys by row or by column.

colour_2nd_binary_cat

Optional string. Color for the second category in binary checkbox plots. When set together with checked and not_checked, reverses the category order so that not_checked appears second and receives this color. Ignored if checkbox criteria are not met.

checked, not_checked

Optional strings. If specified and the fill categories of the plot match these, a special plot is returned where not_checked is hidden. This is useful in plots intended for checkbox responses, where leaving a box unchecked is not always a conscious choice.

pointsize, height_svg, width_svg

See ggiraph::girafe().

Value

If interactive = TRUE, called only for the side effect of generating a ggiraph plot. If interactive = FALSE, returns the modified ggobj.

Examples

plot <- makeme(data = ex_survey, dep = b_1)
girafe(plot)

Get Global Options for saros-functions

Description

Get Global Options for saros-functions

Usage

global_settings_get(fn_name = "makeme")

Arguments

fn_name

String, one of "make_link", "fig_height_h_barchart" and "makeme".

Value

List with options in R

Examples

global_settings_get()

Reset Global Options for saros-functions

Description

Reset Global Options for saros-functions

Usage

global_settings_reset(fn_name = "makeme")

Arguments

fn_name

String, one of "make_link", "fig_height_h_barchart" and "makeme".

Value

Invisibly returned list of old and new values.

Examples

global_settings_reset()

Set Global Options for saros-functions

Description

Set Global Options for saros-functions

Usage

global_settings_set(
  new,
  fn_name = "makeme",
  quiet = FALSE,
  null_deletes = FALSE
)

Arguments

new

List of arguments (see ?make_link(), ?makeme(), ?fig_height_h_barchart())

fn_name

String, one of "make_link", "fig_height_h_barchart" and "makeme".

quiet

Flag. If FALSE (default), informs about what has been set.

null_deletes

Flag. If FALSE (default), NULL elements in new become NULL elements in the option. Otherwise, the corresponding element, if present, is deleted from the option.

Value

Invisibly returned list of old and new values.

Examples

global_settings_set(new=list(digits=2))
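
The effect of null_deletes can be illustrated with base R alone. The sketch below is not how saros stores its options: update_options() is a hypothetical helper built on utils::modifyList(), whose keep.null argument behaves like the inverse of null_deletes as described above.

```r
# Hypothetical helper mirroring the documented null_deletes behaviour.
update_options <- function(old, new, null_deletes = FALSE) {
  utils::modifyList(old, new, keep.null = !null_deletes)
}

old <- list(digits = 0, totals = FALSE)

# null_deletes = FALSE: the element stays in the list, with value NULL.
kept <- update_options(old, list(totals = NULL))
names(kept)

# null_deletes = TRUE: the element is removed from the list.
dropped <- update_options(old, list(totals = NULL), null_deletes = TRUE)
names(dropped)
```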

Handle Kept and Omitted Columns for Crowds

Description

Internal helper function that processes the kept and omitted column information for crowd-based filtering and applies global hiding logic.

Usage

handle_crowd_columns(
  args,
  kept_cols_list,
  omitted_cols_list,
  kept_indep_cats_list
)

Arguments

args

List of makeme function arguments

kept_cols_list

Named list of kept column information for each crowd

omitted_cols_list

Named list of omitted variables for each crowd

kept_indep_cats_list

Named list of kept independent variable categories for each crowd

Value

List containing processed crowd column information with global hiding logic applied based on hide_for_all_crowds_if_hidden_for_crowd settings


Identify Suitable Font Given Background Hex Colour

Description

Code is taken from XXX.

Usage

hex_bw(hex_code, na_colour = "#ffffff")

Arguments

hex_code

Colour in hex-format.

Value

Colours in hex-format, either black or white.
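
One common way to choose black or white text for a given background is to threshold the perceived brightness of the background colour. The base R sketch below shows that general technique only. It is an assumption, not the algorithm used by hex_bw(): contrast_bw() is a hypothetical helper, the brightness weights and the 0.5 cut-off are choices made for this example, and na_colour is assumed to be the value returned for missing input.

```r
# Hypothetical helper: black text on light backgrounds, white on dark ones.
contrast_bw <- function(hex_code, na_colour = "#ffffff") {
  is_na <- is.na(hex_code)
  # Use a placeholder for NA so that col2rgb() only sees real colours.
  safe <- ifelse(is_na, "#000000", hex_code)
  rgb <- grDevices::col2rgb(safe)
  # Weighted brightness on a 0-1 scale (weights are an assumption).
  brightness <- (0.299 * rgb["red", ] + 0.587 * rgb["green", ] +
    0.114 * rgb["blue", ]) / 255
  out <- ifelse(brightness > 0.5, "#000000", "#ffffff")
  out[is_na] <- na_colour
  unname(out)
}

contrast_bw(c("#000000", "#ffffff", NA))
```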


Validate and Initialize Arguments

Description

Internal helper function that finalizes the arguments list by adding resolved variable names and normalizing multi-value arguments.

Usage

initialize_arguments(data, dep_pos, indep_pos, args)

Arguments

data

Data frame being analyzed

dep_pos

Named integer vector of dependent variable positions

indep_pos

Named integer vector of independent variable positions

args

List of makeme function arguments

Value

Modified args list with additional elements for the resolved variable names, and with multi-value arguments normalized.


Initialize Crowd-Based Filtering Data Structures

Description

Internal helper function that sets up the data structures needed for crowd-based filtering and processing of variables and categories.

Usage

initialize_crowd_filtering(crowd, args)

Arguments

crowd

Character vector of crowd identifiers

args

List of makeme function arguments

Details

For each crowd, this function calls keep_cols() and keep_indep_cats() to determine which variables and categories should be retained based on the various hiding criteria (NA values, sample sizes, etc.).

Value

List with three named elements:


Are All Colours in Vector Valid Colours

Description

Checks whether all colours in the vector are valid. From: https://stackoverflow.com/a/13290832/3315962

Usage

is_colour(x)

Arguments

x

Character vector of colours in hex-format.

Value

Logical, or error.
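
A widely used base R approach to this kind of check is to let grDevices::col2rgb() attempt the conversion and catch the error it raises for an invalid colour. The sketch below illustrates that approach. all_valid_colours() is a hypothetical helper and may differ from is_colour(), for instance in that it returns FALSE where is_colour() might raise an error.

```r
# Hypothetical helper: TRUE only if every element converts to RGB.
all_valid_colours <- function(x) {
  ok <- vapply(x, function(one) {
    tryCatch(
      is.matrix(grDevices::col2rgb(one)),
      error = function(e) FALSE
    )
  }, logical(1), USE.NAMES = FALSE)
  all(ok)
}

all_valid_colours(c("#ff0000", "steelblue"))
all_valid_colours(c("#ff0000", "notacolour"))
```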


Is x A String?

Description

Returns TRUE if object is a character of length 1.

Usage

is_string(x)

Arguments

x

Object

Value

Logical value.


Method for Creating Saros Contents

Description

Takes the same arguments as makeme, except that dep and indep in make_content are character vectors, for ease of user-customized function programming.

Usage

make_content(type, ...)

Arguments

type

Method name

scalar<character> with a class named by itself.

Optional string indicating the specific method. Occasionally useful for error messages, etc.

...

Dots

Arguments provided by makeme

Value

The class of the returned object depends on the type. type="*_table_html" always returns a tibble. type="*_plot_html" always returns a ggplot. type="*_docx" always returns an rdocx object if path=NULL, or has the side effect of writing a docx file to disk if path is set.


Save data to a file and return a Markdown link

Description

The file is automatically named by a hash of the object, removing the need to come up with unique file names inside a Quarto report. This has the added benefit of reducing storage needs if the objects needing linking to are identical, and all are stored in the same folder. It also allows the user to download multiple files without worrying about accidentally overwriting them.

Usage

make_link(
  data,
  folder = NULL,
  file_prefix = NULL,
  file_suffix = ".csv",
  save_fn = utils::write.csv,
  link_prefix = "[download figure data](",
  link_suffix = ")",
  ...
)

Arguments

data

Data or object

⁠<data.frame|tbl|obj>⁠

Data frame if using a tabular data save_fn, or possibly any R object, if a serializing save_fn is provided (e.g. saveRDS()).

folder

Where to store file

⁠scalar<character>⁠ // default: "." (optional)

Defaults to same folder.

file_prefix, file_suffix

File prefix/suffix

⁠scalar<character>⁠ // default: "" and ".csv" (optional)

file_suffix should include the dot before the extension.

save_fn

Saving function

function // default: utils::write.csv

Can be any saving/writing function. However, first argument must be the object to be saved, and the second must be the path. Hence, ggplot2::ggsave() must be wrapped in another function with filename and object swapped. See ggsaver() for an example of such a wrapper function.

link_prefix, link_suffix

Link prefix/suffix

scalar<character> // default: "[download figure data](" and ")"

Text placed before and after the file path in the returned link.

...

Dynamic dots

<dynamic-dots>

Arguments forwarded to the corresponding functions that create the elements.

Value

String.

Examples

make_link(mtcars, folder = tempdir())
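
The hash-based naming described above can be sketched in base R. This is an illustration under stated assumptions, not the saros implementation: hashed_link() is a hypothetical helper, and it hashes the written file with tools::md5sum() instead of hashing the R object itself, which may differ from what make_link() does.

```r
# Hypothetical helper: save an object, name the file by its hash,
# and return a Markdown link to it.
hashed_link <- function(data, folder = tempdir(), file_suffix = ".csv",
                        save_fn = utils::write.csv,
                        link_prefix = "[download figure data](",
                        link_suffix = ")", ...) {
  # Write to a temporary file first, then rename it to its content hash.
  tmp <- tempfile(tmpdir = folder, fileext = file_suffix)
  save_fn(data, tmp, ...)
  hash <- unname(tools::md5sum(tmp))
  path <- file.path(folder, paste0(hash, file_suffix))
  if (file.exists(path)) {
    unlink(tmp) # identical content is already stored once
  } else {
    file.rename(tmp, path)
  }
  paste0(link_prefix, path, link_suffix)
}

link1 <- hashed_link(head(mtcars))
link2 <- hashed_link(head(mtcars))
identical(link1, link2)
```

Because the file name depends only on the content, saving the same data twice yields the same link and a single file on disk.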

Save data to a file and return a Markdown link

Description

The file is automatically named by a hash of the object, removing the need to come up with unique file names inside a Quarto report. This has the added benefit of reducing storage needs if the objects needing linking to are identical, and all are stored in the same folder. It also allows the user to download multiple files without worrying about accidentally overwriting them.

Usage

## Default S3 method:
make_link(
  data,
  ...,
  folder = NULL,
  file_prefix = NULL,
  file_suffix = ".csv",
  save_fn = utils::write.csv,
  link_prefix = "[download figure data](",
  link_suffix = ")"
)

Arguments

data

Data or object

⁠<data.frame|tbl|obj>⁠

Data frame if using a tabular data save_fn, or possibly any R object, if a serializing save_fn is provided (e.g. saveRDS()).

...

Dynamic dots

<dynamic-dots>

Arguments forwarded to the corresponding functions that create the elements.

folder

Where to store file

⁠scalar<character>⁠ // default: "." (optional)

Defaults to same folder.

file_prefix, file_suffix

File prefix/suffix

⁠scalar<character>⁠ // default: "" and ".csv" (optional)

file_suffix should include the dot before the extension.

save_fn

Saving function

function // default: utils::write.csv

Can be any saving/writing function. However, first argument must be the object to be saved, and the second must be the path. Hence, ggplot2::ggsave() must be wrapped in another function with filename and object swapped. See ggsaver() for an example of such a wrapper function.

link_prefix, link_suffix

Link prefix/suffix

scalar<character> // default: "[download figure data](" and ")"

Text placed before and after the file path in the returned link.

Value

String.

Examples

make_link(mtcars, folder = tempdir())

Save data to a file and return a Markdown link

Description

The file is automatically named by a hash of the object, removing the need to come up with unique file names inside a Quarto report. This has the added benefit of reducing storage needs if the objects needing linking to are identical, and all are stored in the same folder. It also allows the user to download multiple files without worrying about accidentally overwriting them.

Usage

## S3 method for class 'list'
make_link(
  data,
  ...,
  folder = NULL,
  file_prefix = NULL,
  file_suffix = ".csv",
  save_fn = utils::write.csv,
  link_prefix = "[download figure data](",
  link_suffix = ")",
  separator_list_items = ". "
)

Arguments

data

Data or object

⁠<data.frame|tbl|obj>⁠

Data frame if using a tabular data save_fn, or possibly any R object, if a serializing save_fn is provided (e.g. saveRDS()).

...

Dynamic dots

<dynamic-dots>

Arguments forwarded to the corresponding functions that create the elements.

folder

Where to store file

⁠scalar<character>⁠ // default: "." (optional)

Defaults to same folder.

file_prefix, file_suffix

File prefix/suffix

⁠scalar<character>⁠ // default: "" and ".csv" (optional)

file_suffix should include the dot before the extension.

save_fn

Saving function

function // default: utils::write.csv

Can be any saving/writing function. However, first argument must be the object to be saved, and the second must be the path. Hence, ggplot2::ggsave() must be wrapped in another function with filename and object swapped. See ggsaver() for an example of such a wrapper function.

link_prefix, link_suffix

Link prefix/suffix

scalar<character> // default: "[download figure data](" and ")"

Text placed before and after the file path in the returned link.

separator_list_items

Separator string between multiple list items

⁠scalar<character>⁠ // default: ". " (optional)


Embed Interactive Plot of Various Kinds Using Tidyselect Syntax

Description

This function allows embedding of interactive or static plots based on various types of data using tidyselect syntax for variable selection.

Usage

makeme(
  data,
  dep = tidyselect::everything(),
  indep = NULL,
  type = c("cat_plot_html", "int_plot_html", "cat_table_html", "int_table_html",
    "sigtest_table_html", "cat_prop_plot_docx", "cat_freq_plot_docx", "int_plot_docx"),
  ...,
  require_common_categories = TRUE,
  crowd = c("all"),
  mesos_var = NULL,
  mesos_group = NULL,
  simplify_output = TRUE,
  hide_for_crowd_if_all_na = TRUE,
  hide_for_crowd_if_valid_n_below = 0,
  hide_for_crowd_if_category_k_below = 2,
  hide_for_crowd_if_category_n_below = 0,
  hide_for_crowd_if_cell_n_below = 0,
  hide_for_all_crowds_if_hidden_for_crowd = NULL,
  hide_indep_cat_for_all_crowds_if_hidden_for_crowd = FALSE,
  add_n_to_dep_label = FALSE,
  add_n_to_indep_label = FALSE,
  add_n_to_label = FALSE,
  add_n_to_category = FALSE,
  totals = FALSE,
  categories_treated_as_na = NULL,
  label_separator = " - ",
  error_on_duplicates = TRUE,
  showNA = c("ifany", "always", "never"),
  data_label = c("percentage_bare", "percentage", "proportion", "count", "mean",
    "median"),
  data_label_position = c("center", "bottom", "top", "above"),
  html_interactive = TRUE,
  hide_axis_text_if_single_variable = TRUE,
  hide_label_if_prop_below = 0.01,
  inverse = FALSE,
  vertical = FALSE,
  digits = 0,
  data_label_decimal_symbol = ".",
  x_axis_label_width = 25,
  strip_width = 25,
  sort_dep_by = ".variable_position",
  sort_indep_by = ".factor_order",
  sort_by = NULL,
  descend = TRUE,
  descend_indep = FALSE,
  labels_always_at_top = NULL,
  labels_always_at_bottom = NULL,
  table_wide = TRUE,
  table_main_question_as_header = FALSE,
  n_categories_limit = 12,
  translations = list(last_sep = " and ", table_heading_N = "Total (N)",
    table_heading_data_label = "%", add_n_to_dep_label_prefix = " (N = ",
    add_n_to_dep_label_suffix = ")", add_n_to_indep_label_prefix = " (N = ",
    add_n_to_indep_label_suffix = ")", add_n_to_label_prefix = " (N = ",
    add_n_to_label_suffix = ")", add_n_to_category_prefix = " (N = [",
    add_n_to_category_infix = ",", add_n_to_category_suffix = "])", by_total =
    "Everyone", sigtest_variable_header_1 = "Var 1", sigtest_variable_header_2 = "Var 2",
    crowd_all = "All", 
     crowd_target = "Target", crowd_others = "Others"),
  plot_height = 15,
  colour_palette = NULL,
  colour_2nd_binary_cat = "#ffffff",
  colour_na = "grey",
  label_font_size = 6,
  main_font_size = 6,
  strip_font_size = 6,
  legend_font_size = 6,
  font_family = "sans",
  path = NULL,
  docx_template = NULL
)
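
The crowd, mesos_var and mesos_group arguments in the signature above work together to tailor output for one group against the rest. As a conceptual illustration only, the base R sketch below splits a data frame into the three crowds. split_crowds() is a hypothetical helper, not part of saros, and how saros treats missing values in mesos_var is not covered: here they simply end up in "others".

```r
# Hypothetical helper: subset the data for each requested crowd.
split_crowds <- function(data, mesos_var, mesos_group,
                         crowd = c("target", "others", "all")) {
  is_target <- data[[mesos_var]] %in% mesos_group
  parts <- list(
    target = data[is_target, , drop = FALSE],
    others = data[!is_target, , drop = FALSE],
    all = data
  )
  parts[crowd]
}

crowds <- split_crowds(mtcars, mesos_var = "cyl", mesos_group = 4)
vapply(crowds, nrow, integer(1))
```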

Arguments

data

Your data.frame/tibble or srvyr-object (experimental)

data.frame // required

The data to be used for plotting.

dep, indep

Variable selections

<tidyselect> // default: tidyselect::everything() for dep, NULL (nothing) for indep.

Columns in data. dep is compulsory.

type

Kind of output

⁠scalar<character>⁠ // default: "cat_plot_html" (optional)

For a list of registered types in your session, use get_makeme_types().

...

Dynamic dots

<dynamic-dots>

Arguments forwarded to the corresponding functions that create the elements.

require_common_categories

Check common categories

⁠scalar<logical>⁠ // default: TRUE (optional)

Whether to check if all items share common categories.

crowd

Which group(s) to display results for

vector<character> // default: c("all") (optional)

Choose whether to produce results for the target (mesos) group, others, all, or combinations of these.

mesos_var

Variable in data indicating groups to tailor reports for

⁠scalar<character>⁠ // default: NULL (optional)

Column name in data indicating the groups for which mesos reports will be produced.

mesos_group

⁠scalar<character>⁠ // default: NULL (optional)

String, target group.

simplify_output

⁠scalar<logical>⁠ // default: TRUE

If TRUE, a list output with a single element is returned as that element itself, whereas a list with multiple elements is returned as a list.

hide_for_crowd_if_all_na

Hide variable from output if containing all NA

⁠scalar<boolean>⁠ // default: TRUE

Whether to hide a variable for a crowd if all of its values are NA (particularly useful for mesos reports).

hide_for_crowd_if_valid_n_below

Hide variable if variable has < n observations

⁠scalar<integer>⁠ // default: 0

Whether to hide a variable for a crowd if variable contains fewer than n observations (always ignoring NA).

hide_for_crowd_if_category_k_below

Hide variable if < k categories

⁠scalar<integer>⁠ // default: 2

Whether to hide a variable for a crowd if variable contains fewer than k used categories (always ignoring NA). Defaults to 2 because a unitary plot/table is rarely informative.

hide_for_crowd_if_category_n_below

Hide variable if having a category with < n observations

⁠scalar<integer>⁠ // default: 0

Whether to hide a variable for a crowd if the variable contains a category with fewer than n observations (ignoring NA). Cells with a count of 0 are not considered, as these are usually not a problem for anonymity.

hide_for_crowd_if_cell_n_below

Hide variable if having a cell with < n

⁠scalar<integer>⁠ // default: 0

Whether to hide a variable for a crowd if the combination of dep and indep results in a cell with fewer than n observations (ignoring NA). Cells with a count of 0 are not considered, as these are usually not a problem for anonymity.

hide_for_all_crowds_if_hidden_for_crowd

Conditional hiding

scalar<character> // default: NULL (optional)

Select one of the crowd output groups. If selected, a variable is hidden across all crowd outputs if it is, for whatever reason, not displayed for the crowd named in hide_for_all_crowds_if_hidden_for_crowd. For instance,

crowd = c("target", "others"), hide_for_crowd_if_all_na = TRUE, hide_for_all_crowds_if_hidden_for_crowd = "target"

will hide variables from both the target and the others outputs if all values are NA in the target group.

hide_indep_cat_for_all_crowds_if_hidden_for_crowd

Conditionally hide independent categories

⁠scalar<logical>⁠ // default: FALSE

If hide_for_all_crowds_if_hidden_for_crowd is specified, should categories of the indep variable(s) be hidden for a crowd if they do not exist for the crowds specified in hide_for_all_crowds_if_hidden_for_crowd? This is useful when, for example, indep is academic disciplines, mesos_var is institutions, and a specific institution is not interested in seeing academic disciplines it does not offer itself.

add_n_to_dep_label, add_n_to_indep_label

Add N= to the variable label

⁠scalar<logical>⁠ // default: FALSE (optional)

For some plots and tables it is useful to attach the "N=" to the end of the label of the dependent and/or independent variable. Whether it is N or N_valid depends on your showNA-setting. See also translations$add_n_to_dep_label_prefix, translations$add_n_to_dep_label_suffix, translations$add_n_to_indep_label_prefix, translations$add_n_to_indep_label_suffix.

add_n_to_label

Add N= to the variable label of both dep and indep

⁠scalar<logical>⁠ // default: FALSE (optional)

For some plots and tables it is useful to attach the "N=" to the end of the label. Whether it is N or N_valid depends on your showNA-setting. See also translations$add_n_to_label_prefix and translations$add_n_to_label_suffix.

add_n_to_category

Add N= to the category

⁠scalar<logical>⁠ // default: FALSE (optional)

For some plots and tables it is useful to attach the "N=" to the end of the category. This will likely produce a range across the variables, hence an infix (comma) between the minimum and maximum can be specified. Whether it is N or N_valid depends on your showNA-setting. See also translations$add_n_to_category_prefix, translations$add_n_to_category_infix, and translations$add_n_to_category_suffix.

totals

Include totals

⁠scalar<logical>⁠ // default: FALSE (optional)

Whether to include totals in the output.

categories_treated_as_na

NA categories

⁠vector<character>⁠ // default: NULL (optional)

Categories that should be treated as NA.

label_separator

How to separate main question from sub-question

scalar<character> // default: " - " (optional)

Separator between the main question and the sub-question.

error_on_duplicates

Error or warn on duplicate labels

⁠scalar<logical>⁠ // default: TRUE (optional)

Whether to abort (TRUE) or warn (FALSE) if the same label (suffix) is used across multiple variables.

showNA

Show NA categories

⁠vector<character>⁠ // default: c("ifany", "always", "never") (optional)

Choose whether to show NA categories in the results.

data_label

Data label

scalar<character> // default: "percentage_bare" (optional)

One of "percentage_bare", "percentage", "proportion", "count", "mean", or "median".

data_label_position

Data label position

⁠scalar<character>⁠ // default: "center" (optional)

Position of data labels on bars. One of "center" (middle of bar), "bottom" (bottom but inside bar), "top" (top but inside bar), or "above" (above bar outside).

html_interactive

Toggle interactive plot

⁠scalar<logical>⁠ // default: TRUE (optional)

Whether the plot is to be interactive (ggiraph) or static (ggplot2).

hide_axis_text_if_single_variable

Hide y-axis text if just a single variable

scalar<boolean> // default: TRUE (optional)

Whether to hide the y-axis text if there is just a single variable.

hide_label_if_prop_below

Hide label threshold

scalar<numeric> // default: 0.01 (optional)

Hide a data label if its proportion is below this value.

inverse

Flag to swap x-axis and faceting

⁠scalar<logical>⁠ // default: FALSE (optional)

If TRUE, swaps x-axis and faceting.

vertical

Display plot vertically

⁠scalar<logical>⁠ // default: FALSE (optional)

If TRUE, display plot vertically.

digits

Decimal places

⁠scalar<integer>⁠ // default: 0L (optional)

Number of decimal places.

data_label_decimal_symbol

Decimal symbol

⁠scalar<character>⁠ // default: "." (optional)

Decimal marker, some might prefer a comma ',' or something else entirely.

x_axis_label_width, strip_width

Label width of x-axis and strip texts in plots

scalar<integer> // default: 25 (optional)

Width of the labels used for the categorical column names in x-axis texts and strip texts.

sort_dep_by

What to sort dependent variables by

⁠vector<character>⁠ // default: ".variable_position" (optional)

Sort dependent variables in output. When using indep-argument, sorting differs between ordered factors and unordered factors: Ordering of ordered factors is always respected in output (their levels define the base order). Unordered factors will be reordered by sort_dep_by.

NULL or ".variable_position"

Sort by variable position in the supplied data frame (default).

".variable_label"

Sort by the variable labels.

".variable_name"

Sort by the variable names.

".top"

The proportion for the highest category available in the variable.

".upper"

The sum of the proportions for the categories above the middle category.

".mid_upper"

The sum of the proportions for the categories including and above the middle category.

".mid_lower"

The sum of the proportions for the categories including and below the middle category.

".lower"

The sum of the proportions for the categories below the middle category.

".bottom"

The proportion for the lowest category available in the variable.

sort_indep_by

What to sort independent variable categories by

⁠vector<character>⁠ // default: ".factor_order" (optional)

Sort independent variable categories in output. When ".factor_order", preserves the original factor level order for the independent variable. Passing NULL is accepted and treated as ".factor_order".

NULL

No sorting - preserves original factor level order (default).

".top"

The proportion for the highest category available.

".upper"

The sum of the proportions for the categories above the middle category.

".mid_upper"

The sum of the proportions for the categories including and above the middle category.

".mid_lower"

The sum of the proportions for the categories including and below the middle category.

".lower"

The sum of the proportions for the categories below the middle category.

".bottom"

The proportion for the lowest category available.

character()

Character vector of category labels to sum together.

sort_by

What to sort output by (legacy)

⁠vector<character>⁠ // default: NULL (optional)

DEPRECATED: Use sort_dep_by and sort_indep_by instead for clearer control. When specified, this parameter will be used for both dependent and independent sorting. If NULL (default), dependent variables will be sorted by .variable_position.

NULL

Uses .variable_position for dependent variables, no sorting for independent.

".top"

The proportion for the highest category available in the variable.

".upper"

The sum of the proportions for the categories above the middle category.

".mid_upper"

The sum of the proportions for the categories including and above the middle category.

".mid_lower"

The sum of the proportions for the categories including and below the middle category.

".lower"

The sum of the proportions for the categories below the middle category.

".bottom"

The proportion for the lowest category available in the variable.

".variable_label"

Sort by the variable labels.

".variable_name"

Sort by the variable names.

".variable_position"

Sort by the variable position in the supplied data frame.

".by_group"

The groups of the by argument.

character()

Character vector of category labels to sum together.

descend

Sorting order

scalar<logical> // default: TRUE (optional)

Reverse sorting of sort_by in figures and tables. Works with both ordered and unordered factors: for ordered factors, it reverses the display order while preserving the inherent level ordering. See arrange_section_by for sorting of report sections.

descend_indep

Sorting order for independent variables

⁠scalar<logical>⁠ // default: FALSE (optional)

Reverse sorting of sort_indep_by in figures and tables. Works with both ordered and unordered factors - for ordered factors, it reverses the display order while preserving the inherent level ordering. See arrange_section_by for sorting of report sections.

labels_always_at_top, labels_always_at_bottom

Top/bottom variables

⁠vector<character>⁠ // default: NULL (optional)

Column names in data that should always be placed at the top or bottom of figures/tables.

table_wide

Pivot table wider

⁠scalar<logical>⁠ // default: FALSE (optional)

Whether to pivot table wider.

table_main_question_as_header

Table main question as header

⁠scalar<logical>⁠ // default: FALSE (optional)

Whether to include the main question as a header in the table.

n_categories_limit

Limit for cat_table_* wide format

⁠scalar<integer>⁠ // default: 12 (optional)

If there are more than this number of categories in the categorical variable, cat_table_* will have a long format instead of wide format.

translations

Localize your output

⁠list<character>⁠

A list of translations where the name is the code and the value is the translation. See the examples.

plot_height

DOCX-setting

⁠scalar<numeric>⁠ // default: 12 (optional)

DOCX plots need a height, which currently cannot be set easily with a Quarto chunk option.

colour_palette

Colour palette

⁠vector<character>⁠ // default: NULL (optional)

Must contain at least as many colours as there are unique values (including missing) in the data set.

colour_2nd_binary_cat

Colour for second binary category

⁠scalar<character>⁠ // default: "#ffffff" (optional)

Colour for the second category in binary variables. Often useful to hide this.

colour_na

Colour for NA category

⁠scalar<character>⁠ // default: NULL (optional)

Colour as a single string for NA values, if showNA is "ifany" or "always".

main_font_size, label_font_size, strip_font_size, legend_font_size

Font sizes

⁠scalar<integer>⁠ // default: 6 (optional)

ONLY FOR DOCX-OUTPUT. Other output is adjusted using e.g. ggplot2::theme() or set with a global theme (ggplot2::set_theme()). Font sizes for general text (6), data label text (3), strip text (6) and legend text (6).

font_family

Font family

⁠scalar<character>⁠ // default: "sans" (optional)

Word font family. See officer::fp_text.

path

Output path for DOCX

⁠scalar<character>⁠ // default: NULL (optional)

Path to save docx-output.

docx_template

Filename or rdocx object

⁠scalar<character>|<rdocx>-object⁠ // default: NULL (optional)

Can be either a valid character path to a reference Word file, or an existing rdocx-object in memory.

Value

ggplot-object, optionally an extended ggplot object with ggiraph features.

Examples

makeme(
  data = ex_survey,
  dep = b_1:b_2
)
makeme(
  data = ex_survey,
  dep = b_1:b_3, indep = c(x1_sex, x2_human),
  type = "sigtest_table_html"
)
makeme(
  data = ex_survey,
  dep = p_1:p_4, indep = x2_human,
  type = "cat_table_html"
)
makeme(
  data = ex_survey,
  dep = c_1:c_2, indep = x1_sex,
  type = "int_table_html"
)
makeme(
  data = ex_survey,
  dep = b_1:b_2,
  crowd = c("target", "others"),
  mesos_var = "f_uni",
  mesos_group = "Uni of A"
)

Provides a range (or single value) for N in data, given dep and indep

Description

Provides a range (or single value) for N in data, given dep and indep

Usage

n_range(
  data,
  dep,
  indep = NULL,
  mesos_var = NULL,
  mesos_group = NULL,
  glue_template_1 = "{n}",
  glue_template_2 = "[{n[1]}-{n[2]}]"
)

Arguments

data

Dataset

dep, indep

Tidyselect syntax

mesos_var

Optional, NULL or string specifying name of variable used to split dataset.

mesos_group

Optional, NULL or string specifying value in mesos_var indicating the target group.

glue_template_1, glue_template_2

Strings used as glue templates. glue_template_1 is used when N is a single value; glue_template_2 is used when N is a range, with n[1] as the minimum and n[2] as the maximum.

Value

String.

Examples

n_range(data = ex_survey, dep = b_1:b_3, indep = x1_sex)
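
How the two templates interact can be sketched in base R. The sketch assumes that N is the number of non-missing observations per dependent variable, which is a guess at the counting rule and not documented behaviour. fill_template() and format_n() are hypothetical helpers, and fill_template() is only a small base R stand-in for glue-style {} interpolation.

```r
# Hypothetical stand-in for glue-style interpolation: evaluates each
# {expression} in `template` with `n` available as a variable.
fill_template <- function(template, n) {
  m <- gregexpr("\\{[^{}]+\\}", template)
  placeholders <- regmatches(template, m)[[1]]
  values <- vapply(placeholders, function(p) {
    code <- substr(p, 2, nchar(p) - 1)
    as.character(eval(parse(text = code), list(n = n)))
  }, character(1), USE.NAMES = FALSE)
  regmatches(template, m) <- list(values)
  template
}

# Hypothetical helper: one template for a single N, another for a range.
format_n <- function(n_values,
                     glue_template_1 = "{n}",
                     glue_template_2 = "[{n[1]}-{n[2]}]") {
  n <- range(n_values)
  if (n[1] == n[2]) {
    fill_template(glue_template_1, n[1])
  } else {
    fill_template(glue_template_2, n)
  }
}

# Toy data (not ex_survey): b_2 has one missing answer.
toy <- data.frame(b_1 = c(1, 2, 1, 2), b_2 = c(1, NA, 2, 2))
n_per_dep <- colSums(!is.na(toy))  # assumed counting rule

format_n(n_per_dep)         # "[3-4]"
format_n(n_per_dep["b_1"])  # "4"
format_n(n_per_dep, glue_template_2 = "N = {n[1]} to {n[2]}")  # "N = 3 to 4"
```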

Provides a range (or single value) for N in a ggplot2-object from makeme()

Description

Provides a range (or single value) for N in a ggplot2-object from makeme()

Usage

n_range2(ggobj, glue_template_1 = "{n}", glue_template_2 = "[{n[1]}-{n[2]}]")

Arguments

ggobj

A ggplot2-object.

glue_template_1, glue_template_2

Strings used as glue templates. glue_template_1 is used when N is a single value; glue_template_2 is used when N is a range, with n[1] as the minimum and n[2] as the maximum.

Value

String.

Examples

n_range2(makeme(data = ex_survey, dep = b_1:b_3))

Obtain range of N for a given data set and other settings.

Description

Obtain range of N for a given data set and other settings.

Usage

n_rng(
  data,
  dep,
  indep = NULL,
  crowd = "all",
  mesos_var = NULL,
  mesos_group = NULL,
  glue_template_1 = "{n}",
  glue_template_2 = "[{n[1]}-{n[2]}]"
)

Arguments

data

Dataset

dep, indep

Character vector, names of (in)dependent variables

crowd

String, one of "all", "target" or "others".

mesos_var

Optional, NULL or string specifying name of variable used to split dataset.

mesos_group

Optional, NULL or string specifying value in mesos_var indicating the target group.

glue_template_1, glue_template_2

Strings used as glue templates. glue_template_1 is used when N is a single value; glue_template_2 is used when N is a range, with n[1] as the minimum and n[2] as the maximum.

Value

Always a string.
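
The relationship between crowd, mesos_var and mesos_group can be illustrated with a minimal base R sketch. It assumes that "target" means the rows where mesos_var equals mesos_group, "others" means the remaining rows, and "all" means every row. filter_crowd() is a hypothetical helper, and treating missing mesos_var values as "others" is an assumption of this sketch.

```r
# Hypothetical helper: return the rows that belong to one crowd.
filter_crowd <- function(data, crowd = "all",
                         mesos_var = NULL, mesos_group = NULL) {
  if (crowd == "all") return(data)
  # Rows with a missing mesos_var value are counted as "others" here.
  is_target <- !is.na(data[[mesos_var]]) & data[[mesos_var]] == mesos_group
  switch(crowd,
    target = data[is_target, , drop = FALSE],
    others = data[!is_target, , drop = FALSE],
    stop("Unknown crowd: ", crowd)
  )
}

# Toy data (not ex_survey)
toy <- data.frame(
  f_uni = c("Uni of A", "Uni of A", "Uni of B", "Uni of C"),
  b_1 = c("Yes", "No", "Yes", "Yes")
)

nrow(filter_crowd(toy, "all"))                          # 4
nrow(filter_crowd(toy, "target", "f_uni", "Uni of A"))  # 2
nrow(filter_crowd(toy, "others", "f_uni", "Uni of A"))  # 2
```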


Obtain range of N for a given ggobj.

Description

Obtain range of N for a given ggobj.

Usage

n_rng2(ggobj, glue_template_1 = "{n}", glue_template_2 = "[{n[1]}-{n[2]}]")

Arguments

ggobj

A ggplot2-object.

glue_template_1, glue_template_2

Strings used as glue templates. glue_template_1 is used when N is a single value; glue_template_2 is used when N is a range, with n[1] as the minimum and n[2] as the maximum.

Value

Always a string.


Normalize Multi-Choice Arguments to Single Values

Description

Internal helper function that ensures makeme arguments that might be vectors are normalized to single values by taking the first element.

Usage

normalize_makeme_arguments(args)

Arguments

args

List of makeme function arguments

Value

Modified args list with normalized single-value arguments.
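
Taking the first element of a multi-choice argument can be sketched as follows. normalize_first() is a hypothetical helper, and which arguments the internal function actually normalizes is not stated here, so the names passed in are assumptions.

```r
# Hypothetical helper: reduce the listed arguments to their first element.
normalize_first <- function(args, names_to_normalize) {
  for (nm in intersect(names_to_normalize, names(args))) {
    if (length(args[[nm]]) > 1) args[[nm]] <- args[[nm]][[1]]
  }
  args
}

args <- list(
  showNA = c("ifany", "always", "never"),  # multi-choice default
  categories_treated_as_na = c("Don't know", "Not relevant"),
  digits = 0
)

out <- normalize_first(args, names_to_normalize = "showNA")
out$showNA                            # "ifany"
length(out$categories_treated_as_na)  # 2, not listed so left untouched
```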


Post-process Makeme Data (Legacy)

Description

Legacy function that combines both factor level processing and binary category color processing. Use the individual functions for new code.

Usage

post_process_makeme_data(
  data,
  indep = NULL,
  showNA = "never",
  colour_2nd_binary_cat = NULL
)

Arguments

data

Data frame containing the data

indep

Character string naming the independent variable (or NULL)

showNA

Character indicating how to handle NA values

colour_2nd_binary_cat

Color specification for second binary category

Value

Modified data frame


Process All Crowds and Generate Output

Description

Internal helper function that iterates through all crowd identifiers and generates the appropriate output for each crowd.

Usage

process_all_crowds(
  args,
  omitted_cols_list,
  kept_indep_cats_list,
  data,
  mesos_var,
  mesos_group,
  ...
)

Arguments

args

Validated list of makeme function arguments

omitted_cols_list

Named list of omitted variables for each crowd

kept_indep_cats_list

Named list of kept independent categories for each crowd

data

Data frame being analyzed

mesos_var

Mesos-level grouping variable

mesos_group

Specific mesos group identifier

...

Additional arguments passed to process_crowd_data

Value

Named list of crowd outputs.


Process Binary Category Colors

Description

Reverses the .category variable for binary categories when a special color condition is met. This is specific to categorical plot functionality.

Usage

process_binary_category_colors(
  data,
  showNA = "never",
  colour_2nd_binary_cat = NULL
)

Arguments

data

Data frame containing the data with .category column

showNA

Character indicating how to handle NA values

colour_2nd_binary_cat

Color specification for second binary category

Value

Modified data frame with potentially reversed .category levels


Process categorical data for showNA settings

Description

Handle NA categories based on showNA parameter for categorical tables

Usage

process_categorical_na(data, dots)

Arguments

data

Data frame with .category column

dots

List with showNA and indep settings

Value

Processed data frame


Process Data for a Single Crowd

Description

Internal helper function that handles the complete processing pipeline for a single crowd, from data filtering to final output generation.

Usage

process_crowd_data(
  crwd,
  args,
  omitted_cols_list,
  kept_indep_cats_list,
  data,
  mesos_var,
  mesos_group,
  ...
)

Arguments

crwd

Character string identifying the current crowd

args

List of makeme function arguments

omitted_cols_list

Named list of omitted variables for each crowd

kept_indep_cats_list

Named list of kept independent categories for each crowd

data

Data frame being analyzed

mesos_var

Mesos-level grouping variable

mesos_group

Specific mesos group identifier

...

Additional arguments passed to data summarization functions

Details

Complete processing pipeline:

Value

Final output object for the crowd, or NULL if no data remains.


Process Crowd Settings

Description

Internal helper function that reorders the crowd array to ensure priority crowds (specified in hide_for_all_crowds_if_hidden_for_crowd) are processed first.

Usage

process_crowd_settings(args)

Arguments

args

List of makeme function arguments

Value

Modified args list with reordered crowd vector.


Process Independent Categories for Global Hiding Logic

Description

Internal helper function that applies global hiding logic to independent variable categories based on the hide_for_all_crowds_if_hidden_for_crowd setting.

Usage

process_global_indep_categories(
  kept_indep_cats_list,
  hide_for_all_crowds_if_hidden_for_crowd
)

Arguments

kept_indep_cats_list

Named list of kept independent categories for each crowd

hide_for_all_crowds_if_hidden_for_crowd

Character vector of crowd identifiers that determine global category exclusions

Value

Modified kept_indep_cats_list with global hiding logic applied.
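
One plausible reading of the global hiding logic is sketched below: a category that is not kept for a crowd named in hide_for_all_crowds_if_hidden_for_crowd is dropped for every crowd. apply_global_hiding() is a hypothetical helper, and the sketch assumes each list element is a flat character vector of category labels, which may differ from the internal structure.

```r
# Hypothetical helper: keep only categories that survive in all priority crowds.
apply_global_hiding <- function(kept_indep_cats_list,
                                hide_for_all_crowds_if_hidden_for_crowd = NULL) {
  priority <- intersect(hide_for_all_crowds_if_hidden_for_crowd,
                        names(kept_indep_cats_list))
  if (length(priority) == 0) return(kept_indep_cats_list)
  # Categories kept in every priority crowd
  allowed <- Reduce(intersect, kept_indep_cats_list[priority])
  lapply(kept_indep_cats_list, intersect, allowed)
}

kept <- list(
  target = c("Male", "Female"),
  others = c("Male", "Female", "Other")
)

apply_global_hiding(kept, "target")$others  # "Male" "Female"
apply_global_hiding(kept, NULL)$others      # "Male" "Female" "Other"
```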


Process Independent Variable Factor Levels

Description

Reverses factor levels for independent variables, but only for unordered factors. Preserves the natural ordering of ordered factors.

Usage

process_indep_factor_levels(data, indep = NULL)

Arguments

data

Data frame containing the data

indep

Character string naming the independent variable (or NULL)

Value

Modified data frame with reversed factor levels for unordered factors
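
The distinction between ordered and unordered factors can be shown with a minimal base R sketch. reverse_unordered_levels() is a hypothetical helper that mirrors the described behaviour for a single independent variable; it is not the package's implementation.

```r
# Hypothetical helper: reverse levels of an unordered factor column only.
reverse_unordered_levels <- function(data, indep = NULL) {
  if (is.null(indep)) return(data)
  x <- data[[indep]]
  if (is.factor(x) && !is.ordered(x)) {
    data[[indep]] <- factor(x, levels = rev(levels(x)))
  }
  data
}

toy <- data.frame(
  sex = factor(c("Male", "Female", "Male"), levels = c("Male", "Female"))
)
toy$edu <- factor(c("Low", "High", "Low"),
                  levels = c("Low", "High"), ordered = TRUE)

levels(reverse_unordered_levels(toy, "sex")$sex)  # "Female" "Male"
levels(reverse_unordered_levels(toy, "edu")$edu)  # "Low" "High"
```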


Process main question and extract suffixes

Description

Handle label separation and suffix extraction for table functions

Usage

process_main_question_and_suffixes(data, dots, col_basis)

Arguments

data

Data frame to process

dots

List from rlang::list2(...)

col_basis

Current column basis (.variable_label or .variable_name)

Value

List with processed data, main_question, and updated col_basis


Process Output Results

Description

Internal helper function that performs final processing of makeme output, including crowd renaming, NULL removal, and output simplification.

Usage

process_output_results(out, args)

Arguments

out

Named list of crowd outputs from process_all_crowds

args

List of makeme function arguments (for translations and simplify_output)

Value

Processed output in final form.
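
The three steps named in the description (crowd renaming, NULL removal, simplification) can be sketched in base R. finalize_output() is a hypothetical helper. Looking up translations by the bare crowd identifier and simplifying only when a single crowd remains are both assumptions of this sketch.

```r
# Hypothetical helper combining the three post-processing steps.
finalize_output <- function(out, translations = list(), simplify_output = TRUE) {
  # 1. Rename crowds that have a translation; keep other names as they are.
  names(out) <- vapply(names(out), function(nm) {
    if (is.null(translations[[nm]])) nm else translations[[nm]]
  }, character(1), USE.NAMES = FALSE)
  # 2. Drop crowds that produced no output.
  out <- Filter(Negate(is.null), out)
  # 3. Return the element itself when only one crowd remains.
  if (simplify_output && length(out) == 1) out[[1]] else out
}

out <- list(target = "plot A", others = NULL)
tr <- list(target = "Your university")  # assumed key naming

finalize_output(out, tr)                                   # "plot A"
names(finalize_output(out, tr, simplify_output = FALSE))   # "Your university"
```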


Process data with standard table operations

Description

Apply column selection, renaming, and independent variable handling

Usage

process_table_data(
  data,
  col_basis,
  indep_vars = NULL,
  indep_label = character(),
  main_question = "",
  use_header = FALSE,
  stat_columns = NULL,
  column_mappings = NULL
)

Arguments

data

Data frame to process

col_basis

Column basis for variables

indep_vars

Independent variable columns

indep_label

Independent variable labels

main_question

Main question for headers

use_header

Whether to use main question as header

stat_columns

Statistical columns to include

column_mappings

Additional column mappings

Value

Processed data frame


Rename Crowd Outputs

Description

Internal helper function that renames crowd identifiers in the output based on provided translations.

Usage

rename_crowd_outputs(out, translations)

Arguments

out

Named list of crowd outputs

translations

Named list of translation mappings for crowd identifiers

Value

Modified out list with crowd names translated.


Reorder Crowd Array Based on Hide Settings

Description

Internal helper function that reorders the crowd array to prioritize crowds specified in hide_for_all_crowds_if_hidden_for_crowd, ensuring they are processed first to determine variable exclusions early.

Usage

reorder_crowd_array(crowd, hide_for_all_crowds_if_hidden_for_crowd)

Arguments

crowd

Character vector of crowd identifiers

hide_for_all_crowds_if_hidden_for_crowd

Character vector of crowd identifiers that should be processed first to determine global exclusions

Value

Character vector with reordered crowd identifiers.
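
Moving priority crowds to the front can be sketched in one line of base R. prioritise_crowds() is a hypothetical helper, and the relative order within the priority crowds (here: the order given in hide_for_all_crowds_if_hidden_for_crowd) is an assumption.

```r
# Hypothetical helper: priority crowds first, remaining crowds in original order.
prioritise_crowds <- function(crowd,
                              hide_for_all_crowds_if_hidden_for_crowd = NULL) {
  first <- intersect(hide_for_all_crowds_if_hidden_for_crowd, crowd)
  c(first, setdiff(crowd, first))
}

prioritise_crowds(c("all", "target", "others"), "target")
# "target" "all" "others"
prioritise_crowds(c("all", "target", "others"), NULL)
# "all" "target" "others"
```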


Code-snippets copied and modified from tidytext-package https://github.com/juliasilge/tidytext/blob/main/R/reorder_within.R

Description

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

Usage

reorder_within(x, by, within, fun = mean, sep = "___", ...)

Arguments

x

Vector

by

Vector

within

Vector (factor)

fun

Function, defaults to the mean

sep

String, separator

...

Dots

Details

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Source

"Original: Ordering categories within ggplot2 Facets" by Tyler Rinker: https://trinkerrstuff.wordpress.com/2016/12/23/ordering-categories-within-ggplot2-facets/ Based on https://opensource.org/licenses/MIT Copyright (c) 2017, Julia Silge and David Robinson


Resolve Variable Overlaps Between Dependent and Independent Variables

Description

Internal helper function that handles cases where variables are selected for both dependent and independent roles. Automatically removes overlapping variables from the dependent list and provides user feedback.

Usage

resolve_variable_overlaps(dep, indep)

Arguments

dep

Character vector of dependent variable names

indep

Character vector of independent variable names

Details

If overlapping variables are found:

Value

Character vector of dependent variable names with overlaps removed
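
The overlap handling can be sketched with base R set operations. drop_overlaps() is a hypothetical helper, and message() is used only as a stand-in for whatever user feedback the internal function gives.

```r
# Hypothetical helper: remove from dep anything that is also in indep.
drop_overlaps <- function(dep, indep) {
  overlap <- intersect(dep, indep)
  if (length(overlap) > 0) {
    message("Removing from dep (also selected as indep): ",
            paste(overlap, collapse = ", "))
  }
  setdiff(dep, indep)
}

drop_overlaps(c("b_1", "b_2", "x1_sex"), "x1_sex")  # "b_1" "b_2"
```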


Round numeric statistics

Description

Apply rounding to numeric statistical columns

Usage

round_numeric_stats(data, digits)

Arguments

data

Data frame to process

digits

Number of decimal places

Value

Data frame with rounded numeric columns


Set factor levels based on order columns (for backward compatibility)

Description

Set factor levels based on order columns (for backward compatibility)

Usage

set_factor_levels_from_order(data)

Arguments

data

Dataset with order columns

Value

Dataset with factor levels set according to order


Setup and Validate Makeme Arguments

Description

Internal helper function that performs final argument setup and validation before processing. Consolidates variable resolution, normalization, and validation.

Usage

setup_and_validate_makeme_args(args, data, dep_pos, indep_pos, indep)

Arguments

args

List of makeme function arguments

data

Data frame being analyzed

dep_pos

Named integer vector of dependent variable positions

indep_pos

Named integer vector of independent variable positions

indep

Independent variable selection (for validation)

Value

Modified and validated args list ready for processing:


Setup table data from dots

Description

Common setup logic for table functions including data extraction and early return

Usage

setup_table_data(dots)

Arguments

dots

List from rlang::list2(...)

Value

List with data and should_return flag


Shift labels_always_at

Description

Shift labels_always_at

Usage

shift_labels_always_at(data, labels_always_at = NULL, after = Inf)

Arguments

data

Dataset

labels_always_at

Labels to move to bottom or top

after

Position to move labels to (0 = top, Inf = bottom)

Value

Dataset with data$.variable_label adjusted
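
The effect of after can be illustrated on a plain factor. move_levels() is a hypothetical base R helper that works on a factor and not on a full dataset; it only shows how after = 0 and after = Inf place the selected labels.

```r
# Hypothetical helper: move selected levels to a given position.
move_levels <- function(f, labels_always_at = NULL, after = Inf) {
  moved <- intersect(labels_always_at, levels(f))
  rest <- setdiff(levels(f), moved)
  new_levels <- append(rest, moved, after = min(after, length(rest)))
  factor(f, levels = new_levels)
}

lab <- factor(c("Other", "Teaching", "Research"),
              levels = c("Other", "Teaching", "Research"))

levels(move_levels(lab, "Other", after = Inf))   # "Teaching" "Research" "Other"
levels(move_levels(lab, "Research", after = 0))  # "Research" "Other" "Teaching"
```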


Apply string wrapping to variables (character or factor)

Description

A utility function that applies string wrapping to both character and factor variables, preserving factor structure while wrapping the labels.

Usage

strip_wrap_var(x, width = Inf)

Arguments

x

Variable to wrap (character or factor)

width

Maximum width for wrapping

Value

Modified variable with wrapped text
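
Wrapping labels while preserving factor structure can be sketched in base R. wrap_text() and wrap_var() are hypothetical helpers, and strwrap() is used as a stand-in for whatever wrapping routine the package relies on. The point is that for factors only the level labels are rewritten, so the underlying codes and level order stay intact.

```r
# Hypothetical helper: wrap each string and join the pieces with newlines.
wrap_text <- function(x, width) {
  vapply(x, function(s) paste(strwrap(s, width = width), collapse = "\n"),
         character(1), USE.NAMES = FALSE)
}

# Hypothetical helper: wrap a character vector, or the levels of a factor.
wrap_var <- function(x, width = Inf) {
  if (is.infinite(width)) return(x)
  if (is.factor(x)) {
    levels(x) <- wrap_text(levels(x), width)
    x
  } else {
    wrap_text(as.character(x), width)
  }
}

f <- factor(c("Strongly agree with this", "No"),
            levels = c("No", "Strongly agree with this"))
wrapped <- wrap_var(f, width = 12)

is.factor(wrapped)                               # TRUE
identical(as.integer(wrapped), as.integer(f))    # TRUE, codes unchanged
```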


Given Ordered Integer Vector, Return Requested Set.

Description

Useful for identifying which categories are to be collected.

Usage

subset_vector(
  vec,
  set = c(".top", ".upper", ".mid_upper", ".lower", ".mid_lower", ".bottom", ".spread"),
  spread_n = NULL,
  sort = FALSE
)

Arguments

vec

A vector of any type.

set

A character string, one of c(".top", ".upper", ".mid_upper", ".lower", ".mid_lower", ".bottom", ".spread")

spread_n

The number of values to extract when set is ".spread".

sort

Whether to sort the output, defaults to FALSE.

Value

Selected set of vector.
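
The named sets can be illustrated on a five-point scale. pick_set() is a hypothetical base R helper written from the descriptions of ".top", ".upper", ".mid_upper", ".mid_lower", ".lower" and ".bottom" under sort_by. It assumes the vector is ordered from lowest to highest category, and that a vector of even length has no middle category so that ".mid_upper" equals ".upper". ".spread" is left out because its selection rule is not described here.

```r
# Hypothetical helper: pick a named subset of an ordered vector.
pick_set <- function(vec, set) {
  n <- length(vec)
  half <- n %/% 2          # number of elements strictly above/below the middle
  has_mid <- n %% 2 == 1   # TRUE when there is a single middle element
  switch(set,
    ".top"       = vec[n],
    ".bottom"    = vec[1],
    ".upper"     = vec[seq(n - half + 1, length.out = half)],
    ".lower"     = vec[seq_len(half)],
    ".mid_upper" = vec[seq(n - half + 1 - has_mid, n)],
    ".mid_lower" = vec[seq_len(half + has_mid)],
    stop("Set not covered by this sketch: ", set)
  )
}

cats <- c("Strongly disagree", "Disagree", "Neutral", "Agree", "Strongly agree")

pick_set(cats, ".top")        # "Strongly agree"
pick_set(cats, ".upper")      # "Agree" "Strongly agree"
pick_set(cats, ".mid_upper")  # "Neutral" "Agree" "Strongly agree"
pick_set(cats, ".lower")      # "Strongly disagree" "Disagree"
```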


Summarize a survey dataset for use in tables and graphs

Description

Summarize a survey dataset for use in tables and graphs

Usage

summarize_cat_cat_data(
  data,
  dep = colnames(data),
  indep = NULL,
  ...,
  showNA = c("ifany", "always", "never"),
  totals = FALSE,
  sort_by = ".upper",
  sort_dep_by = NULL,
  sort_indep_by = ".factor_order",
  data_label = c("percentage_bare", "percentage", "proportion", "count", "mean",
    "median"),
  digits = 0,
  add_n_to_dep_label = FALSE,
  add_n_to_indep_label = FALSE,
  add_n_to_label = FALSE,
  add_n_to_category = FALSE,
  hide_label_if_prop_below = 0.01,
  data_label_decimal_symbol = ".",
  categories_treated_as_na = NULL,
  label_separator = NULL,
  descend = FALSE,
  descend_indep = FALSE,
  labels_always_at_bottom = NULL,
  labels_always_at_top = NULL,
  translations = list(),
  call = rlang::caller_env()
)

Arguments

data

Your data.frame/tibble or srvyr-object (experimental)

data.frame // required

The data to be used for plotting.

dep, indep

Variable selections

<tidyselect> // Default: NULL, meaning everything for dep, nothing for indep.

Columns in data. dep is compulsory.

...

Dynamic dots

<dynamic-dots>

Arguments forwarded to the corresponding functions that create the elements.

showNA

Show NA categories

⁠vector<character>⁠ // default: c("ifany", "always", "never") (optional)

Choose whether to show NA categories in the results.

totals

Include totals

⁠scalar<logical>⁠ // default: FALSE (optional)

Whether to include totals in the output.

sort_by

What to sort output by (legacy)

⁠vector<character>⁠ // default: NULL (optional)

DEPRECATED: Use sort_dep_by and sort_indep_by instead for clearer control. When specified, this parameter will be used for both dependent and independent sorting. If NULL (default), dependent variables will be sorted by .variable_position.

NULL

Uses .variable_position for dependent variables, no sorting for independent.

".top"

The proportion for the highest category available in the variable.

".upper"

The sum of the proportions for the categories above the middle category.

".mid_upper"

The sum of the proportions for the categories including and above the middle category.

".mid_lower"

The sum of the proportions for the categories including and below the middle category.

".lower"

The sum of the proportions for the categories below the middle category.

".bottom"

The proportion for the lowest category available in the variable.

".variable_label"

Sort by the variable labels.

".variable_name"

Sort by the variable names.

".variable_position"

Sort by the variable position in the supplied data frame.

".by_group"

The groups of the by argument.

character()

Character vector of category labels to sum together.

sort_dep_by

What to sort dependent variables by

⁠vector<character>⁠ // default: ".variable_position" (optional)

Sort dependent variables in output. When using indep-argument, sorting differs between ordered factors and unordered factors: Ordering of ordered factors is always respected in output (their levels define the base order). Unordered factors will be reordered by sort_dep_by.

NULL or ".variable_position"

Sort by variable position in the supplied data frame (default).

".variable_label"

Sort by the variable labels.

".variable_name"

Sort by the variable names.

".top"

The proportion for the highest category available in the variable.

".upper"

The sum of the proportions for the categories above the middle category.

".mid_upper"

The sum of the proportions for the categories including and above the middle category.

".mid_lower"

The sum of the proportions for the categories including and below the middle category.

".lower"

The sum of the proportions for the categories below the middle category.

".bottom"

The proportion for the lowest category available in the variable.

sort_indep_by

What to sort independent variable categories by

⁠vector<character>⁠ // default: ".factor_order" (optional)

Sort independent variable categories in output. When ".factor_order", preserves the original factor level order for the independent variable. Passing NULL is accepted and treated as ".factor_order".

NULL

No sorting - preserves original factor level order (default).

".top"

The proportion for the highest category available.

".upper"

The sum of the proportions for the categories above the middle category.

".mid_upper"

The sum of the proportions for the categories including and above the middle category.

".mid_lower"

The sum of the proportions for the categories including and below the middle category.

".lower"

The sum of the proportions for the categories below the middle category.

".bottom"

The proportion for the lowest category available.

character()

Character vector of category labels to sum together.

data_label

Data label

⁠scalar<character>⁠ // default: "proportion" (optional)

One of "proportion", "percentage", "percentage_bare", "count", "mean", or "median".

digits

Decimal places

⁠scalar<integer>⁠ // default: 0L (optional)

Number of decimal places.

add_n_to_dep_label, add_n_to_indep_label

Add N= to the variable label

⁠scalar<logical>⁠ // default: FALSE (optional)

For some plots and tables it is useful to attach the "N=" to the end of the label of the dependent and/or independent variable. Whether it is N or N_valid depends on your showNA-setting. See also translations$add_n_to_dep_label_prefix, translations$add_n_to_dep_label_suffix, translations$add_n_to_indep_label_prefix, translations$add_n_to_indep_label_suffix.

add_n_to_label

Add N= to the variable label of both dep and indep

⁠scalar<logical>⁠ // default: FALSE (optional)

For some plots and tables it is useful to attach the "N=" to the end of the label. Whether it is N or N_valid depends on your showNA-setting. See also translations$add_n_to_label_prefix and translations$add_n_to_label_suffix.

add_n_to_category

Add N= to the category

⁠scalar<logical>⁠ // default: FALSE (optional)

For some plots and tables it is useful to attach the "N=" to the end of the category. This will likely produce a range across the variables, hence an infix (comma) between the minimum and maximum can be specified. Whether it is N or N_valid depends on your showNA-setting. See also translations$add_n_to_category_prefix, translations$add_n_to_category_infix, and translations$add_n_to_category_suffix.

hide_label_if_prop_below

Hide label threshold

⁠scalar<numeric>⁠ // default: NULL (optional)

Hide the data label if the proportion is below this value.

data_label_decimal_symbol

Decimal symbol

⁠scalar<character>⁠ // default: "." (optional)

Decimal marker. Some might prefer a comma ',' or something else entirely.

categories_treated_as_na

NA categories

⁠vector<character>⁠ // default: NULL (optional)

Categories that should be treated as NA.

label_separator

How to separate main question from sub-question

⁠scalar<character>⁠ // default: NULL (optional)

Separator for main question from sub-question.

descend

Sorting order

⁠scalar<logical>⁠ // default: FALSE (optional)

Reverse sorting of sort_by in figures and tables. Works with both ordered and unordered factors - for ordered factors, it reverses the display order while preserving the inherent level ordering. See arrange_section_by for sorting of report sections.

descend_indep

Sorting order for independent variables

⁠scalar<logical>⁠ // default: FALSE (optional)

Reverse sorting of sort_indep_by in figures and tables. Works with both ordered and unordered factors - for ordered factors, it reverses the display order while preserving the inherent level ordering. See arrange_section_by for sorting of report sections.

labels_always_at_top, labels_always_at_bottom

Top/bottom variables

⁠vector<character>⁠ // default: NULL (optional)

Column names in data that should always be placed at the top or bottom of figures/tables.

translations

Localize your output

⁠list<character>⁠

A list of translations where the name is the code and the value is the translation. See the examples.

call

Internal call

⁠obj:<call>⁠ // Default: rlang::caller_env() (optional)

The calling environment, used when reporting errors. Not intended to be set by the user.

Value

Dataset with the columns: .variable_name, .variable_label, .category, .count, .count_se, .count_per_dep, .count_per_indep_group, .proportion, .proportion_se, .mean, .mean_se, .median, indep-variable(s), .data_label, .comb_categories, .sum_value, .variable_label_prefix
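
What sorting dependent variables by ".upper" means can be shown on a toy summary table with the .variable_label, .category and .proportion columns. The data below are invented for illustration, and the score is computed by hand in base R. Whether the default direction is ascending or descending is not asserted here; rev() only shows the kind of flip that descend controls.

```r
cats <- c("Strongly disagree", "Disagree", "Neutral", "Agree", "Strongly agree")

# Toy summary table (invented numbers), three items on a five-point scale
summary_tbl <- data.frame(
  .variable_label = rep(c("Item A", "Item B", "Item C"), each = 5),
  .category = factor(rep(cats, times = 3), levels = cats),
  .proportion = c(0.10, 0.20, 0.40, 0.20, 0.10,
                  0.05, 0.05, 0.20, 0.30, 0.40,
                  0.30, 0.30, 0.20, 0.10, 0.10)
)

# ".upper": sum of proportions for the categories above the middle category
is_upper <- summary_tbl$.category %in% cats[4:5]
score <- vapply(
  split(summary_tbl$.proportion[is_upper],
        summary_tbl$.variable_label[is_upper]),
  sum, numeric(1)
)

names(sort(score))       # "Item C" "Item A" "Item B"
rev(names(sort(score)))  # "Item B" "Item A" "Item C"
```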


Summarize Data Based on Variable Types

Description

Internal helper function that determines the appropriate data summarization approach based on variable types and calls the corresponding function.

Usage

summarize_data_by_type(args, subset_data, dep_crwd, indep_crwd, ...)

Arguments

args

List of makeme function arguments

subset_data

Data frame subset for the current crowd

dep_crwd

Character vector of dependent variable names for current crowd

indep_crwd

Character vector of independent variable names for current crowd

...

Additional arguments passed to summarization functions

Value

Modified args list with data_summary element added.


Read tabular data from various formats

Description

A wrapper function to read data from different file formats

Usage

tabular_read(path, format, ...)

Arguments

path

Character string specifying the file path

format

Character string specifying the format: "delim", "xlsx", "csv", "csv2", "tsv", "sav", "dta"

...

Additional arguments passed to the underlying read functions

Value

A data frame containing the loaded data
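
The wrapper pattern (dispatching on format and forwarding ... to the reader) can be sketched with base R readers only. read_by_format() is a hypothetical helper covering just "csv", "csv2" and "tsv"; the other listed formats need reader packages and are left out of this sketch.

```r
# Hypothetical helper: choose a reader from `format` and forward `...`.
read_by_format <- function(path, format, ...) {
  switch(format,
    csv  = utils::read.csv(path, ...),
    csv2 = utils::read.csv2(path, ...),
    tsv  = utils::read.delim(path, ...),
    stop("Format not covered by this sketch: ", format)
  )
}

# Round trip through a temporary CSV file
path <- tempfile(fileext = ".csv")
utils::write.csv(data.frame(x = 1:3, y = letters[1:3]), path, row.names = FALSE)

df <- read_by_format(path, format = "csv")
dim(df)  # 3 2
```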


Write tabular data to various formats

Description

A wrapper function to write data frames to different file formats

Usage

tabular_write(object, path, format)

Arguments

object

A data frame to write

path

Character string specifying the output file path

format

Character string specifying the format: "delim", "xlsx", "csv", "csv2", "tsv", "sav", "dta"

Value

Invisibly returns TRUE on success, used for side effects

Examples

data <- data.frame(x = 1:3, y = letters[1:3])

# Write as CSV
tabular_write(data, tempfile(fileext = ".csv"), format = "csv")

# Write as Excel
tabular_write(data, tempfile(fileext = ".xlsx"), format = "xlsx")

# Write as SPSS
tabular_write(data, tempfile(fileext = ".sav"), format = "sav")

Extract Text Summary from Categorical Mesos Plots

Description

Generates text summaries comparing two groups from categorical mesos plot data. The function identifies meaningful differences between groups based on proportions of respondents selecting specific categories and produces narrative text descriptions.

Usage

txt_from_cat_mesos_plots(
  plots,
  min_prop_diff = 0.1,
  n_highest_categories = 1,
  flip_to_lowest_categories = FALSE,
  digits = 2,
  selected_categories_last_split = " or ",
  fallback_string = character(),
  glue_str_pos =
    c(paste0("For {var}, the target group has a higher proportion of respondents ",
    "({group_1}) than all others ({group_2}) who answered {selected_categories}."),
    paste0("More respondents answered {selected_categories} for {var} in the ",
    "target group ({group_1}) than in other groups ({group_2})."),
    paste0("The statement {var} shows {selected_categories} responses are more ",
    "common in the target group ({group_1}) compared to others ({group_2}).")),
  glue_str_neg =
    c(paste0("For {var}, the target group has a lower proportion of respondents ",
    "({group_1}) than all others ({group_2}) who answered {selected_categories}."),
    paste0("Fewer respondents answered {selected_categories} for {var} in the ",
    "target group ({group_1}) than in other groups ({group_2})."),
    paste0("The statement {var} shows {selected_categories} responses are less ",
    "common in the target group ({group_1}) compared to others ({group_2})."))
)

Arguments

plots

A list of two plot objects (or data frames with plot data) to compare. Each must contain columns: .variable_label, .category, .category_order, .proportion.

min_prop_diff

Numeric. Minimum proportion difference (default 0.10) required between groups to generate text. Differences below this threshold are ignored.

n_highest_categories

Integer. Number of top categories to include in the comparison (default 1). Categories are selected based on .category_order.

flip_to_lowest_categories

Logical. If TRUE, compare lowest categories instead of highest (default FALSE).

digits

Integer. Number of decimal places for rounding proportions (default 2).

selected_categories_last_split

Character. Separator for the last item when listing multiple categories (default " or ").

fallback_string

Character. String to return when validation fails (default character()).

glue_str_pos

Character vector. Templates for positive differences (group_1 > group_2). Available placeholders: {var}, {group_1}, {group_2}, {selected_categories}.

glue_str_neg

Character vector. Templates for negative differences (group_2 > group_1). Same placeholders as glue_str_pos.

Details

The function compares proportions between two groups for each variable in the plot data. One template is randomly selected from the provided vectors for variety in output text.

Value

A character vector of text summaries, one per variable with meaningful differences. Returns empty character vector if no plots provided or no meaningful differences found.

Examples

## Not run: 
# Create sample plot data
plot_data_1 <- data.frame(
  .variable_label = rep("Job satisfaction", 3),
  .category = factor(c("Low", "Medium", "High"), levels = c("Low", "Medium", "High")),
  .category_order = 1:3,
  .proportion = c(0.2, 0.3, 0.5)
)

plot_data_2 <- data.frame(
  .variable_label = rep("Job satisfaction", 3),
  .category = factor(c("Low", "Medium", "High"), levels = c("Low", "Medium", "High")),
  .category_order = 1:3,
  .proportion = c(0.3, 0.4, 0.3)
)

plots <- list(
  list(data = plot_data_1),
  list(data = plot_data_2)
)

# Generate text summaries
txt_from_cat_mesos_plots(plots, min_prop_diff = 0.10)

# Compare lowest categories instead
txt_from_cat_mesos_plots(
  plots,
  flip_to_lowest_categories = TRUE,
  min_prop_diff = 0.05
)

## End(Not run)
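
The comparison described under Details can be sketched for a single variable and the single highest category. compare_top_category() is a hypothetical base R helper: it uses sprintf() with one fixed sentence where the package uses glue templates chosen at random, and printing proportions as rounded decimals is an assumption.

```r
# Hypothetical helper: compare the highest category between two groups.
compare_top_category <- function(target, others,
                                 min_prop_diff = 0.1, digits = 2) {
  top <- which.max(target$.category_order)
  p_target <- target$.proportion[top]
  p_others <- others$.proportion[
    others$.category_order == target$.category_order[top]
  ]
  d <- p_target - p_others
  # Differences below the threshold produce no text.
  if (abs(d) < min_prop_diff) return(character())
  sprintf(
    "For %s, the target group has a %s proportion of respondents (%s) than all others (%s) who answered %s.",
    target$.variable_label[top],
    if (d > 0) "higher" else "lower",
    round(p_target, digits), round(p_others, digits),
    as.character(target$.category[top])
  )
}

lv <- c("Low", "Medium", "High")
target <- data.frame(
  .variable_label = "Job satisfaction",
  .category = factor(lv, levels = lv),
  .category_order = 1:3,
  .proportion = c(0.2, 0.3, 0.5)
)
others <- transform(target, .proportion = c(0.3, 0.4, 0.3))

compare_top_category(target, others)
# "For Job satisfaction, the target group has a higher proportion of respondents (0.5) than all others (0.3) who answered High."
compare_top_category(target, others, min_prop_diff = 0.5)
# character(0)
```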


Validate single dependent variable requirement

Description

Common validation pattern for functions that require exactly one dependent variable.

Usage

validate_single_dep_var(dep, function_name)

Arguments

dep

Vector of dependent variables

function_name

Name of the function requiring validation (for error message)

Value

Nothing if valid, throws error if invalid
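
The validation pattern can be sketched in base R. check_single_dep() is a hypothetical helper, stop() stands in for whatever error mechanism the package uses, and the error wording is invented for illustration.

```r
# Hypothetical helper: error unless exactly one dependent variable is given.
check_single_dep <- function(dep, function_name) {
  if (length(dep) != 1) {
    stop(sprintf("%s requires exactly one dependent variable, got %d.",
                 function_name, length(dep)),
         call. = FALSE)
  }
  invisible(NULL)
}

check_single_dep("b_1", "my_plot_type")  # passes silently

res <- tryCatch(
  check_single_dep(c("b_1", "b_2"), "my_plot_type"),
  error = function(e) conditionMessage(e)
)
res  # "my_plot_type requires exactly one dependent variable, got 2."
```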


Perform Type-Specific Validation Checks

Description

Internal helper function that validates arguments based on the specific output type requested. Different types have different constraints.

Usage

validate_type_specific_constraints(args, data, indep, dep_pos)

Arguments

args

List of makeme function arguments

data

Data frame being analyzed

indep

Character vector of independent variable names

dep_pos

Named integer vector of dependent variable positions

Details

Current type-specific validations:

Value

NULL (function used for side effects - validation errors)