This vignette is largely based on the PharmaSUG 2023 conference paper.
When a maraca plot is used in a regulatory setting, for example to visualize clinical study results in a regulatory submission, strict validation requirements apply. The results might, for instance, need to be double programmed for validation purposes.
To facilitate validation of the graphical output, the maraca package includes the function validate_maraca_plot(), which lets the user extract important metrics from the plot itself. This makes it possible to programmatically compare the results of a plot produced using the maraca package with those from other programmatic approaches. The paper mentioned above walks through a validation example where the double programming was done in SAS.
To use the validation functionality, we need to first create a maraca object.
library(maraca)
data(hce_scenario_a)
maraca_dat <- maraca(
  data = hce_scenario_a,
  step_outcomes = c("Outcome I", "Outcome II", "Outcome III", "Outcome IV"),
  last_outcome = "Continuous outcome",
  fixed_followup_days = 3 * 365,
  column_names = c(outcome = "GROUP", arm = "TRTP", value = "AVAL0"),
  arm_levels = c(active = "Active", control = "Control"),
  compute_win_odds = TRUE
)
We then create a maraca plot and save the actual plot as an object.
# Save plot as its own object
maraca_plot <- plot(maraca_dat)
# The plot has its own class called "maracaPlot"
class(maraca_plot)
## [1] "maracaPlot" "gg" "ggplot"
# Display plot
maraca_plot
Now we can validate the plot using the validate_maraca_plot() function.
validation_list <- validate_maraca_plot(maraca_plot)
# Display which metrics are included
str(validation_list)
## List of 9
## $ plot_type : chr "GeomViolin+GeomBoxplot"
## $ proportions : Named num [1:5] 13.8 18.1 12.7 9.8 45.6
## ..- attr(*, "names")= chr [1:5] "Outcome I" "Outcome II" "Outcome III" "Outcome IV" ...
## $ tte_data :'data.frame': 544 obs. of 3 variables:
## ..$ x : num [1:544] 0.125 0.159 0.463 0.591 0.621 ...
## ..$ y : num [1:544] 0.2 0.4 0.6 0.2 0.4 0.6 0.8 0.8 1 1.2 ...
## ..$ group: Factor w/ 2 levels "Active","Control": 2 2 2 1 1 1 2 1 2 2 ...
## $ binary_step_data: NULL
## $ binary_last_data: NULL
## $ scatter_data : NULL
## $ boxstat_data :'data.frame': 2 obs. of 9 variables:
## ..$ group : Factor w/ 2 levels "Active","Control": 1 2
## ..$ x_lowest : num [1:2] 60.8 54.4
## ..$ whisker_lower: num [1:2] 63.8 57.8
## ..$ hinge_lower : num [1:2] 74.9 71
## ..$ median : num [1:2] 79.7 75.9
## ..$ hinge_upper : num [1:2] 83.9 80.7
## ..$ whisker_upper: num [1:2] 95.5 94.5
## ..$ x_highest : num [1:2] 95.5 100
## ..$ outliers :List of 2
## .. ..$ : num 60.8
## .. ..$ : num [1:5] 54.4 96.4 97.2 99.7 100
## $ violin_data :'data.frame': 1024 obs. of 5 variables:
## ..$ group : Factor w/ 2 levels "Active","Control": 1 1 1 1 1 1 1 1 1 1 ...
## ..$ x : num [1:1024] 60.8 60.8 60.9 61 61 ...
## ..$ y : num [1:1024] 44 44 44 44 44 44 44 44 44 44 ...
## ..$ density: num [1:1024] 0.00118 0.00121 0.00124 0.00127 0.0013 ...
## ..$ width : num [1:1024] 18.7 18.7 18.7 18.7 18.7 ...
## $ wo_stats : Named num [1:4] 1.64 1.42 1.91 9.35e-12
## ..- attr(*, "names")= chr [1:4] "winodds" "lowerCI" "upperCI" "p_value"
Running the validate_maraca_plot() function on a maraca plot object returns a list with the following items:
plot_type: depending on which density_plot_type was selected for the plot, one or a combination of GeomPoint, GeomViolin and GeomBoxplot
proportions: the proportions of the HCE components
tte_data: time-to-event data if any of the step outcomes has type tte, otherwise NULL
binary_step_data: binary data if any of the step outcomes has type binary, otherwise NULL
binary_last_data: if the last endpoint was binary, contains the data for the minimum, maximum and middle point x values displayed in the ellipse, otherwise NULL
scatter_data: if the last endpoint was continuous and the plot was created with density_plot_type = "scatter", contains the dataset that was plotted in the scatter plot, otherwise NULL
boxstat_data: if the last endpoint was continuous and the plot was created with density_plot_type = "box" or density_plot_type = "default", contains the boxplot statistics, otherwise NULL
violin_data: if the last endpoint was continuous and the plot was created with density_plot_type = "violin" or density_plot_type = "default", contains the violin distribution data, otherwise NULL
wo_stats: if the maraca object was created with compute_win_odds = TRUE, contains the win odds statistics, otherwise NULL
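Because items that do not apply to a given plot are returned as NULL, a validation script can first work out which items are populated and therefore need a check. The following is a minimal base R sketch of that idea. The list validation_sketch is a hypothetical, shortened stand-in for the object returned by validate_maraca_plot(); it is not output from the package.

```r
# Hypothetical, shortened stand-in for the list returned by
# validate_maraca_plot(); in practice use the real validation list.
validation_sketch <- list(
  plot_type = "GeomViolin+GeomBoxplot",
  proportions = c("Outcome I" = 13.8, "Continuous outcome" = 45.6),
  binary_step_data = NULL,
  scatter_data = NULL
)

# Flag each item as NULL or not, then keep the names of the populated ones
is_empty <- vapply(validation_sketch, is.null, logical(1))
populated <- names(validation_sketch)[!is_empty]

populated
```

Only the items named in populated would then be compared against the independently programmed results.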
These can then be converted to a convenient format for validation, such as individual data.frames.
library(dplyr)
library(tidyr)
validation_list$proportions %>%
  as.data.frame() %>%
  rename("proportion" = ".")
## proportion
## Outcome I 13.8
## Outcome II 18.1
## Outcome III 12.7
## Outcome IV 9.8
## Continuous outcome 45.6
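One way to compare these proportions with independently programmed results is sketched below in base R. The vector plot_props repeats the rounded values printed above so that the snippet runs on its own; in practice it would be validation_list$proportions. The data frame independent is a hypothetical stand-in for results delivered by the second programmer (for example, read from an exported file), and the tolerance is an assumed choice.

```r
# Rounded values as printed above; in practice: validation_list$proportions
plot_props <- c(
  "Outcome I" = 13.8, "Outcome II" = 18.1, "Outcome III" = 12.7,
  "Outcome IV" = 9.8, "Continuous outcome" = 45.6
)

# Hypothetical independently programmed results, deliberately in a
# different row order to show why matching by name matters
independent <- data.frame(
  outcome = c("Continuous outcome", "Outcome IV", "Outcome III",
              "Outcome II", "Outcome I"),
  proportion = c(45.6, 9.8, 12.7, 18.1, 13.8)
)

# Align by outcome name rather than by position
matched <- independent$proportion[match(names(plot_props), independent$outcome)]

# Largest absolute difference and an overall pass/fail flag
max_abs_diff <- max(abs(unname(plot_props) - matched))
props_agree <- !anyNA(matched) && max_abs_diff < 1e-8

props_agree
```

Matching on the outcome name guards against the two sources listing the components in a different order.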
head(validation_list$tte_data)
## x y group
## 3 0.1251495 0.2 Control
## 4 0.1593700 0.4 Control
## 5 0.4625571 0.6 Control
## 6 0.5905571 0.2 Active
## 7 0.6209704 0.4 Active
## 8 0.6478802 0.6 Active
validation_list$boxstat_data %>%
  unnest_wider(outliers, names_sep = "") %>%
  pivot_longer(cols = -group, names_to = "stat_name", values_to = "values") %>%
  filter(!is.na(values)) %>%
  as.data.frame()
## group stat_name values
## 1 Active x_lowest 60.77490
## 2 Active whisker_lower 63.82323
## 3 Active hinge_lower 74.87904
## 4 Active median 79.68861
## 5 Active hinge_upper 83.87907
## 6 Active whisker_upper 95.53322
## 7 Active x_highest 95.53322
## 8 Active outliers1 60.77490
## 9 Control x_lowest 54.40000
## 10 Control whisker_lower 57.76833
## 11 Control hinge_lower 70.99354
## 12 Control median 75.90131
## 13 Control hinge_upper 80.73315
## 14 Control whisker_upper 94.45711
## 15 Control x_highest 100.00000
## 16 Control outliers1 54.40000
## 17 Control outliers2 96.39603
## 18 Control outliers3 97.24586
## 19 Control outliers4 99.68534
## 20 Control outliers5 100.00000
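For an independent check, boxplot statistics of this kind can be recomputed from the raw values. The sketch below assumes the plot follows the default ggplot2 boxplot convention: hinges are the quartiles from quantile(), and whiskers extend to the most extreme observations within 1.5 times the interquartile range of the hinges. That assumption about the maraca plot, the helper box_stats(), and the input vector are all illustrative. The element names mirror the columns shown above, with x_lowest and x_highest assumed to be the minimum and maximum of the data.

```r
# Hypothetical helper recomputing boxplot statistics under the assumed
# default ggplot2 convention (quartile hinges, 1.5 * IQR whiskers)
box_stats <- function(x, coef = 1.5) {
  q <- unname(quantile(x, probs = c(0.25, 0.5, 0.75)))
  iqr <- q[3] - q[1]
  is_outlier <- x < q[1] - coef * iqr | x > q[3] + coef * iqr
  # Whiskers reach the most extreme non-outlying values
  whiskers <- range(c(q, x[!is_outlier]))
  list(
    x_lowest = min(x),
    whisker_lower = whiskers[1],
    hinge_lower = q[1],
    median = q[2],
    hinge_upper = q[3],
    whisker_upper = whiskers[2],
    x_highest = max(x),
    outliers = x[is_outlier]
  )
}

# Made-up values for one treatment arm, for illustration only
example_values <- c(1, 10, 11, 12, 13, 14, 15, 30)
recomputed <- box_stats(example_values)

str(recomputed)
```

Applied per treatment arm to the last-outcome values, the result can be compared with the corresponding row of boxstat_data.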
head(validation_list$violin_data)
## group x y density width
## 1 Active 60.77490 44 0.001176222 18.72
## 2 Active 60.84292 44 0.001206811 18.72
## 3 Active 60.91094 44 0.001238073 18.72
## 4 Active 60.97896 44 0.001270508 18.72
## 5 Active 61.04698 44 0.001303931 18.72
## 6 Active 61.11500 44 0.001338236 18.72
validation_list$wo_stats
## winodds lowerCI upperCI p_value
## 1.643265e+00 1.416117e+00 1.906848e+00 9.354073e-12
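When comparing these statistics with independently programmed values, it may be worth checking each element on a relative scale: with its default tolerance, all.equal() can treat two very small p-values as equal even when they differ noticeably in relative terms. The sketch below illustrates this. The vector plot_wo repeats the values printed above, while independent_wo and the 1e-6 threshold are hypothetical, with the p-value deliberately altered to show a mismatch.

```r
# Values as printed above; in practice: validation_list$wo_stats
plot_wo <- c(winodds = 1.643265, lowerCI = 1.416117,
             upperCI = 1.906848, p_value = 9.354073e-12)

# Hypothetical independently programmed values; the p-value is
# deliberately different for illustration
independent_wo <- c(winodds = 1.643265, lowerCI = 1.416117,
                    upperCI = 1.906848, p_value = 1.0e-11)

# Default all.equal() does not flag the p-value discrepancy here
default_check <- isTRUE(all.equal(plot_wo, independent_wo))

# Element-wise relative difference, aligned by name
reference <- independent_wo[names(plot_wo)]
rel_diff <- abs(plot_wo - reference) / abs(reference)

# Names of statistics exceeding the assumed relative threshold
mismatched <- names(rel_diff)[rel_diff > 1e-6]

mismatched
```

Whether a relative or an absolute tolerance is appropriate for each statistic is a decision for the validation plan.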