Using FLORAL with phyloseq data

Phyloseq is a popular package for working with microbiome data. Here we show how to use the phy_to_floral_data helper function to convert phyloseq data into a format accepted by FLORAL.

The following code downloads data described in this paper and turns it into a phyloseq object. The tax_glom step here takes some time, and can be replaced with speedyseq::tax_glom for better performace.

#this file has duplicate rows, and has multiple rows per pool
samples <- read.csv("https://figshare.com/ndownloader/files/33076496") %>% distinct() %>% 
  select(-Pool, -Run, -ShotgunBatchID) %>% distinct()
samples <- samples[1:100,] # Using the first 100 samples only.
counts <- read.csv("https://figshare.com/ndownloader/files/26393788") 
counts <- counts %>%
  filter(SampleID %in% samples$SampleID)
taxonomy <- read.csv("https://figshare.com/ndownloader/files/26770997")

phy <- phyloseq(
  sample_data(samples %>% column_to_rownames("SampleID")),
  tax_table(taxonomy %>% select(ASV, Kingdom:Genus) %>% column_to_rownames("ASV") %>% as.matrix()),
  otu_table(counts  %>% pivot_wider(names_from = "SampleID", values_from = "Count", values_fill = 0) %>% column_to_rownames("ASV") %>% as.matrix(), taxa_are_rows = TRUE)
) %>%  subset_samples(DayRelativeToNearestHCT > -30 & DayRelativeToNearestHCT < 0) %>% 
  tax_glom("Genus")

Next, we convert that phyloseq object into a list of results to be used by FLORAL; we have to specify the main outcome of interest as y, and any metadata columns (from sample_data(phy)) to use as covariates. Note that the analysis described here is just an example for using the function; this

dat <- FLORAL::phy_to_floral_data(phy, covariates=c("Consistency"), y = "DayRelativeToNearestHCT")

The resulting list has named entities for the main arguments to FLORAL:

res <- FLORAL::FLORAL(y = dat$y, x = dat$xcount, ncov = dat$ncov, family = "gaussian", ncv=NULL, progress=FALSE)