| Type: | Package |
| Title: | Root Exudate Feature Toolkit |
| Version: | 0.1.4 |
| Description: | Provides tools for molecule-oriented and reaction-centred analysis of root exudate datasets. It supports structural matching based on 'PubChem', calculation of molecular descriptors, and inference of candidate microbe-associated metabolic reactions using Kyoto Encyclopedia of Genes and Genomes ('KEGG') identifiers and Enzyme Commission ('EC') numbers. For background on these databases, see Kanehisa et al. (2023) <doi:10.1093/nar/gkac963> and Kim et al. (2023) <doi:10.1093/nar/gkac956>. |
| License: | MIT + file LICENSE |
| Encoding: | UTF-8 |
| RoxygenNote: | 7.3.2 |
| Imports: | readxl, dplyr, purrr, stringr, tibble, writexl, webchem, rlang |
| Suggests: | rcdk, rcdklibs |
| Depends: | R (≥ 4.1.0) |
| URL: | https://github.com/gaoguozhen1/REFT |
| BugReports: | https://github.com/gaoguozhen1/REFT/issues |
| NeedsCompilation: | no |
| Packaged: | 2026-05-15 06:22:29 UTC; Administrator |
| Author: | Guozhen Gao [aut, cre] |
| Maintainer: | Guozhen Gao <gaoguozhen889@gmail.com> |
| Repository: | CRAN |
| Date/Publication: | 2026-05-19 09:30:24 UTC |
REFT: Root Exudate Feature Toolkit
Description
REFT is an R package for batch PubChem matching and molecular descriptor
calculation from root exudate or metabolomics annotation tables.
Calculate six molecular descriptors
Description
Calculate six descriptors from a character vector of SMILES using rcdk.
Usage
reft_calc_descriptors(smiles)
Arguments
| smiles | A character vector of SMILES strings. |
Value
A tibble with six molecular descriptors.
Examples
if (requireNamespace("rcdk", quietly = TRUE)) {
  reft_calc_descriptors("OC(=O)CCC(=O)O")
}
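Because the function takes a plain character vector, it can help to tidy that vector before passing it in. The following is a minimal sketch in base R: `clean_smiles()` is a hypothetical helper and not part of REFT, and the call into REFT is guarded because it assumes both REFT and rcdk are installed.

```r
# Hypothetical helper (not part of REFT): trim whitespace, then drop
# missing, blank, and duplicated SMILES strings.
clean_smiles <- function(smiles) {
  smiles <- trimws(as.character(smiles))
  smiles <- smiles[!is.na(smiles) & nzchar(smiles)]
  unique(smiles)
}

raw <- c("OC(=O)CCC(=O)O", " OC(=O)CCC(=O)O ", NA, "", "CCO")
smiles <- clean_smiles(raw)

# Only call into REFT when it and rcdk are available.
if (requireNamespace("REFT", quietly = TRUE) &&
    requireNamespace("rcdk", quietly = TRUE)) {
  desc <- REFT::reft_calc_descriptors(smiles)
  print(desc)
}
```

Whether reft_calc_descriptors() itself tolerates missing or blank entries is not stated here, so cleaning beforehand is a cautious default rather than a requirement.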
Run KEGG microbe-EC-reaction search workflow
Description
Import a microbial EC annotation table, normalize EC identifiers, extract
species names from taxonomy strings, query KEGG for EC-linked reactions, and
append reactants, products, and compound formulae. By default, no files are
written; set output_dir to explicitly request Excel outputs.
Usage
reft_kegg_microbe_run(
  input_file,
  ec_col = "EC_Number",
  taxonomy_col = "Taxonomy",
  output_dir = NULL,
  output_file = "microbe_ec_kegg_reactions.xlsx",
  sleep_sec = 0.35,
  verbose = TRUE
)
Arguments
| input_file | Path to input annotation table. |
| ec_col | Column containing EC numbers. Default is "EC_Number". |
| taxonomy_col | Column containing taxonomy strings. Default is "Taxonomy". |
| output_dir | Output directory. If NULL (the default), no files are written. |
| output_file | Output Excel filename. Default is "microbe_ec_kegg_reactions.xlsx". |
| sleep_sec | Delay between KEGG requests in seconds. Default is 0.35. |
| verbose | Whether to print progress. Default is TRUE. |
Value
A named list containing:
- results: Full result table with EC, microbe, reaction, compounds, and formulae.
- ec_to_reaction: EC-to-reaction mapping table.
- reaction_details: Reaction detail table.
- compound_table: Compound formula table.
Examples
toy <- data.frame(
  EC_Number = "1.1.1.1",
  Taxonomy = "k__Bacteria;p__Proteobacteria;g__Escherichia;s__Escherichia_coli"
)
input_file <- tempfile(fileext = ".csv")
utils::write.csv(toy, input_file, row.names = FALSE)
res <- try(
  reft_kegg_microbe_run(input_file, output_dir = tempdir(), sleep_sec = 0,
                        verbose = FALSE),
  silent = TRUE
)
if (!inherits(res, "try-error")) head(res$results)
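The EC normalization and species extraction steps named in the description can be illustrated offline in base R. This is only a sketch of what such steps might look like, not REFT's actual implementation: `normalize_ec()` and `extract_species()` are hypothetical helpers, and the taxonomy layout (semicolon-separated ranks with an `s__` species prefix) is assumed from the toy example above.

```r
# Hypothetical helpers (not REFT's internals).

# Strip an optional "EC" / "EC:" prefix and surrounding whitespace, and keep
# only values that look like four dot-separated fields (digits or "-").
normalize_ec <- function(ec) {
  ec <- trimws(as.character(ec))
  ec <- sub("^[Ee][Cc][: ]*", "", ec)
  ok <- grepl("^[0-9]+\\.([0-9]+|-)\\.([0-9]+|-)\\.([0-9]+|-)$", ec)
  ec[!ok] <- NA_character_
  ec
}

# Pull the species-level rank ("s__...") out of a semicolon-separated
# taxonomy string and replace underscores with spaces.
extract_species <- function(taxonomy) {
  taxonomy <- as.character(taxonomy)
  has_species <- grepl("s__[^;]+", taxonomy)
  species <- sub("^.*s__([^;]+).*$", "\\1", taxonomy)
  species[!has_species] <- NA_character_
  gsub("_", " ", trimws(species))
}

ec_clean <- normalize_ec(c("EC 1.1.1.1", "ec:2.7.1.-", "not_an_ec"))
species <- extract_species(c(
  "k__Bacteria;p__Proteobacteria;g__Escherichia;s__Escherichia_coli",
  "k__Bacteria;p__Proteobacteria"
))
print(ec_clean)
print(species)
```

Entries that cannot be parsed become NA in this sketch, which keeps the output the same length as the input so it can be bound back onto the annotation table.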
Match SMILES from PubChem
Description
Batch match SMILES from PubChem, trying the Name, Other Name, KEGG ID, and HMDB ID columns in that order.
Usage
reft_match_smiles(
  data,
  name_col = "Name",
  other_col = "Other_name(Kegg_name)",
  hmdb_col = "HMDB_ID",
  kegg_col = "Kegg_ID"
)
Arguments
| data | A data frame containing query columns. |
| name_col | Compound name column. |
| other_col | Alternative name column. |
| hmdb_col | HMDB ID column. |
| kegg_col | KEGG ID column. |
Value
A data frame with matching log and SMILES.
Examples
dat <- data.frame(
  Name = "Glutarate",
  `Other_name(Kegg_name)` = NA,
  HMDB_ID = NA,
  Kegg_ID = NA,
  check.names = FALSE
)
res <- try(reft_match_smiles(dat), silent = TRUE)
if (!inherits(res, "try-error")) head(res)
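Matching "in order" can be pictured as a fallback chain: each identifier column is tried in turn, and the first one that yields a SMILES wins. The sketch below shows that idea offline. It is not REFT's implementation: `toy_db`, `lookup_smiles()`, and `match_first()` are hypothetical, and a small named vector stands in for the PubChem query.

```r
# Hypothetical stand-in for a PubChem query: a named character vector
# mapping query strings to SMILES. A real lookup would go over the network.
toy_db <- c(
  "Glutarate" = "OC(=O)CCCC(=O)O",
  "C00042"    = "OC(=O)CCC(=O)O"
)

# Return the SMILES for one query string, or NA if it is missing or unknown.
lookup_smiles <- function(query) {
  if (is.na(query) || !nzchar(query)) return(NA_character_)
  unname(toy_db[query])
}

# Try each column in the given order; return the first hit and record
# which column produced it.
match_first <- function(row, cols) {
  for (col in cols) {
    smiles <- lookup_smiles(row[[col]])
    if (!is.na(smiles)) {
      return(data.frame(matched_by = col, smiles = smiles))
    }
  }
  data.frame(matched_by = NA_character_, smiles = NA_character_)
}

dat <- data.frame(
  Name = c("Unknown compound", "Glutarate"),
  `Other_name(Kegg_name)` = NA_character_,
  Kegg_ID = c("C00042", NA),
  HMDB_ID = NA_character_,
  check.names = FALSE
)

# Column order follows the description: name, other name, KEGG ID, HMDB ID.
cols <- c("Name", "Other_name(Kegg_name)", "Kegg_ID", "HMDB_ID")
res <- do.call(rbind, lapply(seq_len(nrow(dat)), function(i) {
  match_first(dat[i, , drop = FALSE], cols)
}))
print(res)
```

Recording which column produced the match, as `matched_by` does here, is one way to build the kind of matching log the function's return value mentions.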
Run the full REFT workflow
Description
Import an Excel table, clean query fields, match SMILES from PubChem,
calculate six molecular descriptors, and optionally write Excel outputs.
By default, no files are written; set output_dir to explicitly request
Excel outputs.
Usage
reft_run(
  input_file,
  name_col = "Name",
  other_col = "Other_name(Kegg_name)",
  hmdb_col = "HMDB_ID",
  kegg_col = "Kegg_ID",
  output_dir = NULL,
  output_desc_file = "metabolites_6_descriptors.xlsx",
  output_unmatched_file = "unmatched_smiles.xlsx",
  output_log_file = "pubchem_match_log.xlsx",
  verbose = TRUE
)
Arguments
| input_file | Path to the input Excel file. |
| name_col | Column name for compound name. Default is "Name". |
| other_col | Column name for alternative name. Default is "Other_name(Kegg_name)". |
| hmdb_col | Column name for HMDB identifier. Default is "HMDB_ID". |
| kegg_col | Column name for KEGG identifier. Default is "Kegg_ID". |
| output_dir | Output directory. If NULL (the default), no files are written. |
| output_desc_file | Final descriptor Excel filename. |
| output_unmatched_file | Unmatched records Excel filename. |
| output_log_file | PubChem match log Excel filename. |
| verbose | Whether to print progress. Default is TRUE. |
Value
A named list with three data frames:
- descriptors: Final annotation table with SMILES and six descriptors.
- unmatched: Rows that could not be matched to SMILES.
- match_log: Unique-query matching log from PubChem.
Examples
toy <- data.frame(
  Name = "Glutarate",
  `Other_name(Kegg_name)` = NA,
  HMDB_ID = NA,
  Kegg_ID = NA,
  check.names = FALSE
)
if (requireNamespace("rcdk", quietly = TRUE)) {
  input_file <- tempfile(fileext = ".xlsx")
  writexl::write_xlsx(toy, input_file)
  res <- try(reft_run(input_file, output_dir = tempdir(), verbose = FALSE),
             silent = TRUE)
  if (!inherits(res, "try-error")) head(res$descriptors)
}
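Since the workflow looks up four named columns, a quick check of the table's headers before writing the Excel file can catch a mismatch early. The sketch below assumes all four default columns are expected to be present, which this page does not state explicitly; `missing_columns()` is a hypothetical helper, not part of REFT.

```r
# Hypothetical helper (not part of REFT): report which expected columns
# are absent from a table.
missing_columns <- function(data, required) {
  setdiff(required, names(data))
}

# Default column names taken from the reft_run() usage above.
required <- c("Name", "Other_name(Kegg_name)", "HMDB_ID", "Kegg_ID")

toy <- data.frame(
  Name = "Glutarate",
  HMDB_ID = NA,
  check.names = FALSE
)

missing <- missing_columns(toy, required)
if (length(missing) > 0) {
  message("Missing columns: ", paste(missing, collapse = ", "))
}
```

If the headers differ only in naming, the corresponding `*_col` arguments of reft_run() can be pointed at the existing columns instead of editing the file.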
Run REFT with default column names
Description
A simplified wrapper around reft_run() for the common case where the input
file already uses the default column names. By default, no files are written;
set output_dir to explicitly request Excel outputs.
Usage
reft_run_simple(input_file, output_dir = NULL, verbose = TRUE)
Arguments
| input_file | Path to the input Excel file. |
| output_dir | Output directory. If NULL (the default), no files are written. |
| verbose | Whether to print progress. Default is TRUE. |
Value
Same as reft_run().
Examples
toy <- data.frame(
  Name = "Glutarate",
  `Other_name(Kegg_name)` = NA,
  HMDB_ID = NA,
  Kegg_ID = NA,
  check.names = FALSE
)
if (requireNamespace("rcdk", quietly = TRUE)) {
  input_file <- tempfile(fileext = ".xlsx")
  writexl::write_xlsx(toy, input_file)
  res <- try(reft_run_simple(input_file, output_dir = tempdir(), verbose = FALSE),
             silent = TRUE)
  if (!inherits(res, "try-error")) head(res$descriptors)
}
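When a table uses different headers, one option is to rename them to the defaults before saving the file, so that the simplified wrapper can be used. The following base R sketch shows one way to do that; the source headers (`compound`, `synonym`, `hmdb`, `kegg`) and the `rename_map` vector are hypothetical.

```r
# Hypothetical input whose headers differ from the REFT defaults.
raw <- data.frame(
  compound = "Glutarate",
  synonym = NA,
  hmdb = NA,
  kegg = NA
)

# Map of current header -> default header used by reft_run_simple().
rename_map <- c(
  compound = "Name",
  synonym  = "Other_name(Kegg_name)",
  hmdb     = "HMDB_ID",
  kegg     = "Kegg_ID"
)

# Rename only the columns that appear in the map; leave others untouched.
hits <- names(raw) %in% names(rename_map)
names(raw)[hits] <- unname(rename_map[names(raw)[hits]])
print(names(raw))
```

The renamed table can then be written with writexl::write_xlsx() and passed to reft_run_simple(). Alternatively, reft_run() accepts the original headers directly through name_col, other_col, hmdb_col, and kegg_col.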