The primary function in volcalc is calc_vol(). It accepts either a path to a .mol file or a SMILES string. There are a few example .mol files included in the volcalc installation, and their file paths are returned by mol_example().
Basic usage with .mol files
#using built-in example .mol files
mol_paths <- mol_example()
mol_paths
#> [1] "/home/runner/work/_temp/Library/volcalc/extdata/C00031.mol"
#> [2] "/home/runner/work/_temp/Library/volcalc/extdata/C00157.mol"
#> [3] "/home/runner/work/_temp/Library/volcalc/extdata/C08491.mol"
#> [4] "/home/runner/work/_temp/Library/volcalc/extdata/C16181.mol"
#> [5] "/home/runner/work/_temp/Library/volcalc/extdata/C16286.mol"
#> [6] "/home/runner/work/_temp/Library/volcalc/extdata/C16521.mol"
The default output of calc_vol() includes a relative volatility index, rvi, which is equivalent to $\log_{10} C^\ast$ (Meredith et al., 2023). It also includes an RVI category for clean air.
calc_vol(mol_paths)
#> Warning in FUN(X[[i]], ...): Possible OpenBabel errors detected and only NAs returned.
#> Run with `validate = FALSE` to ignore this.
#> # A tibble: 6 × 5
#> mol_path formula name rvi category
#> <chr> <chr> <chr> <dbl> <fct>
#> 1 /home/runner/work/_temp/Library/volcalc/extdata/… C6H12O6 D-Gl… -2.81 non-vol…
#> 2 /home/runner/work/_temp/Library/volcalc/extdata/… NA Phos… NA NA
#> 3 /home/runner/work/_temp/Library/volcalc/extdata/… C12H18… (-)-… 1.84 moderate
#> 4 /home/runner/work/_temp/Library/volcalc/extdata/… C6H7Cl… beta… 6.98 high
#> 5 /home/runner/work/_temp/Library/volcalc/extdata/… C12H22O Geos… 4.16 high
#> 6 /home/runner/work/_temp/Library/volcalc/extdata/… C5H8 Isop… 8.84 high
Specify environment
Specifying environment only alters the RVI category by using different RVI cutoffs for non-volatile, low, moderate, and high volatility. Environment options and their category cutoffs are listed in the calc_vol() documentation and are discussed in more detail in Meredith et al. (2023) and Donahue et al. (2006).
calc_vol(mol_paths, environment = "soil")
#> Warning in FUN(X[[i]], ...): Possible OpenBabel errors detected and only NAs returned.
#> Run with `validate = FALSE` to ignore this.
#> # A tibble: 6 × 5
#> mol_path formula name rvi category
#> <chr> <chr> <chr> <dbl> <fct>
#> 1 /home/runner/work/_temp/Library/volcalc/extdata/… C6H12O6 D-Gl… -2.81 non-vol…
#> 2 /home/runner/work/_temp/Library/volcalc/extdata/… NA Phos… NA NA
#> 3 /home/runner/work/_temp/Library/volcalc/extdata/… C12H18… (-)-… 1.84 non-vol…
#> 4 /home/runner/work/_temp/Library/volcalc/extdata/… C6H7Cl… beta… 6.98 moderate
#> 5 /home/runner/work/_temp/Library/volcalc/extdata/… C12H22O Geos… 4.16 low
#> 6 /home/runner/work/_temp/Library/volcalc/extdata/… C5H8 Isop… 8.84 high
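To make the idea of environment-specific cutoffs concrete, here is a minimal base R sketch that bins RVI values with cut(). The cutoff values and the helper categorize_rvi() are assumptions chosen only to be consistent with the outputs shown above; they are not taken from the package, so check the calc_vol() documentation for the real cutoffs.

```r
# Illustrative cutoffs only (assumption); see ?calc_vol for the real values.
cutoffs <- list(
  clean = c(-Inf, -2, 0, 2, Inf),
  soil  = c(-Inf, 4, 6, 8, Inf)
)

# Hypothetical helper: bin RVI values into four volatility categories.
categorize_rvi <- function(rvi, environment = "clean") {
  cut(
    rvi,
    breaks = cutoffs[[environment]],
    labels = c("non-volatile", "low", "moderate", "high")
  )
}

# RVI values copied from the output above (NA for the compound that failed).
rvi <- c(-2.81, NA, 1.84, 6.98, 4.16, 8.84)

categories <- data.frame(
  rvi   = rvi,
  clean = categorize_rvi(rvi, "clean"),
  soil  = categorize_rvi(rvi, "soil")
)
categories
```

The rvi column is identical for both environments; only the labels change, which mirrors the behaviour described above.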
Return intermediate steps
By default, calc_vol() uses a modified version of the SIMPOL.1 method, which is a group contribution method. Setting return_fx_groups = TRUE makes calc_vol() also return the counts of functional groups and other molecular properties, which is useful for validation. See ?get_fx_groups() for more information about these additional columns.
calc_vol(mol_paths, return_fx_groups = TRUE)
#> Warning in FUN(X[[i]], ...): Possible OpenBabel errors detected and only NAs returned.
#> Run with `validate = FALSE` to ignore this.
#> # A tibble: 6 × 53
#> mol_path formula name rvi category exact_mass carbons carbons_asa
#> <chr> <chr> <chr> <dbl> <fct> <dbl> <int> <int>
#> 1 /home/runner/work… C6H12O6 D-Gl… -2.81 non-vol… 180. 6 0
#> 2 /home/runner/work… NA Phos… NA NA NA 0 0
#> 3 /home/runner/work… C12H18… (-)-… 1.84 moderate 210. 12 0
#> 4 /home/runner/work… C6H7Cl… beta… 6.98 high 270. 6 0
#> 5 /home/runner/work… C12H22O Geos… 4.16 high 182. 12 0
#> 6 /home/runner/work… C5H8 Isop… 8.84 high 68.1 5 0
#> # ℹ 45 more variables: rings_aromatic <int>, rings_total <int>,
#> # rings_aliphatic <int>, carbon_dbl_bonds_aliphatic <int>,
#> # CCCO_aliphatic_ring <int>, hydroxyl_total <int>, hydroxyl_aromatic <int>,
#> # hydroxyl_aliphatic <int>, aldehydes <int>, ketones <int>,
#> # carbox_acids <int>, ester <int>, ether_total <int>, ether_alkyl <int>,
#> # ether_alicyclic <int>, ether_aromatic <int>, nitrate <int>, nitro <int>,
#> # amine_primary <int>, amine_secondary <int>, amine_tertiary <int>, …
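In a group contribution method, a property is estimated as a sum of per-group contributions weighted by how often each group occurs in the molecule. The sketch below shows only that arithmetic. The coefficients and counts are made up for illustration and are not the SIMPOL.1 values; the column names are borrowed from the output above.

```r
# Hypothetical coefficients for illustration only; NOT the SIMPOL.1 values.
coefs <- c(intercept = 2.0, carbons = -0.4, hydroxyl_total = -2.2, ketones = -0.9)

# Group counts for a hypothetical molecule (the intercept always counts once).
counts <- c(intercept = 1, carbons = 6, hydroxyl_total = 5, ketones = 0)

# Group contribution estimate: sum of count * coefficient over all groups.
log10_P_toy <- sum(coefs[names(counts)] * counts)
log10_P_toy
```

This is why the functional group counts are useful for validation: a wrong count for one group shifts the estimate by that group's coefficient.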
The SIMPOL.1 method calculates $\log_{10} P$, which is used by calc_vol() to calculate RVI as

$$\mathrm{RVI} = \log_{10}\left(\frac{PM}{RT}\right)$$

where $P$ is the estimated vapor pressure for the compound, $M$ is the molecular weight of the compound, $R$ is the universal gas constant, and $T$ is temperature (293.15 K or 20 ºC). To see these intermediate calculations, use return_calc_steps = TRUE.
calc_vol(mol_paths, return_calc_steps = TRUE)
#> Warning in FUN(X[[i]], ...): Possible OpenBabel errors detected and only NAs returned.
#> Run with `validate = FALSE` to ignore this.
#> # A tibble: 6 × 8
#> mol_path formula name rvi category molecular_weight log_alpha log10_P
#> <chr> <chr> <chr> <dbl> <fct> <dbl> <dbl> <dbl>
#> 1 /home/runner/… C6H12O6 D-Gl… -2.81 non-vol… 180. 9.87 -12.7
#> 2 /home/runner/… NA Phos… NA NA NA NA 1.79
#> 3 /home/runner/… C12H18… (-)-… 1.84 moderate 210. 9.94 -8.10
#> 4 /home/runner/… C6H7Cl… beta… 6.98 high 272. 10.1 -3.08
#> 5 /home/runner/… C12H22O Geos… 4.16 high 182. 9.88 -5.72
#> 6 /home/runner/… C5H8 Isop… 8.84 high 68.1 9.45 -0.61
log_alpha = $\log_{10}(M/RT)$, so that rvi = log_alpha + log10_P.
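The columns in the table above can be checked by hand: rvi appears to be the sum of log_alpha and log10_P. The sketch below reproduces the isoprene row in base R. The gas constant, the temperature, and the factor of 1e6 (grams to micrograms) are assumptions picked so that the result matches the printed log_alpha values; they are not stated in the text above.

```r
# Assumed constants (not stated in the text above).
R_gas    <- 8.21e-5  # m^3 atm K^-1 mol^-1
temp_K   <- 293.15   # 20 degrees C
ug_per_g <- 1e6      # assumed conversion so concentration is in micrograms per m^3

# log_alpha as a function of molecular weight (g/mol).
log_alpha <- function(mol_weight) {
  log10(ug_per_g * mol_weight / (R_gas * temp_K))
}

# Isoprene row from the table: molecular_weight = 68.1, log10_P = -0.61.
la <- log_alpha(68.1)
round(la, 2)       # 9.45, matching the log_alpha column

rvi_isoprene <- la + (-0.61)
round(rvi_isoprene, 2)  # 8.84, matching the rvi column
```

Because the table prints rounded values, other rows may differ from the printed rvi in the second decimal place when recomputed this way.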
Use with SMILES
All of this can be done using SMILES strings rather than .mol files by setting from = "smiles".
The backslash, \, is a valid SMILES character, but in an R string literal it starts an escape sequence, so it must be “escaped” as \\.
## This will error even though the SMILES is correct
# calc_vol("CC/C=C\C[C@@H]1[C@H](CCC1=O)CC(=O)O", from = "smiles")
# To solve this, escape \C as \\C
calc_vol("CC/C=C\\C[C@@H]1[C@H](CCC1=O)CC(=O)O", from = "smiles")
#> # A tibble: 1 × 5
#> smiles formula name rvi category
#> <chr> <chr> <chr> <dbl> <fct>
#> 1 "CC/C=C\\C[C@@H]1[C@H](CCC1=O)CC(=O)O" C12H18O3 NA 1.84 moderate
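The escaping rule can be checked without calling calc_vol() at all. The sketch below, which assumes R 4.0.0 or later for the raw string syntax, shows that the escaped literal and a raw string literal produce the same SMILES, each containing a single backslash.

```r
# Escaped form: "\\" in the source code is one backslash in the string.
smiles_escaped <- "CC/C=C\\C[C@@H]1[C@H](CCC1=O)CC(=O)O"

# Raw string form (R >= 4.0.0): backslashes are taken literally.
smiles_raw <- r"(CC/C=C\C[C@@H]1[C@H](CCC1=O)CC(=O)O)"

# Both literals describe the same string.
same_string <- identical(smiles_escaped, smiles_raw)
same_string

# cat() prints the string as it is stored, with a single backslash.
cat(smiles_escaped, "\n")
```

The doubled backslash in the tibble output above is only how R prints strings; the value passed on contains one backslash.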
Validation
Occasionally, a .mol file will result in an error message bubbling up from the OpenBabel command line utility. This happens, for example, when there is an ‘R’ group somewhere in the molecule, as is the case with phosphatidylcholine on KEGG.
# phosphatidylcholine .mol file from KEGG
c00157 <- mol_example()[2]
calc_vol(c00157)
#> ==============================
#> *** Open Babel Warning in InChI code
#> Phosphatidylcholine :Unknown element(s): *
#> ==============================
#> *** Open Babel Error in InChI code
#> InChI generation failed
#> # A tibble: 1 × 5
#> mol_path formula name rvi category
#> <chr> <chr> <chr> <dbl> <fct>
#> 1 /Users/ericscott/Documents/GitHub/volcalc/inst/extdata/C00157.mol NA Phosphatidylcholine NA NA
#> Warning message:
#> In FUN(X[[i]], ...) :
#> Possible OpenBabel errors detected and only NAs returned.
#> Run with `validate = FALSE` to ignore this.
Without validation, calc_vol() will return an incorrect value for rvi and category for this compound.
calc_vol(c00157, validate = FALSE)
#> ==============================
#> *** Open Babel Warning in InChI code
#> Phosphatidylcholine :Unknown element(s): *
#> ==============================
#> *** Open Babel Error in InChI code
#> InChI generation failed
#> # A tibble: 1 × 5
#> mol_path formula name rvi category
#> <chr> <chr> <chr> <dbl> <fct>
#> 1 /Users/ericscott/Documents/GitHub/volcalc/inst/extdata/C00157.mol C10H18NO8P Phosphatidylcholi… 2.89 high
Phosphatidylcholine is a large phospholipid and is not highly volatile, contrary to what these results would suggest.
Details
Unfortunately, it is nearly impossible to detect these parsing errors from OpenBabel directly in R. When validate = TRUE (the default), calc_vol() will look for “symptoms” of OpenBabel errors and return NAs for the calculated values. Specifically, validation works by assuming that InChI generation will fail whenever there are OpenBabel parsing issues. Because InChI generation is not available in the Windows version of OpenBabel installed with ChemmineOB, this volcalc feature is only available on macOS and Linux. Setting validate = TRUE on Windows will have no effect.
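Because validation failures show up as NAs plus a warning rather than as an error, it can help to check a result for missing rvi values before using it. The sketch below does this on a hand-typed data frame that mimics the calc_vol() output shown earlier (file basenames only); it does not call volcalc itself.

```r
# Hand-typed stand-in for calc_vol() output (assumption: basenames only).
res <- data.frame(
  mol_path = c("C00031.mol", "C00157.mol", "C08491.mol",
               "C16181.mol", "C16286.mol", "C16521.mol"),
  rvi      = c(-2.81, NA, 1.84, 6.98, 4.16, 8.84)
)

# Files whose rvi is NA, i.e. candidates for a closer look at the .mol file.
failed <- res$mol_path[is.na(res$rvi)]
failed

# Keep only rows with a usable rvi for downstream analysis.
res_ok <- res[!is.na(res$rvi), ]
nrow(res_ok)
```

On Windows, where validation has no effect, a check like this would not catch the problem, so inspecting the OpenBabel messages is the only signal.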
References
Donahue, N.M., Robinson, A.L., Stanier, C.O., Pandis, S.N., 2006. Coupled Partitioning, Dilution, and Chemical Aging of Semivolatile Organics. Environ. Sci. Technol. 40, 2635–2643. DOI: 10.1021/es052297c
Meredith, L., Ledford, S., Riemer, K., Geffre, P., Graves, K., Honeker, L., LeBauer, D., Tfaily, M., Krechmer, J., 2023. Automating methods for estimating metabolite volatility. Frontiers in Microbiology. DOI: 10.3389/fmicb.2023.1267234