Substructure Database
To make the label command possible, we created a database using data from PubChem and RDKit’s SMARTS-based substructure matching functionality.
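As a rough illustration of what SMARTS-based substructure matching looks like in RDKit, the sketch below labels a molecule given as a SMILES string. The function name `label_smiles` and the three patterns in `EXAMPLE_PATTERNS` are assumptions made for this example; they are not taken from the project's code or from its substructures.yml file. Only standard RDKit calls are used (`Chem.MolFromSmiles`, `Chem.MolFromSmarts`, `Mol.HasSubstructMatch`).

```python
from typing import Dict, List

# Illustrative SMARTS patterns only (assumption): these are NOT copied from
# the project's substructures.yml.
EXAMPLE_PATTERNS: Dict[str, str] = {
    "hydroxyl": "[OX2H]",
    "carbonyl": "[CX3]=[OX1]",
    "thiol": "[SX2H]",
}


def label_smiles(smiles: str, patterns: Dict[str, str]) -> List[str]:
    """Return the names of all patterns that match the given molecule."""
    # Imported lazily so this module can be loaded where RDKit is absent;
    # calling the function still requires RDKit to be installed.
    from rdkit import Chem

    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smiles!r}")

    labels = []
    for name, smarts in patterns.items():
        query = Chem.MolFromSmarts(smarts)
        if query is None:
            raise ValueError(f"could not parse SMARTS for {name!r}: {smarts!r}")
        # True if the pattern occurs at least once in the molecule.
        if mol.HasSubstructMatch(query):
            labels.append(name)
    return labels
```

With RDKit installed, `label_smiles("CCO", EXAMPLE_PATTERNS)` (ethanol) should return `["hydroxyl"]`. How the real database combines such matches with PubChem data is not shown here.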
Currently Supported Functional Groups / Substructures
hydroxyl, carbonyl, carboxyl, amine, alcohol, ether, ketone, halide, ester, amide, aldehyde, aromatic, amino acid, nitrate, phenol, nitro, phosphoric acid, phosphoric ester, sulfate, sulfonate, thiol, carbothioester
Add another substructure to the database
1. Clone our repo to your machine
2. Install RDKit
3. Navigate to the tofspec/db folder
4. Open the substructures.yml file
5. Add your substructure(s) and the corresponding SMARTS string(s) to the list, and save your changes
6. Run build_db.py
It might take a few minutes, but database.feather will be updated with your substructure(s) of choice!
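Because the rebuild can take a few minutes, it may be worth sanity-checking a new SMARTS string before running build_db.py. The helper below is a hypothetical sketch, not part of the repository: `check_smarts` is an invented name, and the nitrile pattern in the usage note is only an example of a group that is not in the supported list above. It checks that the pattern parses and that it matches (or does not match) a few molecules you choose yourself.

```python
from typing import List


def check_smarts(
    smarts: str,
    should_match: List[str],
    should_not_match: List[str],
) -> List[str]:
    """Return a list of problems found; an empty list means no problems."""
    # Imported lazily; calling this function requires RDKit to be installed.
    from rdkit import Chem

    query = Chem.MolFromSmarts(smarts)
    if query is None:
        return [f"SMARTS does not parse: {smarts!r}"]

    # Pair every test SMILES with the result we expect from the match.
    cases = [(s, True) for s in should_match]
    cases += [(s, False) for s in should_not_match]

    problems = []
    for smiles, expected in cases:
        mol = Chem.MolFromSmiles(smiles)
        if mol is None:
            problems.append(f"test SMILES does not parse: {smiles!r}")
            continue
        if mol.HasSubstructMatch(query) != expected:
            verb = "match" if expected else "not match"
            problems.append(f"expected {smiles!r} to {verb} {smarts!r}")
    return problems
```

For example, `check_smarts("[CX2]#[NX1]", ["CC#N"], ["CCN"])` tests a candidate nitrile pattern against acetonitrile (should match) and ethylamine (should not). If the returned list is empty, the pattern is at least syntactically valid and behaves as expected on those molecules.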