scientific-skills/rdkit/references/api_reference.md
This document provides a comprehensive reference for RDKit's Python API, organized by functionality.
The fundamental module for working with molecules (`rdkit.Chem`).
Reading Molecules:
- `Chem.MolFromSmiles(smiles, sanitize=True)` - Parse SMILES string
- `Chem.MolFromSmarts(smarts)` - Parse SMARTS pattern
- `Chem.MolFromMolFile(filename, sanitize=True, removeHs=True)` - Read MOL file
- `Chem.MolFromMolBlock(molblock, sanitize=True, removeHs=True)` - Parse MOL block string
- `Chem.MolFromMol2File(filename, sanitize=True, removeHs=True)` - Read MOL2 file
- `Chem.MolFromMol2Block(molblock, sanitize=True, removeHs=True)` - Parse MOL2 block
- `Chem.MolFromPDBFile(filename, sanitize=True, removeHs=True)` - Read PDB file
- `Chem.MolFromPDBBlock(pdbblock, sanitize=True, removeHs=True)` - Parse PDB block
- `Chem.MolFromInchi(inchi, sanitize=True, removeHs=True)` - Parse InChI string
- `Chem.MolFromSequence(seq, sanitize=True)` - Create molecule from peptide sequence

Writing Molecules:
- `Chem.MolToSmiles(mol, isomericSmiles=True, canonical=True)` - Convert to SMILES
- `Chem.MolToSmarts(mol, isomericSmiles=True)` - Convert to SMARTS
- `Chem.MolToMolBlock(mol, includeStereo=True, confId=-1)` - Convert to MOL block
- `Chem.MolToMolFile(mol, filename, includeStereo=True, confId=-1)` - Write MOL file
- `Chem.MolToPDBBlock(mol, confId=-1)` - Convert to PDB block
- `Chem.MolToPDBFile(mol, filename, confId=-1)` - Write PDB file
- `Chem.MolToInchi(mol, options='')` - Convert to InChI
- `Chem.MolToInchiKey(mol, options='')` - Generate InChI key
- `Chem.MolToSequence(mol)` - Convert to peptide sequence

Batch I/O:
- `Chem.SDMolSupplier(filename, sanitize=True, removeHs=True)` - SDF file reader
- `Chem.ForwardSDMolSupplier(fileobj, sanitize=True, removeHs=True)` - Forward-only SDF reader
- `Chem.MultithreadedSDMolSupplier(filename, numWriterThreads=1)` - Parallel SDF reader
- `Chem.SmilesMolSupplier(filename, delimiter=' ', titleLine=True)` - SMILES file reader
- `Chem.SDWriter(filename)` - SDF file writer
- `Chem.SmilesWriter(filename, delimiter=' ', includeHeader=True)` - SMILES file writer

Sanitization:
- `Chem.SanitizeMol(mol, sanitizeOps=SANITIZE_ALL, catchErrors=False)` - Sanitize molecule
- `Chem.DetectChemistryProblems(mol, sanitizeOps=SANITIZE_ALL)` - Detect sanitization issues
- `Chem.AssignStereochemistry(mol, cleanIt=True, force=False)` - Assign stereochemistry
- `Chem.FindPotentialStereo(mol)` - Find potential stereocenters
- `Chem.AssignStereochemistryFrom3D(mol, confId=-1)` - Assign stereo from 3D coords

Hydrogen Management:
- `Chem.AddHs(mol, explicitOnly=False, addCoords=False)` - Add explicit hydrogens
- `Chem.RemoveHs(mol, implicitOnly=False, updateExplicitCount=False)` - Remove hydrogens
- `Chem.RemoveAllHs(mol)` - Remove all hydrogens

Aromaticity:
- `Chem.SetAromaticity(mol, model=AROMATICITY_RDKIT)` - Set aromaticity model
- `Chem.Kekulize(mol, clearAromaticFlags=False)` - Kekulize aromatic bonds
- `Chem.SetConjugation(mol)` - Set conjugation flags

Fragments:
- `Chem.GetMolFrags(mol, asMols=False, sanitizeFrags=True)` - Get disconnected fragments
- `Chem.FragmentOnBonds(mol, bondIndices, addDummies=True)` - Fragment on specific bonds
- `Chem.ReplaceSubstructs(mol, query, replacement, replaceAll=False)` - Replace substructures
- `Chem.DeleteSubstructs(mol, query, onlyFrags=False)` - Delete substructures

Stereochemistry:
- `Chem.FindMolChiralCenters(mol, includeUnassigned=False, useLegacyImplementation=False)` - Find chiral centers
- `Chem.FindPotentialStereo(mol, cleanIt=True)` - Find potential stereocenters

Basic Matching:
- `mol.HasSubstructMatch(query, useChirality=False)` - Check for substructure match
- `mol.GetSubstructMatch(query, useChirality=False)` - Get first match
- `mol.GetSubstructMatches(query, uniquify=True, useChirality=False)` - Get all matches
- `mol.GetSubstructMatches(query, maxMatches=1000)` - Limit number of matches

Atom Methods:
- `atom.GetSymbol()` - Atomic symbol
- `atom.GetAtomicNum()` - Atomic number
- `atom.GetDegree()` - Number of bonds
- `atom.GetTotalDegree()` - Including hydrogens
- `atom.GetFormalCharge()` - Formal charge
- `atom.GetNumRadicalElectrons()` - Radical electrons
- `atom.GetIsAromatic()` - Aromaticity flag
- `atom.GetHybridization()` - Hybridization (SP, SP2, SP3, etc.)
- `atom.GetIdx()` - Atom index
- `atom.IsInRing()` - In any ring
- `atom.IsInRingSize(size)` - In ring of specific size
- `atom.GetChiralTag()` - Chirality tag

Bond Methods:
- `bond.GetBondType()` - Bond type (SINGLE, DOUBLE, TRIPLE, AROMATIC)
- `bond.GetBeginAtomIdx()` - Starting atom index
- `bond.GetEndAtomIdx()` - Ending atom index
- `bond.GetIsConjugated()` - Conjugation flag
- `bond.GetIsAromatic()` - Aromaticity flag
- `bond.IsInRing()` - In any ring
- `bond.GetStereo()` - Stereochemistry (STEREONONE, STEREOZ, STEREOE, etc.)

Molecule Methods:
- `mol.GetNumAtoms(onlyExplicit=True)` - Number of atoms
- `mol.GetNumHeavyAtoms()` - Number of heavy atoms
- `mol.GetNumBonds()` - Number of bonds
- `mol.GetAtoms()` - Iterator over atoms
- `mol.GetBonds()` - Iterator over bonds
- `mol.GetAtomWithIdx(idx)` - Get specific atom
- `mol.GetBondWithIdx(idx)` - Get specific bond
- `mol.GetRingInfo()` - Ring information object

Ring Information:
- `Chem.GetSymmSSSR(mol)` - Get the symmetrized smallest set of smallest rings
- `Chem.GetSSSR(mol)` - Get the smallest set of smallest rings (not symmetrized)
- `ring_info.NumRings()` - Number of rings
- `ring_info.AtomRings()` - Tuples of atom indices in rings
- `ring_info.BondRings()` - Tuples of bond indices in rings

Extended chemistry functionality (`rdkit.Chem.AllChem`).
Coordinate Generation:
- `AllChem.Compute2DCoords(mol, canonOrient=True, clearConfs=True)` - Generate 2D coordinates
- `AllChem.EmbedMolecule(mol, maxAttempts=0, randomSeed=-1, useRandomCoords=False)` - Generate 3D conformer
- `AllChem.EmbedMultipleConfs(mol, numConfs=10, maxAttempts=0, randomSeed=-1)` - Generate multiple conformers
- `AllChem.ConstrainedEmbed(mol, core, useTethers=True)` - Constrained embedding
- `AllChem.GenerateDepictionMatching2DStructure(mol, reference, refPattern=None)` - Align to template

Force Field Optimization:
- `AllChem.UFFOptimizeMolecule(mol, maxIters=200, confId=-1)` - UFF optimization
- `AllChem.MMFFOptimizeMolecule(mol, maxIters=200, confId=-1, mmffVariant='MMFF94')` - MMFF optimization
- `AllChem.UFFGetMoleculeForceField(mol, confId=-1)` - Get UFF force field object
- `AllChem.MMFFGetMoleculeForceField(mol, pyMMFFMolProperties, confId=-1)` - Get MMFF force field

Conformer Analysis:
- `AllChem.GetConformerRMS(mol, confId1, confId2, prealigned=False)` - Calculate RMSD
- `AllChem.GetConformerRMSMatrix(mol, prealigned=False)` - RMSD matrix
- `AllChem.AlignMol(prbMol, refMol, prbCid=-1, refCid=-1)` - Align molecules
- `AllChem.AlignMolConformers(mol)` - Align all conformers

Reactions:
- `AllChem.ReactionFromSmarts(smarts, useSmiles=False)` - Create reaction from SMARTS
- `reaction.RunReactants(reactants)` - Apply reaction
- `reaction.RunReactant(reactant, reactionIdx)` - Apply to specific reactant
- `AllChem.CreateDifferenceFingerprintForReaction(reaction)` - Reaction fingerprint

Fingerprints:
- `AllChem.GetMorganFingerprint(mol, radius, useFeatures=False)` - Morgan fingerprint
- `AllChem.GetMorganFingerprintAsBitVect(mol, radius, nBits=2048)` - Morgan bit vector
- `AllChem.GetHashedMorganFingerprint(mol, radius, nBits=2048)` - Hashed Morgan
- `AllChem.GetErGFingerprint(mol)` - ErG fingerprint

Molecular descriptor calculations (`rdkit.Chem.Descriptors`).
- `Descriptors.MolWt(mol)` - Molecular weight
- `Descriptors.ExactMolWt(mol)` - Exact molecular weight
- `Descriptors.HeavyAtomMolWt(mol)` - Heavy atom molecular weight
- `Descriptors.MolLogP(mol)` - LogP (lipophilicity)
- `Descriptors.MolMR(mol)` - Molar refractivity
- `Descriptors.TPSA(mol)` - Topological polar surface area
- `Descriptors.NumHDonors(mol)` - Hydrogen bond donors
- `Descriptors.NumHAcceptors(mol)` - Hydrogen bond acceptors
- `Descriptors.NumRotatableBonds(mol)` - Rotatable bonds
- `Descriptors.NumAromaticRings(mol)` - Aromatic rings
- `Descriptors.NumSaturatedRings(mol)` - Saturated rings
- `Descriptors.NumAliphaticRings(mol)` - Aliphatic rings
- `Descriptors.NumAromaticHeterocycles(mol)` - Aromatic heterocycles
- `Descriptors.NumRadicalElectrons(mol)` - Radical electrons
- `Descriptors.NumValenceElectrons(mol)` - Valence electrons
- `Descriptors.CalcMolDescriptors(mol)` - Calculate all descriptors as dictionary
- `Descriptors._descList` - List of (name, function) tuples for all descriptors

Molecular visualization (`rdkit.Chem.Draw`).
- `Draw.MolToImage(mol, size=(300,300), kekulize=True, wedgeBonds=True, highlightAtoms=None)` - Generate PIL image
- `Draw.MolToFile(mol, filename, size=(300,300), kekulize=True, wedgeBonds=True)` - Save to file
- `Draw.MolsToGridImage(mols, molsPerRow=3, subImgSize=(200,200), legends=None)` - Grid of molecules
- `Draw.MolsMatrixToGridImage(molsMatrix, subImgSize=(200,200), legendsMatrix=None)` - Nested grid (one inner list per row)
- `Draw.ReactionToImage(rxn, subImgSize=(200,200))` - Reaction image
- `Draw.DrawMorganBit(mol, bitId, bitInfo, whichExample=0)` - Visualize Morgan bit
- `Draw.DrawMorganBits(tpls, molsPerRow=3)` - Multiple Morgan bits, from a list of `(mol, bitId, bitInfo)` tuples
- `Draw.DrawRDKitBit(mol, bitId, bitInfo, whichExample=0)` - Visualize RDKit bit
- `Draw.IPythonConsole` - Module for Jupyter integration
- `Draw.IPythonConsole.ipython_useSVG` - Use SVG (True) or PNG (False)
- `Draw.IPythonConsole.molSize` - Default molecule image size
- `rdMolDraw2D.MolDrawOptions()` - Get drawing options object
  - `.addAtomIndices` - Show atom indices
  - `.addBondIndices` - Show bond indices
  - `.addStereoAnnotation` - Show stereochemistry
  - `.bondLineWidth` - Line width
  - `.highlightBondWidthMultiplier` - Highlight width
  - `.minFontSize` - Minimum font size
  - `.maxFontSize` - Maximum font size

Additional descriptor calculations (`rdkit.Chem.rdMolDescriptors`).
- `rdMolDescriptors.CalcNumRings(mol)` - Number of rings
- `rdMolDescriptors.CalcNumAromaticRings(mol)` - Aromatic rings
- `rdMolDescriptors.CalcNumAliphaticRings(mol)` - Aliphatic rings
- `rdMolDescriptors.CalcNumSaturatedRings(mol)` - Saturated rings
- `rdMolDescriptors.CalcNumHeterocycles(mol)` - Heterocycles
- `rdMolDescriptors.CalcNumAromaticHeterocycles(mol)` - Aromatic heterocycles
- `rdMolDescriptors.CalcNumSpiroAtoms(mol)` - Spiro atoms
- `rdMolDescriptors.CalcNumBridgeheadAtoms(mol)` - Bridgehead atoms
- `rdMolDescriptors.CalcFractionCsp3(mol)` - Fraction of sp3 carbons
- `rdMolDescriptors.CalcLabuteASA(mol)` - Labute accessible surface area
- `rdMolDescriptors.CalcTPSA(mol)` - TPSA
- `rdMolDescriptors.CalcMolFormula(mol)` - Molecular formula

Scaffold analysis (`rdkit.Chem.Scaffolds.MurckoScaffold`).
- `MurckoScaffold.GetScaffoldForMol(mol)` - Get Murcko scaffold
- `MurckoScaffold.MakeScaffoldGeneric(mol)` - Generic scaffold
- `MurckoScaffold.MurckoDecompose(mol)` - Decompose to scaffold and sidechains

Molecular hashing and standardization (`rdkit.Chem.rdMolHash`).
- `rdMolHash.MolHash(mol, hashFunction)` - Generate hash
  - `rdMolHash.HashFunction.AnonymousGraph` - Anonymized structure
  - `rdMolHash.HashFunction.CanonicalSmiles` - Canonical SMILES
  - `rdMolHash.HashFunction.ElementGraph` - Element graph
  - `rdMolHash.HashFunction.MurckoScaffold` - Murcko scaffold
  - `rdMolHash.HashFunction.Regioisomer` - Regioisomer (no stereo)
  - `rdMolHash.HashFunction.NetCharge` - Net charge
  - `rdMolHash.HashFunction.HetAtomProtomer` - Heteroatom protomer
  - `rdMolHash.HashFunction.HetAtomTautomer` - Heteroatom tautomer

Molecule standardization (`rdkit.Chem.MolStandardize.rdMolStandardize`).
- `rdMolStandardize.Normalize(mol)` - Normalize functional groups
- `rdMolStandardize.Reionize(mol)` - Fix ionization state
- `rdMolStandardize.RemoveFragments(mol)` - Remove small fragments
- `rdMolStandardize.Cleanup(mol)` - Standard cleanup (remove Hs, disconnect metals, normalize, reionize)
- `rdMolStandardize.Uncharger()` - Create uncharger object
  - `.uncharge(mol)` - Remove charges
- `rdMolStandardize.TautomerEnumerator()` - Enumerate tautomers
  - `.Enumerate(mol)` - Generate tautomers
  - `.Canonicalize(mol)` - Get canonical tautomer

Fingerprint similarity and operations (`rdkit.DataStructs`).
- `DataStructs.TanimotoSimilarity(fp1, fp2)` - Tanimoto coefficient
- `DataStructs.DiceSimilarity(fp1, fp2)` - Dice coefficient
- `DataStructs.CosineSimilarity(fp1, fp2)` - Cosine similarity
- `DataStructs.SokalSimilarity(fp1, fp2)` - Sokal similarity
- `DataStructs.KulczynskiSimilarity(fp1, fp2)` - Kulczynski similarity
- `DataStructs.McConnaugheySimilarity(fp1, fp2)` - McConnaughey similarity
- `DataStructs.BulkTanimotoSimilarity(fp, fps)` - Tanimoto for list of fingerprints
- `DataStructs.BulkDiceSimilarity(fp, fps)` - Dice for list
- `DataStructs.BulkCosineSimilarity(fp, fps)` - Cosine for list
- `DataStructs.TanimotoDistance(fp1, fp2)` - 1 - Tanimoto
- `DataStructs.DiceDistance(fp1, fp2)` - 1 - Dice

Atom pair fingerprints (`rdkit.Chem.AtomPairs.Pairs`).
- `Pairs.GetAtomPairFingerprint(mol, minLength=1, maxLength=30)` - Atom pair fingerprint
- `Pairs.GetAtomPairFingerprintAsBitVect(mol, minLength=1, maxLength=30, nBits=2048)` - As bit vector
- `Pairs.GetHashedAtomPairFingerprint(mol, nBits=2048, minLength=1, maxLength=30)` - Hashed version

Topological torsion fingerprints (`rdkit.Chem.AtomPairs.Torsions`).
- `Torsions.GetTopologicalTorsionFingerprint(mol, targetSize=4)` - Torsion fingerprint
- `Torsions.GetTopologicalTorsionFingerprintAsIntVect(mol, targetSize=4)` - As int vector
- `Torsions.GetHashedTopologicalTorsionFingerprint(mol, nBits=2048, targetSize=4)` - Hashed version

MACCS structural keys (`rdkit.Chem.MACCSkeys`).
- `MACCSkeys.GenMACCSKeys(mol)` - Generate the 166 MACCS keys (returned as a 167-bit vector; bit 0 is unused)

Pharmacophore features (`rdkit.Chem.ChemicalFeatures`).
- `ChemicalFeatures.BuildFeatureFactory(featureFile)` - Create feature factory
- `factory.GetFeaturesForMol(mol)` - Get pharmacophore features
- `feature.GetFamily()` - Feature family (Donor, Acceptor, etc.)
- `feature.GetType()` - Feature type
- `feature.GetAtomIds()` - Atoms involved in feature

Clustering algorithms (`rdkit.ML.Cluster.Butina`).
- `Butina.ClusterData(distances, nPts, distThresh, isDistData=True)` - Butina clustering
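
As a minimal sketch of how `Butina.ClusterData` is typically fed, the example below builds the flattened lower-triangle distance list from Morgan fingerprints and clusters six molecules. The SMILES strings and the `0.6` distance threshold are illustrative assumptions, not recommended values.

```python
from rdkit import Chem, DataStructs
from rdkit.Chem import rdFingerprintGenerator
from rdkit.ML.Cluster import Butina

# Illustrative input: three small alcohols and three benzene derivatives
smiles = ["CCO", "CCCO", "CCCCO", "c1ccccc1", "Cc1ccccc1", "CCc1ccccc1"]
mols = [Chem.MolFromSmiles(s) for s in smiles]

generator = rdFingerprintGenerator.GetMorganGenerator(radius=2, fpSize=2048)
fps = [generator.GetFingerprint(m) for m in mols]

# Flattened lower-triangle distance matrix in the order
# (1,0), (2,0), (2,1), (3,0), ... which is what isDistData=True expects
distances = []
for i in range(1, len(fps)):
    sims = DataStructs.BulkTanimotoSimilarity(fps[i], fps[:i])
    distances.extend(1.0 - s for s in sims)

# distThresh=0.6 is an arbitrary cutoff chosen for this sketch
clusters = Butina.ClusterData(distances, len(fps), 0.6, isDistData=True)

for cluster_id, members in enumerate(clusters):
    # The first index in each tuple is the cluster centroid
    print(cluster_id, [smiles[idx] for idx in members])
```

Each molecule index appears in exactly one cluster, and molecules with no neighbors inside the threshold end up as singleton clusters.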
Modern fingerprint generation API (`rdkit.Chem.rdFingerprintGenerator`, RDKit 2020.09+).
- `rdFingerprintGenerator.GetMorganGenerator(radius=2, fpSize=2048)` - Morgan generator
- `rdFingerprintGenerator.GetRDKitFPGenerator(minPath=1, maxPath=7, fpSize=2048)` - RDKit FP generator
- `rdFingerprintGenerator.GetAtomPairGenerator(minDistance=1, maxDistance=30)` - Atom pair generator
- `generator.GetFingerprint(mol)` - Generate fingerprint
- `generator.GetCountFingerprint(mol)` - Count-based fingerprint

Sanitization Flags:
- `SANITIZE_NONE` - No sanitization
- `SANITIZE_ALL` - All operations (default)
- `SANITIZE_CLEANUP` - Basic cleanup
- `SANITIZE_PROPERTIES` - Calculate properties
- `SANITIZE_SYMMRINGS` - Symmetrize rings
- `SANITIZE_KEKULIZE` - Kekulize aromatic rings
- `SANITIZE_FINDRADICALS` - Find radical electrons
- `SANITIZE_SETAROMATICITY` - Set aromaticity
- `SANITIZE_SETCONJUGATION` - Set conjugation
- `SANITIZE_SETHYBRIDIZATION` - Set hybridization
- `SANITIZE_CLEANUPCHIRALITY` - Cleanup chirality

Bond Types:
- `BondType.SINGLE` - Single bond
- `BondType.DOUBLE` - Double bond
- `BondType.TRIPLE` - Triple bond
- `BondType.AROMATIC` - Aromatic bond
- `BondType.DATIVE` - Dative bond
- `BondType.UNSPECIFIED` - Unspecified

Hybridization Types:
- `HybridizationType.S` - S
- `HybridizationType.SP` - SP
- `HybridizationType.SP2` - SP2
- `HybridizationType.SP3` - SP3
- `HybridizationType.SP3D` - SP3D
- `HybridizationType.SP3D2` - SP3D2

Chiral Types:
- `ChiralType.CHI_UNSPECIFIED` - Unspecified
- `ChiralType.CHI_TETRAHEDRAL_CW` - Clockwise
- `ChiralType.CHI_TETRAHEDRAL_CCW` - Counter-clockwise

Installation:

```bash
# Using conda (recommended)
conda install -c conda-forge rdkit

# Using pip (the package is published as "rdkit"; "rdkit-pypi" is the legacy name)
pip install rdkit
```
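
A quick way to confirm that the installation worked is to import the package and parse a trivial molecule. This is only a sanity-check sketch; the ethanol SMILES is an arbitrary choice.

```python
import rdkit
from rdkit import Chem

# Report the installed RDKit version
print(rdkit.__version__)

# Parse ethanol; MolFromSmiles returns None if parsing fails
mol = Chem.MolFromSmiles("CCO")
print(mol.GetNumAtoms())  # counts heavy atoms only: C, C, O → 3
```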
Common Imports:

```python
# Core functionality
from rdkit import Chem
from rdkit.Chem import AllChem

# Descriptors
from rdkit.Chem import Descriptors

# Drawing
from rdkit.Chem import Draw

# Similarity
from rdkit import DataStructs
```
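
To show how these imports fit together, here is a short end-to-end sketch: parse two molecules, compute a few descriptors, run a substructure query, compare fingerprints, and embed a 3D conformer. The molecules (aspirin and salicylic acid), the SMARTS pattern, and the random seed are illustrative assumptions. `rdFingerprintGenerator` is imported in addition to the modules listed above.

```python
from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem, Descriptors, rdFingerprintGenerator

aspirin = Chem.MolFromSmiles("CC(=O)Oc1ccccc1C(=O)O")
salicylic_acid = Chem.MolFromSmiles("O=C(O)c1ccccc1O")

# MolFromSmiles returns None on a parse failure, so check before use
for mol in (aspirin, salicylic_acid):
    if mol is None:
        raise ValueError("could not parse SMILES")

# Descriptors
mol_wt = Descriptors.MolWt(aspirin)
logp = Descriptors.MolLogP(aspirin)
h_donors = Descriptors.NumHDonors(aspirin)

# Substructure search with a SMARTS query for a carboxylic acid group
carboxylic_acid = Chem.MolFromSmarts("C(=O)[OH]")
has_acid = aspirin.HasSubstructMatch(carboxylic_acid)

# Fingerprint similarity using the generator-based API
generator = rdFingerprintGenerator.GetMorganGenerator(radius=2, fpSize=2048)
fp_aspirin = generator.GetFingerprint(aspirin)
fp_salicylic = generator.GetFingerprint(salicylic_acid)
similarity = DataStructs.TanimotoSimilarity(fp_aspirin, fp_salicylic)

# 3D conformer: add hydrogens first, then embed and relax with MMFF
aspirin_h = Chem.AddHs(aspirin)
conf_id = AllChem.EmbedMolecule(aspirin_h, randomSeed=42)  # -1 signals failure
if conf_id != -1:
    AllChem.MMFFOptimizeMolecule(aspirin_h)

print(f"MolWt={mol_wt:.2f} LogP={logp:.2f} HBD={h_donors}")
print(f"Has carboxylic acid: {has_acid}")
print(f"Tanimoto(aspirin, salicylic acid)={similarity:.2f}")
```

Checking for `None` after parsing and for `-1` after embedding covers the two most common silent failure points in a pipeline like this.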