RDKit Molecular Descriptors Reference

Complete reference for molecular descriptors available in RDKit's Descriptors module.

Usage

python
from rdkit import Chem
from rdkit.Chem import Descriptors

mol = Chem.MolFromSmiles('CCO')

# Calculate individual descriptor
mw = Descriptors.MolWt(mol)

# Calculate all descriptors at once (returns a dict mapping descriptor name to value)
all_desc = Descriptors.CalcMolDescriptors(mol)
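
Chem.MolFromSmiles returns None when a SMILES string cannot be parsed, and the descriptor functions expect a valid molecule. A minimal sketch of a guard follows; the helper name descriptor_or_none is hypothetical and not part of RDKit.

```python
from rdkit import Chem
from rdkit.Chem import Descriptors

def descriptor_or_none(smiles, func=Descriptors.MolWt):
    """Return func(mol), or None when the SMILES cannot be parsed."""
    mol = Chem.MolFromSmiles(smiles)  # None on parse failure
    if mol is None:
        return None
    return func(mol)

ethanol_mw = descriptor_or_none('CCO')   # valid SMILES
unparsed = descriptor_or_none('C1CC')    # unclosed ring, so parsing fails
```

Returning None keeps the failure explicit, so callers can decide whether to skip or log the offending input.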

Molecular Weight and Mass

MolWt

Average molecular weight of the molecule.

python
Descriptors.MolWt(mol)

ExactMolWt

Exact (monoisotopic) molecular weight, computed from the mass of the most abundant isotope of each element.

python
Descriptors.ExactMolWt(mol)

HeavyAtomMolWt

Average molecular weight ignoring hydrogens.

python
Descriptors.HeavyAtomMolWt(mol)
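
The three mass descriptors differ only in which atomic masses they sum. The sketch below compares them for ethanol; the approximate values in the comments follow from standard atomic masses.

```python
from rdkit import Chem
from rdkit.Chem import Descriptors

mol = Chem.MolFromSmiles('CCO')  # ethanol, C2H6O

avg_mw = Descriptors.MolWt(mol)             # average atomic weights, about 46.07
exact_mw = Descriptors.ExactMolWt(mol)      # most abundant isotopes, about 46.04
heavy_mw = Descriptors.HeavyAtomMolWt(mol)  # carbon and oxygen only, about 40.02

# The gap between the average and heavy-atom weights is the mass of the six hydrogens
hydrogen_mass = avg_mw - heavy_mw           # about 6.05
```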

Lipophilicity

MolLogP

Wildman-Crippen LogP (octanol-water partition coefficient).

python
Descriptors.MolLogP(mol)

MolMR

Wildman-Crippen molar refractivity.

python
Descriptors.MolMR(mol)

Polar Surface Area

TPSA

Topological polar surface area (TPSA) based on fragment contributions.

python
Descriptors.TPSA(mol)

LabuteASA

Labute's Approximate Surface Area (ASA).

python
Descriptors.LabuteASA(mol)

Hydrogen Bonding

NumHDonors

Number of hydrogen bond donors (N-H and O-H).

python
Descriptors.NumHDonors(mol)

NumHAcceptors

Number of hydrogen bond acceptors (N and O).

python
Descriptors.NumHAcceptors(mol)

NOCount

Number of N and O atoms.

python
Descriptors.NOCount(mol)

NHOHCount

Number of N-H and O-H bonds.

python
Descriptors.NHOHCount(mol)
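
NumHDonors and NHOHCount can disagree because the first counts donor atoms while the second counts the hydrogens attached to N and O. A small sketch using ethylamine, whose primary amine carries two hydrogens on one nitrogen:

```python
from rdkit import Chem
from rdkit.Chem import Descriptors

mol = Chem.MolFromSmiles('CCN')  # ethylamine: one NH2 group

donor_atoms = Descriptors.NumHDonors(mol)    # counts the nitrogen once
nh_oh_bonds = Descriptors.NHOHCount(mol)     # counts both N-H hydrogens
n_o_atoms = Descriptors.NOCount(mol)         # one nitrogen, no oxygen
acceptors = Descriptors.NumHAcceptors(mol)   # pattern-based acceptor count
```

Lipinski's original formulation uses the NHOHCount and NOCount style of counting, so the choice of descriptor can change the outcome of a rule-based filter.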

Atom Counts

HeavyAtomCount

Number of heavy atoms (non-hydrogen).

python
Descriptors.HeavyAtomCount(mol)

NumHeteroatoms

Number of heteroatoms (non-C and non-H).

python
Descriptors.NumHeteroatoms(mol)

NumValenceElectrons

Total number of valence electrons.

python
Descriptors.NumValenceElectrons(mol)

NumRadicalElectrons

Number of radical electrons.

python
Descriptors.NumRadicalElectrons(mol)

Ring Descriptors

RingCount

Number of rings.

python
Descriptors.RingCount(mol)

NumAromaticRings

Number of aromatic rings.

python
Descriptors.NumAromaticRings(mol)

NumSaturatedRings

Number of saturated rings.

python
Descriptors.NumSaturatedRings(mol)

NumAliphaticRings

Number of aliphatic (non-aromatic) rings.

python
Descriptors.NumAliphaticRings(mol)

NumAromaticCarbocycles

Number of aromatic carbocycles (rings with only carbons).

python
Descriptors.NumAromaticCarbocycles(mol)

NumAromaticHeterocycles

Number of aromatic heterocycles (rings with heteroatoms).

python
Descriptors.NumAromaticHeterocycles(mol)

NumSaturatedCarbocycles

Number of saturated carbocycles.

python
Descriptors.NumSaturatedCarbocycles(mol)

NumSaturatedHeterocycles

Number of saturated heterocycles.

python
Descriptors.NumSaturatedHeterocycles(mol)

NumAliphaticCarbocycles

Number of aliphatic carbocycles.

python
Descriptors.NumAliphaticCarbocycles(mol)

NumAliphaticHeterocycles

Number of aliphatic heterocycles.

python
Descriptors.NumAliphaticHeterocycles(mol)
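
The ring categories overlap: every saturated ring is also aliphatic, but an aliphatic ring may contain a double bond and so not be saturated. The sketch below illustrates this with a hypothetical helper named ring_profile.

```python
from rdkit import Chem
from rdkit.Chem import Descriptors

def ring_profile(mol):
    """Collect the main ring counts for one molecule."""
    return {
        'rings': Descriptors.RingCount(mol),
        'aromatic': Descriptors.NumAromaticRings(mol),
        'aliphatic': Descriptors.NumAliphaticRings(mol),
        'saturated': Descriptors.NumSaturatedRings(mol),
    }

# 4-phenylpiperidine: one benzene ring plus one fully saturated piperidine ring
phenylpiperidine = ring_profile(Chem.MolFromSmiles('c1ccccc1C1CCNCC1'))

# cyclohexene: aliphatic (non-aromatic) but not saturated because of the C=C bond
cyclohexene = ring_profile(Chem.MolFromSmiles('C1=CCCCC1'))
```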

Rotatable Bonds

NumRotatableBonds

Number of rotatable bonds (flexibility).

python
Descriptors.NumRotatableBonds(mol)

Aromatic Atoms

NumAromaticAtoms

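Number of aromatic atoms. The Descriptors module does not appear to provide a NumAromaticAtoms function, so the count is taken from the atoms directly.

python
sum(atom.GetIsAromatic() for atom in mol.GetAtoms())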

Fraction Descriptors

FractionCSP3

Fraction of carbons that are sp3 hybridized. Note the capitalization: the function is named FractionCSP3.

python
Descriptors.FractionCSP3(mol)

Complexity Descriptors

BertzCT

Bertz complexity index.

python
Descriptors.BertzCT(mol)

Ipc

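Information content of the coefficients of the characteristic polynomial of the adjacency matrix of the hydrogen-suppressed molecular graph (a complexity measure).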

python
Descriptors.Ipc(mol)

Kappa Shape Indices

Molecular shape descriptors based on graph invariants.

Kappa1

First kappa shape index.

python
Descriptors.Kappa1(mol)

Kappa2

Second kappa shape index.

python
Descriptors.Kappa2(mol)

Kappa3

Third kappa shape index.

python
Descriptors.Kappa3(mol)

Chi Connectivity Indices

Molecular connectivity indices.

Chi0, Chi1

Simple chi connectivity indices. Only orders 0 and 1 are provided as simple indices in the Descriptors module; higher orders are available in the n and v variants below.

python
Descriptors.Chi0(mol)
Descriptors.Chi1(mol)

Chi0n, Chi1n, Chi2n, Chi3n, Chi4n

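Chi connectivity indices computed like the valence (v) variants, but using each atom's number of valence electrons (nVal) in place of the Hall-Kier valence term. The two variants differ mainly for elements beyond the first row.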

python
Descriptors.Chi0n(mol)
Descriptors.Chi1n(mol)
Descriptors.Chi2n(mol)
Descriptors.Chi3n(mol)
Descriptors.Chi4n(mol)

Chi0v, Chi1v, Chi2v, Chi3v, Chi4v

Valence chi connectivity indices.

python
Descriptors.Chi0v(mol)
Descriptors.Chi1v(mol)
Descriptors.Chi2v(mol)
Descriptors.Chi3v(mol)
Descriptors.Chi4v(mol)

Hall-Kier Alpha

HallKierAlpha

Hall-Kier alpha value, an atom-size and hybridization correction used in the kappa shape and flexibility indices.

python
Descriptors.HallKierAlpha(mol)

Balaban's J Index

BalabanJ

Balaban's J index (branching descriptor).

python
Descriptors.BalabanJ(mol)

EState Indices

Electrotopological state indices.

MaxEStateIndex

Maximum E-state value.

python
Descriptors.MaxEStateIndex(mol)

MinEStateIndex

Minimum E-state value.

python
Descriptors.MinEStateIndex(mol)

MaxAbsEStateIndex

Maximum absolute E-state value.

python
Descriptors.MaxAbsEStateIndex(mol)

MinAbsEStateIndex

Minimum absolute E-state value.

python
Descriptors.MinAbsEStateIndex(mol)

Partial Charges

MaxPartialCharge

Maximum Gasteiger partial charge among the atoms of the molecule.

python
Descriptors.MaxPartialCharge(mol)

MinPartialCharge

Minimum Gasteiger partial charge among the atoms of the molecule.

python
Descriptors.MinPartialCharge(mol)

MaxAbsPartialCharge

Larger of the absolute values of the maximum and minimum partial charges.

python
Descriptors.MaxAbsPartialCharge(mol)

MinAbsPartialCharge

Smaller of the absolute values of the maximum and minimum partial charges.

python
Descriptors.MinAbsPartialCharge(mol)
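
The partial-charge descriptors are derived from Gasteiger charges, and they are known to return non-finite values (NaN or inf) for some molecules where charge assignment fails. A minimal sketch of a guard, assuming such values should be treated as missing; the helper name finite_or_none is hypothetical.

```python
import math

from rdkit import Chem
from rdkit.Chem import Descriptors

def finite_or_none(value):
    """Map NaN and infinite descriptor values to None."""
    if value is None or not math.isfinite(value):
        return None
    return value

mol = Chem.MolFromSmiles('CCO')
charge_funcs = {
    'MaxPartialCharge': Descriptors.MaxPartialCharge,
    'MinPartialCharge': Descriptors.MinPartialCharge,
}
charges = {name: finite_or_none(func(mol)) for name, func in charge_funcs.items()}
```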

Fingerprint Density

Measures the density of molecular fingerprints.

FpDensityMorgan1

Morgan fingerprint density at radius 1.

python
Descriptors.FpDensityMorgan1(mol)

FpDensityMorgan2

Morgan fingerprint density at radius 2.

python
Descriptors.FpDensityMorgan2(mol)

FpDensityMorgan3

Morgan fingerprint density at radius 3.

python
Descriptors.FpDensityMorgan3(mol)

PEOE VSA Descriptors

Partial Equalization of Orbital Electronegativities (PEOE) VSA descriptors.

PEOE_VSA1 through PEOE_VSA14

MOE-type descriptors using partial charges and surface area contributions.

python
Descriptors.PEOE_VSA1(mol)
# ... through PEOE_VSA14

SMR VSA Descriptors

Molecular refractivity VSA descriptors.

SMR_VSA1 through SMR_VSA10

MOE-type descriptors using MR contributions and surface area.

python
Descriptors.SMR_VSA1(mol)
# ... through SMR_VSA10

SLogP VSA Descriptors

LogP VSA descriptors.

SlogP_VSA1 through SlogP_VSA12

MOE-type descriptors using LogP contributions and surface area. Note the capitalization: the functions are named SlogP_VSA, not SLogP_VSA.

python
Descriptors.SlogP_VSA1(mol)
# ... through SlogP_VSA12

EState VSA Descriptors

EState_VSA1 through EState_VSA11

MOE-type descriptors using E-state indices and surface area.

python
Descriptors.EState_VSA1(mol)
# ... through EState_VSA11

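VSA EState Descriptors

MOE-type descriptors that bin atoms by their van der Waals surface area (VSA) contribution and sum the E-state indices in each bin. This is the converse of EState_VSA, which bins atoms by E-state index and sums surface area.

VSA_EState1 through VSA_EState10

E-state sums over VSA bins.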

python
Descriptors.VSA_EState1(mol)
# ... through VSA_EState10
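
Rather than calling each numbered function by hand, a whole family can be collected by name prefix from the descriptor registry. The sketch below assumes Descriptors._descList holds (name, function) pairs, as used elsewhere in this reference; the helper name descriptor_family is hypothetical.

```python
from rdkit import Chem
from rdkit.Chem import Descriptors

def descriptor_family(mol, prefix):
    """Evaluate every registered descriptor whose name starts with prefix."""
    return {
        name: func(mol)
        for name, func in Descriptors._descList
        if name.startswith(prefix)
    }

mol = Chem.MolFromSmiles('CCO')
peoe = descriptor_family(mol, 'PEOE_VSA')     # PEOE_VSA1 .. PEOE_VSA14
smr = descriptor_family(mol, 'SMR_VSA')       # SMR_VSA1 .. SMR_VSA10
slogp = descriptor_family(mol, 'SlogP_VSA')   # registered spelling is SlogP_VSA
estate_vsa = descriptor_family(mol, 'EState_VSA')
vsa_estate = descriptor_family(mol, 'VSA_EState')
```

Matching on the prefix avoids hard-coding the number of bins in each family.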

BCUT Descriptors

Burden-CAS-University of Texas eigenvalue descriptors.

BCUT2D_MWHI

Highest eigenvalue of Burden matrix weighted by molecular weight.

python
Descriptors.BCUT2D_MWHI(mol)

BCUT2D_MWLOW

Lowest eigenvalue of Burden matrix weighted by molecular weight.

python
Descriptors.BCUT2D_MWLOW(mol)

BCUT2D_CHGHI

Highest eigenvalue weighted by partial charges.

python
Descriptors.BCUT2D_CHGHI(mol)

BCUT2D_CHGLO

Lowest eigenvalue weighted by partial charges.

python
Descriptors.BCUT2D_CHGLO(mol)

BCUT2D_LOGPHI

Highest eigenvalue weighted by LogP.

python
Descriptors.BCUT2D_LOGPHI(mol)

BCUT2D_LOGPLOW

Lowest eigenvalue weighted by LogP.

python
Descriptors.BCUT2D_LOGPLOW(mol)

BCUT2D_MRHI

Highest eigenvalue weighted by molar refractivity.

python
Descriptors.BCUT2D_MRHI(mol)

BCUT2D_MRLOW

Lowest eigenvalue weighted by molar refractivity.

python
Descriptors.BCUT2D_MRLOW(mol)

Autocorrelation Descriptors

AUTOCORR2D

2D autocorrelation descriptors (if enabled). Autocorrelation indices that measure how atomic properties are distributed over topological (bond-count) distances in the molecular graph.

MQN Descriptors

Molecular Quantum Numbers - 42 simple descriptors.

mqn1 through mqn42

Integer descriptors counting various molecular features. They are returned together as a single list rather than as individually named functions.

python
# MQNs are not included in CalcMolDescriptors; compute them with rdMolDescriptors
from rdkit.Chem import rdMolDescriptors
mqns = rdMolDescriptors.MQNs_(mol)  # list of 42 integers

QED

qed

Quantitative Estimate of Drug-likeness.

python
Descriptors.qed(mol)

Lipinski's Rule of Five

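Check drug-likeness using Lipinski's criteria. The function below is a strict variant that requires all four criteria to pass, whereas the rule is commonly applied with one violation tolerated: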

python
def lipinski_rule_of_five(mol):
    mw = Descriptors.MolWt(mol) <= 500
    logp = Descriptors.MolLogP(mol) <= 5
    hbd = Descriptors.NumHDonors(mol) <= 5
    hba = Descriptors.NumHAcceptors(mol) <= 10
    return mw and logp and hbd and hba
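
A boolean result hides which criterion failed. The sketch below is a variant that reports violations by name, assuming the same four thresholds; the function name lipinski_violations is hypothetical.

```python
from rdkit import Chem
from rdkit.Chem import Descriptors

def lipinski_violations(mol):
    """Return the names of the Rule of Five criteria that mol fails."""
    checks = {
        'MW': Descriptors.MolWt(mol) <= 500,
        'LogP': Descriptors.MolLogP(mol) <= 5,
        'HBD': Descriptors.NumHDonors(mol) <= 5,
        'HBA': Descriptors.NumHAcceptors(mol) <= 10,
    }
    return [name for name, passed in checks.items() if not passed]

ethanol = lipinski_violations(Chem.MolFromSmiles('CCO'))
# Triacontane (C30H62) is highly lipophilic, so only the LogP criterion fails
triacontane = lipinski_violations(Chem.MolFromSmiles('C' * 30))

# Common convention: tolerate at most one violation
triacontane_passes = len(triacontane) <= 1
```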

Batch Descriptor Calculation

Calculate all descriptors at once:

python
from rdkit import Chem
from rdkit.Chem import Descriptors

mol = Chem.MolFromSmiles('CCO')

# Get all descriptors as dictionary
all_descriptors = Descriptors.CalcMolDescriptors(mol)

# Access specific descriptor
mw = all_descriptors['MolWt']
logp = all_descriptors['MolLogP']

# Get list of available descriptor names (_descList holds (name, function) pairs)
descriptor_names = [desc[0] for desc in Descriptors._descList]
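
For more than one molecule, the same call can be applied in a loop. This is a minimal sketch that assumes an RDKit release recent enough to provide CalcMolDescriptors; the function name describe_smiles and the row layout are hypothetical.

```python
from rdkit import Chem
from rdkit.Chem import Descriptors

def describe_smiles(smiles_list, names=('MolWt', 'MolLogP', 'TPSA')):
    """Build one row of selected descriptors per SMILES, flagging parse failures."""
    rows = []
    for smi in smiles_list:
        mol = Chem.MolFromSmiles(smi)
        if mol is None:
            # Keep a placeholder row so output stays aligned with the input
            rows.append({'smiles': smi, 'valid': False})
            continue
        all_desc = Descriptors.CalcMolDescriptors(mol)
        row = {'smiles': smi, 'valid': True}
        row.update({name: all_desc[name] for name in names})
        rows.append(row)
    return rows

rows = describe_smiles(['CCO', 'c1ccccc1', 'C1CC'])  # the last SMILES is invalid
```

When only a few descriptors are needed, calling them individually is likely cheaper than computing the full set for every molecule.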

Descriptor Categories Summary

  1. Physicochemical: MolWt, MolLogP, MolMR, TPSA
  2. Topological: BertzCT, BalabanJ, Kappa indices
  3. Electronic: Partial charges, E-state indices
  4. Shape: Kappa indices, BCUT descriptors
  5. Connectivity: Chi indices
  6. 2D Fingerprints: FpDensity descriptors
  7. Atom counts: Heavy atoms, heteroatoms, rings
  8. Drug-likeness: QED, Lipinski parameters
  9. Flexibility: NumRotatableBonds, HallKierAlpha
  10. Surface area: VSA-based descriptors

Common Use Cases

Drug-likeness Screening

python
def screen_druglikeness(mol):
    return {
        'MW': Descriptors.MolWt(mol),
        'LogP': Descriptors.MolLogP(mol),
        'HBD': Descriptors.NumHDonors(mol),
        'HBA': Descriptors.NumHAcceptors(mol),
        'TPSA': Descriptors.TPSA(mol),
        'RotBonds': Descriptors.NumRotatableBonds(mol),
        'AromaticRings': Descriptors.NumAromaticRings(mol),
        'QED': Descriptors.qed(mol)
    }

Lead-like Filtering

python
def is_leadlike(mol):
    mw = 250 <= Descriptors.MolWt(mol) <= 350
    logp = Descriptors.MolLogP(mol) <= 3.5
    rot_bonds = Descriptors.NumRotatableBonds(mol) <= 7
    return mw and logp and rot_bonds

Diversity Analysis

python
def molecular_complexity(mol):
    return {
        'BertzCT': Descriptors.BertzCT(mol),
        'NumRings': Descriptors.RingCount(mol),
        'NumRotBonds': Descriptors.NumRotatableBonds(mol),
        'FractionCSP3': Descriptors.FractionCSP3(mol),
        'NumAromaticRings': Descriptors.NumAromaticRings(mol)
    }

Tips

  1. Use batch calculation for multiple descriptors to avoid redundant computations
  2. Check for None and NaN - Chem.MolFromSmiles returns None for unparsable input, and some descriptors (for example the partial-charge descriptors) can return NaN
  3. Normalize descriptors for machine learning applications
  4. Select relevant descriptors - not all 200+ descriptors are useful for every task
  5. Consider 3D descriptors separately - they live in the Descriptors3D module and require a molecule with 3D coordinates (a conformer)
  6. Validate ranges - check if descriptor values are in expected ranges
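
As an illustration of the tip on 3D descriptors, the sketch below generates a single conformer before calling functions from the Descriptors3D module. The choice of ethanol, the random seed, and the two descriptors shown are arbitrary assumptions for the example.

```python
from rdkit import Chem
from rdkit.Chem import AllChem, Descriptors3D

# Explicit hydrogens generally give more realistic embedded geometries
mol = Chem.AddHs(Chem.MolFromSmiles('CCO'))

# EmbedMolecule returns 0 on success and -1 if no conformer could be generated
status = AllChem.EmbedMolecule(mol, randomSeed=42)

if status == 0:
    radius_of_gyration = Descriptors3D.RadiusOfGyration(mol)
    npr1 = Descriptors3D.NPR1(mol)  # normalized principal moment ratio I1/I3
else:
    radius_of_gyration = npr1 = None
```

The values depend on the generated conformer, so fixing the seed keeps results reproducible between runs.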