scientific-skills/medchem/references/rules_catalog.md
Catalog of the medicinal chemistry rules, structural alerts, and filters available in medchem.
Reference: Lipinski et al., Adv Drug Deliv Rev (1997) 23:3-25
Purpose: Predict oral bioavailability
Criteria:
Usage:
import medchem as mc  # alias used by every example in this catalog

mc.rules.basic_rules.rule_of_five(mol)
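For orientation, the check can be sketched in plain Python on precomputed descriptors. This is not medchem's implementation: the `Descriptors` container and the `rule_of_five_violations` helper are hypothetical, the thresholds are the ones published by Lipinski et al. (MW ≤ 500, logP ≤ 5, H-bond donors ≤ 5, H-bond acceptors ≤ 10), and the descriptor values are illustrative rather than computed from a structure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    """Precomputed properties (hypothetical container, for illustration only)."""
    mw: float    # molecular weight in Da
    logp: float  # calculated logP
    hbd: int     # hydrogen-bond donors
    hba: int     # hydrogen-bond acceptors


def rule_of_five_violations(d: Descriptors) -> list[str]:
    """Return the names of the published Lipinski criteria that `d` violates."""
    checks = {
        "mw": d.mw <= 500,
        "logp": d.logp <= 5,
        "hbd": d.hbd <= 5,
        "hba": d.hba <= 10,
    }
    return [name for name, ok in checks.items() if not ok]


# Illustrative values only (not computed from real structures).
small_polar = Descriptors(mw=180.2, logp=1.2, hbd=1, hba=4)
large_greasy = Descriptors(mw=720.0, logp=6.5, hbd=2, hba=9)

print(rule_of_five_violations(small_polar))
print(rule_of_five_violations(large_greasy))
```

Returning the list of violated criteria, rather than a single Boolean, leaves the caller free to decide how many violations to tolerate.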
Notes:
Reference: Veber et al., J Med Chem (2002) 45:2615-2623
Purpose: Additional criteria for oral bioavailability
Criteria:
Usage:
mc.rules.basic_rules.rule_of_veber(mol)
Notes:
Purpose: Combined drug-likeness assessment
Criteria:
Usage:
mc.rules.basic_rules.rule_of_drug(mol)
Reference: Walters & Murcko, Adv Drug Deliv Rev (2002) 54:255-271
Purpose: Filter out compounds unlikely to be drugs
Criteria:
Usage:
mc.rules.basic_rules.rule_of_reos(mol)
Reference: Johnson et al., Bioorg Med Chem Lett (2009) 19:5560-5564
Purpose: Balance lipophilicity and molecular weight
Criteria:
Usage:
mc.rules.basic_rules.golden_triangle(mol)
Notes:
Reference: Oprea et al., J Chem Inf Comput Sci (2001) 41:1308-1315
Purpose: Identify lead-like compounds for optimization
Criteria:
Usage:
mc.rules.basic_rules.rule_of_oprea(mol)
Rationale: Lead compounds should have "room to grow" during optimization
Purpose: Permissive lead-like criteria
Criteria:
Usage:
mc.rules.basic_rules.rule_of_leadlike_soft(mol)
Purpose: Restrictive lead-like criteria
Criteria:
Usage:
mc.rules.basic_rules.rule_of_leadlike_strict(mol)
Reference: Congreve et al., Drug Discov Today (2003) 8:876-877
Purpose: Screen fragment libraries for fragment-based drug discovery
Criteria:
Usage:
mc.rules.basic_rules.rule_of_three(mol)
Notes:
Purpose: Central nervous system drug-likeness
Criteria:
Usage:
mc.rules.basic_rules.rule_of_cns(mol)
Rationale:
Reference: Baell & Holloway, J Med Chem (2010) 53:2719-2740
Purpose: Identify compounds that interfere with assays
Categories:
Usage:
mc.rules.basic_rules.pains_filter(mol)
# Returns True if NO PAINS found
Notes:
Source: Derived from ChEMBL curation and medicinal chemistry literature
Purpose: Flag common problematic structural patterns
Alert Categories:
Reactive Groups
Metabolic Liabilities
Aggregators
Toxicophores
Usage:
alert_filter = mc.structural.CommonAlertsFilters()
result = alert_filter.check_mol(mol)
has_alerts, details = result["has_alerts"], result["alert_details"]
Return Format:
{
    "has_alerts": True,
    "alert_details": ["reactive_epoxide", "metabolic_hydrazine"],
    "num_alerts": 2
}
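A result in this shape can be summarized per alert category. The sketch below is a minimal illustration, assuming alert names follow a `<category>_<name>` convention as the two example entries suggest; that convention and the `alerts_by_category` helper are assumptions, not documented behavior.

```python
from collections import Counter

# Example result, copied from the return format shown above.
result = {
    "has_alerts": True,
    "alert_details": ["reactive_epoxide", "metabolic_hydrazine"],
    "num_alerts": 2,
}


def alerts_by_category(result: dict) -> Counter:
    """Count alerts per category, assuming names look like '<category>_<name>'."""
    return Counter(name.split("_", 1)[0] for name in result["alert_details"])


counts = alerts_by_category(result)
print(dict(counts))
```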
Source: Novartis Institutes for BioMedical Research
Purpose: Industrial medicinal chemistry filtering rules
Features:
Usage:
nibr_filter = mc.structural.NIBRFilters()
results = nibr_filter(mols=mol_list, n_jobs=-1)
Return Format: Boolean list (True = passes)
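Given a Boolean list of this kind, splitting the input into passing and rejected molecules is a one-liner. The sketch below uses placeholder strings in place of molecules and a hand-written `results` list, so it illustrates only the bookkeeping, not the filter itself.

```python
from itertools import compress

# Hypothetical placeholders standing in for molecules and filter output.
mol_list = ["mol_a", "mol_b", "mol_c"]
results = [True, False, True]  # True = passes

# Keep the entries whose flag is True.
passing = list(compress(mol_list, results))

# Collect the rest for later inspection.
rejected = [m for m, ok in zip(mol_list, results) if not ok]

print(passing, rejected)
```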
Reference: Based on Eli Lilly medicinal chemistry rules
Source: 275 structural patterns accumulated over 18 years
Purpose: Identify assay interference and problematic functionalities
Mechanism:
Demerit Categories:
High Demerits (>50):
Medium Demerits (20-50):
Low Demerits (5-20):
Usage:
lilly_filter = mc.structural.LillyDemeritsFilters()
results = lilly_filter(mols=mol_list, n_jobs=-1)
Return Format:
{
    "demerits": 35,
    "passes": True,  # demerits ≤ 100
    "matched_patterns": [
        {"pattern": "phenolic_ester", "demerits": 20},
        {"pattern": "aniline_derivative", "demerits": 15}
    ]
}
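The additive scoring implied by this format can be sketched as follows. The `score_demerits` helper is hypothetical and covers only the sum-and-threshold step (total demerits ≤ 100 passes); the actual Lilly rules may apply further criteria that this sketch does not model.

```python
DEMERIT_CUTOFF = 100  # pass threshold taken from the return format above


def score_demerits(matched_patterns: list[dict], cutoff: int = DEMERIT_CUTOFF) -> dict:
    """Sum per-pattern demerits and apply the pass/fail cutoff."""
    total = sum(p["demerits"] for p in matched_patterns)
    return {
        "demerits": total,
        "passes": total <= cutoff,
        "matched_patterns": matched_patterns,
    }


# The two matches from the example above: 20 + 15 demerits.
matched = [
    {"pattern": "phenolic_ester", "demerits": 20},
    {"pattern": "aniline_derivative", "demerits": 15},
]
summary = score_demerits(matched)
print(summary["demerits"], summary["passes"])
```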
Purpose: Identify kinase hinge-binding motifs
Common Patterns:
Usage:
group = mc.groups.ChemicalGroup(groups=["hinge_binders"])
has_hinge = group.has_match(mol_list)
Application: Kinase inhibitor design
Purpose: Identify phosphate-binding groups
Common Patterns:
Usage:
group = mc.groups.ChemicalGroup(groups=["phosphate_binders"])
Application: Kinase inhibitors, phosphatase inhibitors
Purpose: Identify electrophilic Michael acceptor groups
Common Patterns:
Usage:
group = mc.groups.ChemicalGroup(groups=["michael_acceptors"])
Notes:
Purpose: Identify generally reactive functionalities
Common Patterns:
Usage:
group = mc.groups.ChemicalGroup(groups=["reactive_groups"])
Define custom structural patterns using SMARTS:
custom_patterns = {
    "my_warhead": "[C;H0](=O)C(F)(F)F",    # Trifluoromethyl ketone
    "my_scaffold": "c1ccc2c(c1)ncc(n2)N",  # 2-Aminoquinoxaline
}
group = mc.groups.ChemicalGroup(
    groups=["hinge_binders"],
    custom_smarts=custom_patterns
)
Recommended filters:
rfilter = mc.rules.RuleFilters(rule_list=["rule_of_five", "pains_filter"])
alert_filter = mc.structural.CommonAlertsFilters()
Recommended filters:
rfilter = mc.rules.RuleFilters(rule_list=["rule_of_oprea"])
nibr_filter = mc.structural.NIBRFilters()
lilly_filter = mc.structural.LillyDemeritsFilters()
Recommended filters:
rfilter = mc.rules.RuleFilters(rule_list=["rule_of_drug", "rule_of_leadlike_strict"])
alert_filter = mc.structural.CommonAlertsFilters()
complexity_filter = mc.complexity.ComplexityFilter(max_complexity=400)
Recommended filters:
rfilter = mc.rules.RuleFilters(rule_list=["rule_of_cns"])
constraints = mc.constraints.Constraints(
    tpsa_max=90,
    hbd_max=2,
    mw_range=(300, 450)
)
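To make the three bounds concrete, the sketch below applies them to precomputed descriptor values in plain Python. The `within_cns_bounds` helper is hypothetical, the molecular-weight range is treated as inclusive at both ends (an assumption), and the inputs are illustrative numbers rather than computed properties.

```python
# Bounds copied from the constraint example above.
TPSA_MAX = 90.0
HBD_MAX = 2
MW_RANGE = (300.0, 450.0)


def within_cns_bounds(tpsa: float, hbd: int, mw: float) -> bool:
    """True when all three bounds hold (range ends inclusive, by assumption)."""
    low, high = MW_RANGE
    return tpsa <= TPSA_MAX and hbd <= HBD_MAX and low <= mw <= high


# Illustrative descriptor values only.
print(within_cns_bounds(tpsa=65.0, hbd=1, mw=320.0))
print(within_cns_bounds(tpsa=110.0, hbd=1, mw=320.0))
```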
Recommended filters:
rfilter = mc.rules.RuleFilters(rule_list=["rule_of_three"])
complexity_filter = mc.complexity.ComplexityFilter(max_complexity=250)
Filters are guidelines, not absolutes:
False Positives (good drugs flagged):
False Negatives (bad compounds passing):
Different contexts require different criteria:
Modern approaches combine rules with ML:
# Rule-based pre-filtering
rule_results = mc.rules.RuleFilters(rule_list=["rule_of_five"])(mols)
filtered_mols = [mol for mol, r in zip(mols, rule_results) if r["passes"]]
# ML model scoring on filtered set
ml_scores = ml_model.predict(filtered_mols)
# Combined decision
final_candidates = [
    mol for mol, score in zip(filtered_mols, ml_scores)
    if score > threshold
]
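The same two-stage pattern can be exercised end to end without a chemistry toolkit or a trained model. In the sketch below, `select_candidates`, `toy_rules`, and `toy_score` are hypothetical stand-ins: the "rule" merely limits SMILES length and the "model" scores the fraction of oxygen characters, so only the control flow carries over to a real pipeline.

```python
from typing import Callable, Sequence


def select_candidates(
    mols: Sequence[str],
    passes_rules: Callable[[str], bool],
    score: Callable[[str], float],
    threshold: float,
) -> list[str]:
    """Keep molecules that pass the rule filter and score above `threshold`."""
    filtered = [m for m in mols if passes_rules(m)]       # cheap rule-based cut first
    return [m for m in filtered if score(m) > threshold]  # score only the survivors


def toy_rules(smiles: str) -> bool:
    """Hypothetical rule: accept SMILES strings of at most 10 characters."""
    return len(smiles) <= 10


def toy_score(smiles: str) -> float:
    """Hypothetical model: fraction of characters that are 'O'."""
    return smiles.count("O") / len(smiles)


mols = ["CCO", "c1ccccc1", "CC(=O)O", "C" * 30]
final_candidates = select_candidates(mols, toy_rules, toy_score, threshold=0.2)
print(final_candidates)
```

Running the inexpensive rule filter first means the scoring function is only ever called on the survivors.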