The Discovery Pipeline

How We Trace Plants to Protein Targets

Three steps connect centuries of traditional knowledge to modern drug targets. Here's what happens at each stage — and why the convergence signal works.

The Discovery Funnel

Confidence ≥ 0.7 = composite score: binding evidence × literature co-occurrence × pathway centrality (each 0–1, multiplicative)

Traditional Knowledge

Why convergence across traditions is the first signal

Traditional medicine systems encode thousands of years of empirical observation. When traditions with limited historical cross-pollination document the same plant for the same therapeutic purpose, that convergence is a meaningful pharmacological signal — even accounting for documented trade-route contact, mechanism-level agreement is unlikely to be coincidental.

We aggregate usage data from four ethnobotanical corpora (IMPPAT, TCMBank, ETCM, and the Unani pharmacopeia), cross-referenced by plant name, family, and chemical synonym. LOTUS bridges compound–organism pairs and supplies chemical structures; usage data comes exclusively from the ethnobotanical corpora. A plant documented across three or more traditions earns elevated priority in the compound pipeline. Documentation must be independent, so cross-citations within a single tradition don't count.
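As a rough illustration of the independence rule, the sketch below counts distinct traditions rather than distinct corpora. The corpus-to-tradition mapping, the record layout, and the placeholder plants are assumptions made for the example, not the platform's actual schema.

```python
# Assumed mapping (illustrative only): TCMBank and ETCM both document TCM,
# so together they contribute a single tradition.
CORPUS_TRADITION = {
    "IMPPAT": "Ayurveda",
    "TCMBank": "TCM",
    "ETCM": "TCM",
    "Unani pharmacopeia": "Unani",
}


def tradition_count(records, plant):
    """Count the distinct traditions that document `plant`."""
    traditions = {
        CORPUS_TRADITION[corpus]
        for name, corpus, _use in records
        if name == plant
    }
    return len(traditions)


def is_priority(records, plant, min_traditions=3):
    """Apply the 'three or more traditions' threshold described above."""
    return tradition_count(records, plant) >= min_traditions


# Placeholder records: (plant, corpus, documented use). Not real data.
records = [
    ("plant_a", "IMPPAT", "use_x"),
    ("plant_a", "TCMBank", "use_x"),
    ("plant_a", "ETCM", "use_x"),
    ("plant_a", "Unani pharmacopeia", "use_x"),
    ("plant_b", "TCMBank", "use_y"),
    ("plant_b", "ETCM", "use_y"),
]

print(tradition_count(records, "plant_a"), is_priority(records, "plant_a"))
print(tradition_count(records, "plant_b"), is_priority(records, "plant_b"))
```

Here plant_b appears in two corpora but only one tradition, so it does not earn elevated priority under this reading of the rule.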

Turmeric: 3 independent traditions →

Ayurveda

— inflammation
— skin disorders
— digestive aid
— wound healing

TCM

— blood stasis
— pain relief
— jaundice
— menstrual disorders

Western

— anti-inflammatory
— antioxidant
— joint health
— digestive support

Bioactive Compounds

Scoring druggability from first principles

Every documented plant yields dozens to hundreds of phytochemicals. We profile each compound against Lipinski's Rule of Five (MW ≤ 500, LogP ≤ 5, ≤ 5 H-bond donors, ≤ 10 H-bond acceptors) and compute a Quantitative Estimate of Druglikeness (QED), a single score derived from eight molecular properties.
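The Rule of Five thresholds above translate directly into a check on precomputed descriptors. This is a minimal sketch: the `Descriptors` container and `lipinski_violations` helper are hypothetical names, and the H-bond counts in the examples are illustrative inputs rather than values taken from this page.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    mw: float    # molecular weight
    logp: float  # octanol-water partition coefficient
    hbd: int     # hydrogen-bond donors
    hba: int     # hydrogen-bond acceptors


def lipinski_violations(d):
    """Return how many of the four Rule of Five thresholds are exceeded."""
    checks = [d.mw <= 500, d.logp <= 5, d.hbd <= 5, d.hba <= 10]
    return sum(not ok for ok in checks)


# MW and LogP as listed for curcumin on this page; hbd/hba are assumed inputs.
curcumin = Descriptors(mw=368.4, logp=3.2, hbd=2, hba=6)
# A made-up compound that exceeds the MW and LogP limits.
too_heavy = Descriptors(mw=650.0, logp=6.1, hbd=2, hba=6)

print(lipinski_violations(curcumin))
print(lipinski_violations(too_heavy))
```

Returning a violation count rather than a pass/fail flag leaves the choice of how many violations to tolerate to the caller.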

QED ranges from 0 to 1; median QED for approved oral drugs is ~0.5 (Bickerton et al., 2012). We classify compounds as drug-like (QED ≥ 0.5), lead-like (QED ≥ 0.35, lower MW), or fragment-like (MW < 250 — small, high-efficiency starting points).
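One way to read these tiers as code is sketched below. Two things are assumptions: the fragment-like rule is checked first (which matches ar-Turmerone being labelled fragment-like despite a QED above 0.5), and the "lower MW" cutoff for lead-like, which the text does not specify, is exposed as a hypothetical `lead_max_mw` parameter.

```python
def classify(qed, mw, lead_max_mw=350.0):
    """Assign a tier from QED and molecular weight.

    lead_max_mw is a placeholder for the unspecified 'lower MW' cutoff;
    350.0 is an assumption chosen for illustration only.
    """
    if mw < 250:
        return "fragment-like"   # assumed to take precedence over QED tiers
    if qed >= 0.5:
        return "drug-like"
    if qed >= 0.35 and mw <= lead_max_mw:
        return "lead-like"
    return "unclassified"        # label not used on this page; added for completeness


# (QED, MW) pairs as listed for the three example compounds on this page.
print(classify(0.52, 368.4))  # Curcumin
print(classify(0.48, 338.4))  # Bisdemethoxycurcumin
print(classify(0.61, 218.3))  # ar-Turmerone
```

With these assumptions the function reproduces the labels shown for the three example compounds; a different cutoff or rule order could change the result for borderline compounds.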

ar-Turmerone (QED 0.61) outscores curcumin (QED 0.52) on drug-likeness, illustrating why systematic profiling surfaces better candidates than literature prominence alone. Curcumin still clears the drug-like threshold, but it is a known PAINS compound: a reminder that no single metric replaces orthogonal screening.

Curcumin · drug-like · QED 0.52 · MW 368.4 · LogP 3.2

Bisdemethoxycurcumin · lead-like · QED 0.48 · MW 338.4 · LogP 2.9

ar-Turmerone · fragment-like · QED 0.61 · MW 218.3 · LogP 3.5

"When the targets from a plant's compounds align with the pathways its traditions documented — inflammation, neuroprotection, antimicrobial — that overlap is convergence at the molecular level."

The signal the platform is built to surface

Target Network

Confidence scoring across three evidence sources

Protein targets are identified by cross-referencing STITCH (chemical–protein interactions from text mining and experiments) and ChEMBL (binding assay data curated from literature). Open Targets contributes pathway centrality scores for identified targets — informing one dimension of the composite confidence, not the compound–target binding signal itself.

For each compound–target pair we compute a composite confidence: binding evidence strength × literature co-occurrence × pathway centrality (each normalized 0–1; multiplicative scoring penalizes weak evidence in any single dimension). Scores of 0.7 or above indicate high confidence; scores from 0.5 up to 0.7 indicate moderate confidence. Edge weight in the network graph reflects that score.
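The multiplicative score can be sketched in a few lines. The function names and the "low" label for scores under 0.5 are assumptions added for the example; the 0.7 boundary is treated as inclusive, following the funnel note (Confidence ≥ 0.7).

```python
def composite_confidence(binding, literature, centrality):
    """Multiply three evidence scores, each expected to lie in [0, 1]."""
    for name, value in (
        ("binding", binding),
        ("literature", literature),
        ("centrality", centrality),
    ):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    return binding * literature * centrality


def tier(score):
    """Map a composite score to a confidence tier."""
    if score >= 0.7:
        return "high"
    if score >= 0.5:
        return "moderate"
    return "low"  # assumed label; the page names only high and moderate


# Made-up component scores, not platform data.
strong = composite_confidence(0.95, 0.95, 0.90)
uneven = composite_confidence(0.90, 0.90, 0.50)

print(round(strong, 5), tier(strong))
print(round(uneven, 5), tier(uneven))
```

Because every factor is at most 1, a composite of 0.7 or more requires each individual factor to be at least 0.7, and the three factors must average roughly 0.89 (the cube root of 0.7). A single weak dimension, as in the second example, pulls the pair out of the high and moderate tiers entirely.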

COX-2 · PTGS2 · Inflammation · confidence 0.91

NF-κB p65 · RELA · Inflammation / Immunity · confidence 0.87

AKT1 · AKT1 · PI3K / Cancer signaling · confidence 0.74

GSK-3β · GSK3B · Neuroprotection / Tau · confidence 0.68

The full platform goes deeper — formulation builder, ADMET profiles, scaffold analysis, and clinical pathway linkages.

