Virtual Screening for a Class Assignment: Dock a Small Ligand Library
A class “virtual screening” assignment is not a pharma-scale billion-compound campaign. It is a structured SAR exercise: one receptor, one pocket, 15–40 related analogs, ranked by AutoDock Vina, explained with interactions and limitations. This guide covers how to scope the library, budget Dock credits, run one batch job correctly, and present results so instructors see scientific reasoning — not a screenshot of negative numbers.
Related: receptor prep · reading scores & poses · choosing a PDB.
What instructors actually want (vs industry VS)
Industry virtual screening filters millions of compounds, enforces diversity, and validates hits in assays (SBVS review). Coursework instead tests whether you can:
- Pick a defensible receptor and binding site.
- Prepare ligands consistently (protonation, 3D, PDBQT).
- Rank a homologous series and relate scores to substituents.
- Visualize top poses and cite H-bonds / hydrophobic contacts.
- State what computation cannot prove (no fake IC50, no “drug approved”).
20–40 analogs is usually enough for a strong report; 100+ without analysis depth often hurts more than it helps on a one-week deadline.
One-week workflow
Step 1 — Scope the compound library
How many compounds?
| Assignment style | Suggested count | Discussion depth |
|---|---|---|
| Intro med-chem / biochem lab | 10–20 | Full table + 2 figures feasible |
| SAR-focused project | 20–40 | Ideal for substituent trends |
| “Mini screen” rubric | 40–50 | Need strict filtering to top 5 for prose |
| Free Dock tier | ≤ 3 per job | Use for pilot only; upgrade for real series |
Where to get SMILES
- Course handout (NSAID analogs, kinase fragments, natural product derivatives).
- Draw in ChemDraw / Marvin → copy SMILES.
- PubChem / ChEMBL for known actives as positive controls (cite source).
- Include 1–2 decoys (very polar or bulky molecules) if the rubric asks for critical thinking; a sound setup should rank them poorly, and their poses should explain why (missing or strained interactions).
Batch input format on Dock
Paste one SMILES per line (lines starting with # are comments). Names in reports are auto-assigned ligand_1, ligand_2, … unless you upload a multi-record SDF with title lines. Duplicate SMILES are rejected at parse time.
# SAR set — protease inhibitors (example)
CCO
CC(=O)Oc1ccccc1C(=O)O
...
Platform hard cap: 300 ligands/job. Plan limits: Free 3 · Student 30 · Research 100 · Lab 300 per batch.
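Before pasting, you can sanity-check a batch locally. The sketch below mimics the rules described above (comment lines, auto-assigned `ligand_n` names, duplicate rejection, per-plan limits). It is not Dock's parser: the function name, the handling of blank lines, and duplicate detection by exact string match are assumptions, and the platform may compare SMILES differently.

```python
# Plan limits copied from the text above; everything else is a local sketch.
PLAN_LIMITS = {"free": 3, "student": 30, "research": 100, "lab": 300}


def parse_smiles_batch(text, plan="student"):
    """Return {ligand_n: smiles} for a pasted batch (hypothetical helper)."""
    limit = PLAN_LIMITS[plan]
    ligands = {}
    seen = set()
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # assumption: blank lines are skipped like comments
        if line in seen:
            # assumption: duplicates are detected by exact string match
            raise ValueError(f"duplicate SMILES: {line}")
        seen.add(line)
        ligands[f"ligand_{len(ligands) + 1}"] = line
    if len(ligands) > limit:
        raise ValueError(
            f"{len(ligands)} ligands exceeds the {plan} limit of {limit}"
        )
    return ligands


batch = """# SAR set (example)
CCO
CC(=O)Oc1ccccc1C(=O)O
"""
print(parse_smiles_batch(batch))
```

Keeping the returned mapping in your notebook also gives you the `ligand_n` to structure lookup you will need when writing the report.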
Step 2 — One receptor, one box, one job
Virtual screening here means one rigid receptor, identical box and pH for every ligand — not rotating PDBs per compound.
- Upload PDB once; select chain if prompted.
- Define box from co-crystal ligand (holo) or literature / predicted pocket (apo).
- Review setup — free validation of protein, box, and every ligand prep.
- Enable redock on holo structures; proceed only if co-crystal RMSD is plausible (≤ ~2 Å).
- Submit one batch job — Vina reuses receptor maps for throughput.
Do not run 30 separate single-ligand jobs unless your instructor requires it — you risk inconsistent boxes and wasted credits.
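If you want to check the redock criterion yourself, the in-place RMSD between crystal and redocked ligand coordinates can be sketched as below. This assumes both coordinate lists contain the same heavy atoms in the same order and applies no symmetry correction, so it may differ from the value Dock reports. The function names and the example coordinates are hypothetical.

```python
import math


def in_place_rmsd(ref_coords, docked_coords):
    """RMSD in Å without superposition: both poses share the receptor frame."""
    if not ref_coords or len(ref_coords) != len(docked_coords):
        raise ValueError("coordinate lists must be non-empty and equal length")
    total = 0.0
    for (x1, y1, z1), (x2, y2, z2) in zip(ref_coords, docked_coords):
        total += (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
    return math.sqrt(total / len(ref_coords))


def redock_passes(ref_coords, docked_coords, cutoff=2.0):
    """Apply the ~2 Å plausibility threshold from the checklist above."""
    return in_place_rmsd(ref_coords, docked_coords) <= cutoff


# Placeholder coordinates, not taken from any real structure.
crystal = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
redocked = [(0.5, 0.0, 0.0), (2.0, 0.0, 0.0), (2.0, 1.5, 0.0)]
print(in_place_rmsd(crystal, redocked), redock_passes(crystal, redocked))
```

No alignment step is applied on purpose: superposing the ligands first would hide a pose that is internally correct but placed in the wrong part of the pocket.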
Step 3 — Budget Dock credits
Formula: 1 credit for 1–3 ligands; otherwise ceil(ligands ÷ 5), plus 0.5 if optional PyMOL publication render is on.
| Ligands | Credits (no PyMOL) | With PyMOL (+0.5) |
|---|---|---|
| 10 | 2 | 2.5 |
| 20 | 4 | 4.5 |
| 25 | 5 | 5.5 |
| 30 | 6 | 6.5 |
| 40 | 8 | 8.5 |
| 50 | 10 | 10.5 |
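The formula is simple enough to script when planning several screens. This is a minimal sketch of the stated rule; the function name is hypothetical, and the 1–300 range check reflects the platform cap mentioned earlier.

```python
import math


def credits_for_job(n_ligands, pymol_render=False):
    """Credits for one batch job: 1 for 1-3 ligands, else ceil(n / 5), +0.5 for PyMOL."""
    if not 1 <= n_ligands <= 300:
        raise ValueError("a job takes 1 to 300 ligands")
    base = 1 if n_ligands <= 3 else math.ceil(n_ligands / 5)
    return base + (0.5 if pymol_render else 0)


# Reproduce the rows of the table above.
for n in (10, 20, 25, 30, 40, 50):
    print(n, credits_for_job(n), credits_for_job(n, pymol_render=True))
```

Note that 4 or 5 ligands still cost 1 credit under this rule, and that the cost steps up at every multiple of 5 plus one (6, 11, 16, …), so trimming a 26-ligand list to 25 saves a credit.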
Plans and one-time packs
- Free: 1 credit/month, max 3 ligands/job — pilot 2–3 analogs, then upgrade.
- Student ($12/mo): 10 credits/month, batch up to 30 ligands — typical semester workflow (e.g. two screens of ~25).
- Screening Pack ($19.99 once): 8 credits — sized for ~40 ligands in one project without subscription.
- Single Run ($7.99): 2 credits — small homework (≤10 ligands without PyMOL).
Pricing details: home page pricing. Credits refund only if the service fails to deliver report/ZIP — not when individual ligands fail to dock.
Step 4 — After the run: build the SAR table
Rank rows by affinity_best; discuss chemistry for the top 3–5, not all 30 rows in prose.
Download results.zip immediately: storage retention is 7 days on all tiers (signed download links match this window).
Recommended columns
| Column | Why include it |
|---|---|
| Compound ID | ligand_n or SDF name — map to your drawn structures |
| Affinity (kcal/mol) | affinity_best — rank within this screen only |
| Pose confidence | pose_confidence_label — flag ambiguous winners |
| Key interaction | From PLIP / binding_description — SAR narrative |
| Lipinski / Veber | From admet JSON — developability aside, not binding proof |
| Status | Include failed ligands with error hint — academic honesty |
Deep dive on columns: interpreting docking results.
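A table with these columns can be assembled with a few lines of Python once the per-ligand results are loaded. The sketch below assumes each ligand is a dict; `affinity_best` and `pose_confidence_label` are the field names mentioned above, while the `id` and `status` keys, the label values, and the numbers in the example are placeholders, not real output or the actual file layout of the ZIP.

```python
def build_sar_table(rows):
    """Sort docked ligands by affinity (most negative first); failures go last."""
    docked = [r for r in rows if r.get("affinity_best") is not None]
    failed = [r for r in rows if r.get("affinity_best") is None]
    docked.sort(key=lambda r: r["affinity_best"])
    return docked + failed


def to_markdown(rows):
    """Render the sorted rows as a markdown table for the report."""
    lines = [
        "| Compound ID | Affinity (kcal/mol) | Pose confidence | Status |",
        "|---|---|---|---|",
    ]
    for r in rows:
        aff = r.get("affinity_best")
        aff_text = "n/a" if aff is None else f"{aff:.1f}"
        conf = r.get("pose_confidence_label", "n/a")
        lines.append(f"| {r['id']} | {aff_text} | {conf} | {r.get('status', 'ok')} |")
    return "\n".join(lines)


# Placeholder rows for illustration only.
rows = [
    {"id": "ligand_1", "affinity_best": -6.1, "pose_confidence_label": "high", "status": "ok"},
    {"id": "ligand_2", "affinity_best": None, "status": "failed: embedding"},
    {"id": "ligand_3", "affinity_best": -7.3, "pose_confidence_label": "ambiguous", "status": "ok"},
]
print(to_markdown(build_sar_table(rows)))
```

Failed ligands are kept at the bottom rather than dropped, which matches the Status column's purpose.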
Turning ranks into SAR sentences
Good Discussion links structure to score and interactions:
- “Methyl at R2 (ligand_07) improved affinity vs unsubstituted (ligand_04) while maintaining the Asp189 H-bond.”
- “ortho-Chloro analog (ligand_21) lost the H-bond and failed PoseBusters clash checks despite −6.4 kcal/mol.”
- “Flat SAR plateau (ligands 8–14 within 0.3 kcal/mol) suggests the pocket is insensitive to distal ring changes.”
Weak Discussion: “ligand_12 had the best score.” Strong Discussion: explains why substituents help or hurt in the pocket you visualized.
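One way to move from ranks to substituent statements is to express each analog's score relative to the parent compound. A minimal sketch follows; the function name is hypothetical and the affinities are placeholder values, not results.

```python
def delta_vs_reference(affinities, reference_id):
    """Return {ligand: affinity minus reference affinity} in kcal/mol.

    Negative values mean the analog scored better than the reference.
    """
    ref = affinities[reference_id]
    return {
        name: round(value - ref, 2)
        for name, value in affinities.items()
        if name != reference_id
    }


# Placeholder affinities (kcal/mol) keyed by ligand_n.
affinities = {"ligand_04": -6.0, "ligand_07": -6.8, "ligand_21": -6.4}
print(delta_vs_reference(affinities, "ligand_04"))
```

A delta alone is still a weak sentence; pair it with the interaction that changed (or did not) in the pose, and treat small differences with caution.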
Step 5 — Figures instructors expect
- Binding site overview: PNG from ZIP figures/{ligand}/ or PyMOL overlay of top hit.
- 2D interaction diagram: auto-generated; verify residue numbers against complex PDB.
- SAR grid (optional): structures of top 3–5 aligned by scaffold, not all 30.
- Redock figure (if holo): crystal vs redocked ligand overlay, RMSD in caption.
Optional PyMOL render (+0.5 credit/job) gives a publication-style binding-site PNG for the batch top hit — cite as a render, not new data.
Step 6 — Methods and limitations (paste-ready)
A virtual screen of [N] analogs was performed against [target] (PDB [ID], chain [X], rigid receptor, pH 7.4) using AutoDock Vina via Dock (exhaustiveness 8, top three poses exported per ligand). The search grid was centered on [co-crystal ligand / residue anchors / predicted pocket rank 1] with dimensions 20×20×20 Å. Ligands were prepared from SMILES (dimorphite_dl, RDKit ETKDG, MMFF94, Meeko PDBQT). Hits were ranked by Vina affinity (kcal/mol) and reviewed with PLIP interaction analysis and PoseBusters pose QC. Results are computational estimates for SAR discussion; no experimental binding or functional assay was performed.
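To fill in the bracketed grid details, it helps to know how a box centered on a co-crystal ligand is typically derived. The sketch below takes the centroid of the ligand's coordinates and writes the values in AutoDock Vina's config-file style (`center_x`, `size_x`, `exhaustiveness`, `num_modes`). It illustrates the idea only: how Dock computes its box internally is not documented here, mapping "top three poses exported" to `num_modes = 3` is an assumption, and the helper names and coordinates are hypothetical.

```python
def box_from_ligand(coords, size=(20.0, 20.0, 20.0)):
    """Center the search box on the centroid of the co-crystal ligand atoms."""
    n = len(coords)
    if n == 0:
        raise ValueError("need at least one ligand atom")
    center = tuple(sum(atom[i] for atom in coords) / n for i in range(3))
    return center, size


def vina_config(center, size, exhaustiveness=8, num_modes=3):
    """Format box and search settings as Vina config-file lines."""
    axes = ("x", "y", "z")
    lines = [f"center_{a} = {c:.3f}" for a, c in zip(axes, center)]
    lines += [f"size_{a} = {s:.1f}" for a, s in zip(axes, size)]
    lines.append(f"exhaustiveness = {exhaustiveness}")
    lines.append(f"num_modes = {num_modes}")  # assumption, see lead-in
    return "\n".join(lines)


# Placeholder ligand coordinates in Å, not from a real PDB entry.
ligand_atoms = [(10.0, 4.0, -2.0), (12.0, 6.0, 0.0), (11.0, 5.0, -1.0)]
center, size = box_from_ligand(ligand_atoms)
print(vina_config(center, size))
```

Reporting the center coordinates alongside the 20×20×20 Å dimensions makes the Methods paragraph reproducible by someone who only has the PDB file.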
Common mistakes (lost marks)
| Mistake | Fix |
|---|---|
| Docking before redock passes | Validate holo workflow first |
| Different boxes per ligand | Single batch job, one manifest |
| Reporting only top 10 of 30 | Report every ligand; footnote failures and ambiguous poses |
| Claiming “best drug candidate” | Say “top-ranked computational hit” |
| No 3D figure | Overlay at least two hits + reference |
| Waiting until day 8 to download ZIP | Files expire after 7 days |
| Pasting CSV with names — ignored | Use SDF titles or post-hoc map ligand_n to your notebook |
Pilot → full screen strategy (save credits)
- Free / 3 ligands: test receptor, box, and SMILES parsing with 2 analogs + reference.
- Review setup (0 credits): fix chain, TORSDOF warnings, apo pocket disclaimer.
- Redock (included in validation run): confirm pipeline on co-crystal.
- Full batch: spend credits once on 20–40 lines of SMILES.
- Iterate ligands only: cached protein validation in-session — adjust SMILES without re-uploading PDB.
When to stop screening and start writing
If redock fails, if >30% of ligands fail embedding, or if every analog scores within 0.2 kcal/mol, more compounds will not fix a broken setup. Stop, fix prep (guide), rerun a smaller pilot, then scale up.
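These three stop conditions can be written as a quick checklist function. This is a sketch under assumptions: "redock fails" is read as an RMSD above the ~2 Å threshold from Step 2, `None` stands for an apo structure with no redock, and the function and argument names are hypothetical.

```python
def setup_problems(redock_rmsd, n_submitted, n_failed, affinities):
    """Return a list of reasons to stop adding compounds and fix the setup."""
    problems = []
    if redock_rmsd is not None and redock_rmsd > 2.0:
        problems.append("redock RMSD above ~2 Å")
    if n_submitted and n_failed / n_submitted > 0.30:
        problems.append("more than 30% of ligands failed embedding")
    # round() guards against float noise, e.g. -6.3 - (-6.5) != 0.2 exactly
    if len(affinities) >= 2 and round(max(affinities) - min(affinities), 3) <= 0.2:
        problems.append("all analogs score within 0.2 kcal/mol")
    return problems


# Placeholder numbers for illustration only.
print(setup_problems(1.2, 30, 2, [-7.1, -6.2, -5.5]))
print(setup_problems(3.1, 30, 12, [-6.5, -6.4, -6.3]))
```

An empty list means none of the three conditions triggered; it does not by itself validate the setup.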
Next: step-by-step docking · docking online overview · Vina online tools.