Does Molecular Docking Need a Crystal Structure? (Holo, Apo, and Predicted Pockets)

Dock Team · Published on 6/4/2026 · 12 min read

If your assignment says “use a crystallized structure,” that almost never means you must grow crystals in your own lab. It means you need a defensible 3D receptor and a binding site you can justify — usually from the RCSB PDB, sometimes from AlphaFold or a homolog. This guide unpacks the six sub-questions students actually ask on forums and in office hours, with data on when apo structures fail and how to write honest methods.

Short answer (read this first)

No — you do not need your own crystal structure. For most coursework and early virtual screening you need:

  1. A receptor model with coordinates you can cite (PDB ID, AlphaFold accession, or homology template).
  2. A binding site definition (co-crystal ligand, literature residues, or a clearly labeled predicted pocket).
  3. A sanity check that your setup can reproduce a known pose when one exists (redock RMSD ≤ ~2 Å is the usual teaching cutoff).

What you choose — holo (ligand-bound), apo (ligand-free), or predicted — changes pose accuracy more than most students expect, because standard AutoDock Vina keeps the protein rigid.

What people really mean by “need a crystal structure”

On Reddit, Chemistry Stack Exchange, and course Slack channels, “do I need a crystal structure for docking?” collapses into several distinct questions. Answer the one your TA is actually asking:

| What they ask | What they worry about | Practical answer |
| --- | --- | --- |
| “Must it be experimentally determined?” | Using AlphaFold feels like cheating | Many rubrics allow PDB or AF if you cite the source and state limitations; some require “X-ray only,” so read the rubric. |
| “Do I need the ligand still in the PDB file?” | Downloading apo by mistake | Prefer a holo entry for the same target if available; use the co-crystal ligand to place the box, then remove it for analog docking if required. |
| “Can I use the apo structure?” | Only apo is deposited | Yes, with literature-defined or predicted pockets, but expect worse redock statistics; discuss induced fit in limitations. |
| “Is homology modeling OK?” | No PDB for their protein | Acceptable for hypothesis-building; template identity and binding-site alignment must be discussed; compare to a holo homolog when possible. |
| “My instructor said ‘use PDB’: which file?” | Dozens of entries per target | Filter by holo, resolution, organism, ligand similarity, and missing residues in the pocket (checklist below). |
| “Redock failed. Is my structure wrong?” | RMSD > 2 Å on the native ligand | Often wrong box, protonation, or apo pocket geometry, not always a “wrong PDB,” but a holo or refined receptor may be required. |

Holo vs apo: why it matters for rigid docking

[Figure: schematic comparing a holo structure (open binding pocket with co-crystallized ligand) with an apo structure (side chains occluding the pocket).]
Holo structures already encode a ligand-compatible pocket shape; apo structures may need side-chain motion your rigid receptor model cannot provide.

Holo = protein structure determined with a bound ligand (co-crystal, soaked, or covalently linked). Apo = same or related protein without that ligand in the coordinate file. Binding is often accompanied by local conformational change — side-chain rotamers, loop shifts, even backbone moves. Vina does not sample those motions unless you use specialized flexible-receptor workflows.

Classic virtual-screening guidance: when holo conformers exist, prefer them; apo pockets often have side chains protruding into the cavity, which hurts docking and enrichment (Kontoyianni et al., 2008).

How much worse is apo? (published redock numbers)

Gunaydin and Atilgan systematically compared native-ligand redocking with AutoDock Vina on holo crystals, unrefined apo structures, and MD-refined apo binding sites (J. Chem. Inf. Model. 2022). Average ligand RMSD after aligning binding-site residues:

[Figure: bar chart of mean redock RMSD on the DUD-E subset: holo self-dock about 1.34 Å, unrefined apo about 3.65 Å, MD-refined apo about 1.97 Å.]
DUD-E subset (40 targets). A pose within 2 Å RMSD of the crystal ligand is a common “correct pose” threshold in the literature.
  • Holo self-dock: ~1.34 Å (DUD-E), ~1.36 Å (Gunasekaran set, n=84) — your teaching-lab target zone.
  • Apo without refinement: ~3.65 Å (DUD-E), ~2.90 Å (Gunasekaran) — many poses would fail a 2 Å cutoff.
  • Apo after binding-site MD refinement: ~1.97 Å — approaches holo-like performance but adds work beyond a standard homework Vina run.

Takeaway for students: if a holo PDB exists for your target, use it unless the assignment forces apo analysis. If you must use apo, say explicitly that rigid docking may underestimate binding-site rearrangement and report redock results on a reference ligand when available.
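To make the comparison concrete, the short sketch below takes the DUD-E mean RMSD values quoted above and reports each one relative to the holo baseline and to the 2 Å cutoff. The function name and dictionary layout are illustrative assumptions. Note that a mean below the cutoff does not imply that every individual pose passes.

```python
# Mean redock RMSD values (in Å) quoted above for the DUD-E subset.
MEAN_RMSD = {
    "holo self-dock": 1.34,
    "apo, unrefined": 3.65,
    "apo, MD-refined site": 1.97,
}
CUTOFF = 2.0  # common "correct pose" threshold in Å


def summarize(mean_rmsd, cutoff=CUTOFF, baseline="holo self-dock"):
    """Compare each mean RMSD to the holo baseline and to the cutoff."""
    base = mean_rmsd[baseline]
    rows = []
    for label, value in mean_rmsd.items():
        rows.append({
            "receptor": label,
            "mean_rmsd": value,
            "ratio_to_holo": round(value / base, 2),
            "mean_within_cutoff": value <= cutoff,
        })
    return rows


for row in summarize(MEAN_RMSD):
    print(row)
```

On these numbers the unrefined apo mean is roughly 2.7 times the holo mean, while the MD-refined mean sits just under the cutoff.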

Three structure scenarios (ranked for coursework)

1. Holo experimental structure (gold standard)

Use when: introductory labs, SAR series in one pocket, instructors who say “download from PDB with the inhibitor bound.”

  • Center the grid on the co-crystal ligand (or a known allosteric ligand in the same file).
  • Redock the native ligand before screening analogs — if top pose RMSD is consistently > 2 Å, fix box size, protonation pH, chain ID, or try another PDB conformer before docking 30 analogs.
  • Keep box tight (roughly 20–25 Å per side for drug-like ligands); oversized boxes increase false positives (exhaustiveness & box-size study).
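One way to center a tight box on the co-crystal ligand is to read the ligand's HETATM coordinates from the PDB file and pad their extent. The sketch below is a minimal version under stated assumptions: standard fixed-column PDB records, a ligand identified by its three-letter residue name, and hypothetical `padding` and `min_side` defaults that you should tune to your rubric.

```python
def ligand_coords(pdb_text, resname):
    """Collect (x, y, z) for HETATM records with the given residue name."""
    coords = []
    for line in pdb_text.splitlines():
        if line.startswith("HETATM") and line[17:20].strip() == resname:
            coords.append((float(line[30:38]),
                           float(line[38:46]),
                           float(line[46:54])))
    return coords


def box_from_ligand(coords, padding=5.0, min_side=20.0):
    """Return (center, size) of a box around the ligand extent.

    padding and min_side are illustrative defaults, not Vina requirements.
    """
    if not coords:
        raise ValueError("no ligand atoms found")
    center, size = [], []
    for axis in range(3):
        values = [c[axis] for c in coords]
        lo, hi = min(values), max(values)
        center.append(round((lo + hi) / 2.0, 3))
        size.append(round(max(min_side, (hi - lo) + 2.0 * padding), 3))
    return tuple(center), tuple(size)


# Hypothetical ligand coordinates in Å; use ligand_coords() on a real file.
demo = [(10.0, 20.0, 30.0), (14.0, 26.0, 31.0), (12.0, 22.0, 44.0)]
center, size = box_from_ligand(demo)
print(center)  # (12.0, 23.0, 37.0)
print(size)    # (20.0, 20.0, 24.0)
```

If any side comes out well above 25 Å, that is a hint the ligand is unusually large or the residue name matched more than one copy of the ligand.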

2. Apo crystal + literature-defined binding site

Use when: structural biology papers map the active site on an apo conformation (e.g. catalytic triad, cofactor groove, allosteric helix).

  • Cite the residue list or figure from the primary paper — not a random PyMOL surface.
  • Include a figure showing the box enclosing those residues.
  • Limitations paragraph: rigid receptor; apo side chains may not represent ligand-bound rotamers.
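Before drawing the figure, it can help to confirm numerically that the box really encloses the cited residues. The following sketch assumes you have already extracted atom coordinates for each literature-defined residue; the residue labels and coordinates shown are hypothetical placeholders.

```python
def residues_outside_box(residue_coords, center, size):
    """Return labels of residues that have at least one atom outside the box."""
    half = [s / 2.0 for s in size]
    outside = []
    for label, atoms in residue_coords.items():
        for atom in atoms:
            if any(abs(atom[i] - center[i]) > half[i] for i in range(3)):
                outside.append(label)
                break
    return outside


# Hypothetical site residues and coordinates (Å); replace with your own.
site = {
    "SER195": [(10.0, 12.0, 8.0), (11.2, 12.5, 8.4)],
    "HIS57": [(13.5, 10.0, 9.0)],
    "ASP102": [(24.0, 10.5, 9.5)],
}
center, size = (12.0, 11.0, 9.0), (20.0, 20.0, 20.0)
print(residues_outside_box(site, center, size))  # ['ASP102']
```

An empty list means every listed atom falls inside the box; anything else suggests recentering or a modest resize before docking.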

3. Apo + predicted pocket (exploratory / hypothesis)

Use when: no holo for your exact construct, exploratory toxicology target assignment, or “predict binding site” rubric items.

  • Locate the pocket with fpocket or P2Rank, or transfer the box from a holo homolog (same family, similar ligand chemotype).
  • Rank pockets by plausibility (conservation, druggability, literature) — not only by software score.
  • Frame results as computational hypotheses, not validation of biological binding.

Which structure should I pick? (decision flowchart)

[Figure: flowchart from reading the assignment rubric through holo PDB, apo with known site, predicted pocket, AlphaFold pLDDT check, and Review setup to the Vina run.]
Start from the rubric, not from whichever PDB downloads first.

PDB selection checklist (before you dock)

Students lose marks for a “random PDB choice.” Work through this list on the RCSB structure summary page:

| Check | Why it matters | Rule of thumb |
| --- | --- | --- |
| Holo vs apo | Pocket shape | Same target → pick holo with relevant ligand chemotype if available |
| Resolution | Side-chain trust in the site | X-ray < 2.5 Å preferred for coursework; inspect R-free / Ramachandran |
| Organism & sequence | Biological relevance | Human vs mouse vs bacterial: match your assignment story |
| Mutations / fusion tags | Artificial pocket | Avoid engineered constructs unless that is your system |
| Missing residues in pocket | Vina cannot invent loops | Gap in binding site → pick another conformer or state limitation |
| Biological assembly | Wrong oligomer | Download biological assembly if the active site is at an interface |
| Multiple models (NMR) | Arbitrary choice | Prefer X-ray holo when available; ensemble docking is advanced |
| Several holo structures | Conformational ensemble | Cross-dock or redock native ligands; pick structure with best RMSD at default exhaustiveness (8) |

When multiple holo PDBs exist, redocking the native ligand across conformers is a standard way to pick a receptor — structures with lower self-dock RMSD often perform better in pose prediction (CrossDocker / cross-docking literature).
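The checklist can be applied as a simple filter-then-rank step. The sketch below is one possible encoding, assuming you have filled in the fields by hand from the RCSB summary page and your own redock runs; the `Candidate` class, its field names, and the PDB IDs are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    pdb_id: str
    is_holo: bool
    resolution: float                    # Å
    pocket_gaps: int                     # missing residues in the binding site
    redock_rmsd: Optional[float] = None  # Å; None if not yet redocked


def shortlist(candidates, max_resolution=2.5):
    """Keep holo entries with a complete pocket, then rank by redock RMSD."""
    keep = [c for c in candidates
            if c.is_holo and c.resolution < max_resolution and c.pocket_gaps == 0]

    def key(c):
        rmsd = c.redock_rmsd if c.redock_rmsd is not None else float("inf")
        return (rmsd, c.resolution)

    return sorted(keep, key=key)


# Hypothetical entries for one target; the IDs are placeholders, not real PDB codes.
entries = [
    Candidate("XXX1", True, 1.8, 0, 1.2),
    Candidate("XXX2", True, 2.9, 0, 0.9),   # fails the resolution check
    Candidate("XXX3", False, 1.5, 0),       # apo
    Candidate("XXX4", True, 2.1, 0, 0.8),
    Candidate("XXX5", True, 1.9, 3, 1.0),   # gaps in the pocket
]
print([c.pdb_id for c in shortlist(entries)])  # ['XXX4', 'XXX1']
```

Entries that have not been redocked yet sort last, which keeps them visible without letting them outrank a structure with a measured RMSD.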

Worked example: HIV-1 protease (PDB 1HSG)

A concrete teaching example many labs reuse:

  1. Search RCSB for 1HSG: HIV-1 protease with indinavir (holo, classic med-chem target).
  2. Note dimer assembly: assignments often use one chain or the biological dimer, so match your rubric.
  3. Define the box from the co-crystal inhibitor; redock indinavir (or extract SMILES from the ligand) before docking your analog series.
  4. Expect good redock RMSD on holo protease if box and protonation are correct; if RMSD is poor, check whether you stripped waters/cofactors the rubric requires.
  5. Methods sentence: cite PDB 1HSG, resolution, ligand used for grid centering, Vina version, exhaustiveness, pH.

This pattern generalizes: holo PDB → box from co-crystal → redock → analog screen → discuss limitations.
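For readers running AutoDock Vina locally, the box and exhaustiveness from this pattern can be written to a plain-text config file and passed with `vina --config`. The sketch below renders such a file; the file names and box values are hypothetical placeholders, and only the standard config keys (`receptor`, `ligand`, `out`, `center_*`, `size_*`, `exhaustiveness`) are used.

```python
def vina_config(receptor, ligand, center, size, exhaustiveness=8,
                out="redock_out.pdbqt"):
    """Render a Vina config file as text from a box center and size in Å."""
    if any(s <= 0 for s in size):
        raise ValueError("box sizes must be positive")
    cx, cy, cz = center
    sx, sy, sz = size
    lines = [
        f"receptor = {receptor}",
        f"ligand = {ligand}",
        f"out = {out}",
        f"center_x = {cx:.3f}",
        f"center_y = {cy:.3f}",
        f"center_z = {cz:.3f}",
        f"size_x = {sx:.3f}",
        f"size_y = {sy:.3f}",
        f"size_z = {sz:.3f}",
        f"exhaustiveness = {exhaustiveness}",
    ]
    return "\n".join(lines) + "\n"


# Placeholder file names and box values; derive the real box from your ligand.
text = vina_config("receptor.pdbqt", "native_ligand.pdbqt",
                   center=(12.0, 23.0, 37.0), size=(20.0, 20.0, 24.0))
print(text)
```

Saving the text as, say, `redock.conf` and keeping it with your report gives you the exact box and exhaustiveness to quote in the methods sentence.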

Redock sanity check (non-negotiable on holo structures)

Redocking means docking the known co-crystal ligand back into its pocket. Community and textbook practice treats ligand heavy-atom RMSD ≤ 2 Å vs the crystal pose (after receptor alignment) as a successful pose recovery threshold.

  • Passes redock: your receptor prep, box, and protonation are plausible for analogs in the same site.
  • Fails redock on holo: do not batch-screen 40 analogs yet — troubleshoot box center/size, ionization, tautomers, chain, or try another PDB entry.
  • Fails redock on apo only: may be physics (pocket closed), not just “user error” — document and consider holo homolog or pocket refinement literature.
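The pass/fail decision rests on a heavy-atom RMSD, which is simple to compute once the crystal and docked ligand atoms are paired. The sketch below is a minimal version under three assumptions: both atom lists are in the same order, both are in the same receptor frame (true when the receptor was not moved between crystal and docking), and no symmetry correction is applied, so symmetric ligands may score worse than they deserve.

```python
import math


def heavy_atom_rmsd(ref_atoms, pose_atoms):
    """RMSD in Å over non-hydrogen atoms.

    Each atom is (element, x, y, z); the lists must share atom order.
    """
    ref = [a for a in ref_atoms if a[0].upper() != "H"]
    pose = [a for a in pose_atoms if a[0].upper() != "H"]
    if not ref or len(ref) != len(pose):
        raise ValueError("heavy-atom counts differ or are zero")
    total = 0.0
    for r, p in zip(ref, pose):
        if r[0].upper() != p[0].upper():
            raise ValueError("atom order mismatch between crystal and pose")
        total += sum((r[i] - p[i]) ** 2 for i in (1, 2, 3))
    return math.sqrt(total / len(ref))


# Toy coordinates: each heavy atom is shifted by 1 Å along z; the hydrogen is ignored.
crystal = [("C", 0.0, 0.0, 0.0), ("N", 1.0, 0.0, 0.0), ("H", 1.0, 1.0, 0.0)]
docked = [("C", 0.0, 0.0, 1.0), ("N", 1.0, 0.0, 1.0), ("H", 5.0, 5.0, 5.0)]
rmsd = heavy_atom_rmsd(crystal, docked)
print(f"{rmsd:.2f} Å, pass = {rmsd <= 2.0}")  # 1.00 Å, pass = True
```

For real ligands, a cheminformatics toolkit that matches atoms by graph and handles symmetry is the safer choice; this version is only meant to show what the number measures.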

On Dock, use Review setup (0 credits) to validate chains, pH, and the 3D box before spending credits on a full run.

AlphaFold and homology models

Predicted structures are valid for many exploratory assignments, but a high global pLDDT can be misleading: docking success correlates more with binding-site regional accuracy than with overall fold confidence (Staszic et al.; Koes et al.).

  • Inspect per-residue pLDDT in the pocket — disordered loops blocking the site (reported for some AF models) invalidate naive docking.
  • Compare AF pocket to a holo homolog: subtle side-chain differences can collapse virtual-screening enrichment even when backbone RMSD looks excellent.
  • Methods: cite AlphaFold DB accession, model version, rigid Vina, and that induced fit was not modeled.
  • If pocket pLDDT is weak, state that predictions are unreliable — do not claim “strong binding” from score alone.
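AlphaFold PDB files store per-residue pLDDT in the B-factor column, so the pocket check can be done with a few lines of parsing. The sketch below assumes fixed-column PDB records, a single chain ID, and a pocket residue list you supply yourself; the demo lines and residue numbers are hypothetical, and `_atom_line` exists only to build the demo input.

```python
def pocket_plddt(pdb_text, pocket_residues, chain="A"):
    """Summarize pLDDT (B-factor column) over CA atoms of the pocket residues."""
    wanted = set(pocket_residues)
    scores = {}
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue
        if line[12:16].strip() != "CA" or line[21] != chain:
            continue
        resseq = int(line[22:26])
        if resseq in wanted:
            scores[resseq] = float(line[60:66])
    if not scores:
        raise ValueError("no pocket residues found in this chain")
    values = list(scores.values())
    return {
        "min": min(values),
        "max": max(values),
        "mean": sum(values) / len(values),
        "missing": sorted(wanted - set(scores)),
    }


def _atom_line(serial, res, chain, resseq, plddt):
    """Demo-only helper: format one CA record in PDB fixed columns."""
    return (f"ATOM  {serial:5d}  CA  {res:>3s} {chain}{resseq:4d}    "
            f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}{1.0:6.2f}{plddt:6.2f}")


# Hypothetical model fragment: two confident residues and one low-confidence loop residue.
model = "\n".join([
    _atom_line(1, "ASP", "A", 25, 92.5),
    _atom_line(2, "GLY", "A", 27, 88.1),
    _atom_line(3, "ILE", "A", 50, 41.3),
])
stats = pocket_plddt(model, [25, 27, 50, 80])
print(stats["min"], stats["max"], stats["missing"])  # 41.3 92.5 [80]
```

The `min` and `max` values slot directly into a methods sentence, and a minimum well below 70 (the lower edge of AlphaFold's "confident" band) is a reasonable trigger for the limitation statement described above.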

When an experimental crystal structure is effectively required

  • Rubric explicitly requires X-ray or cryo-EM coordinates (no AlphaFold).
  • You claim publication-level structural biology or ligand pose validation without experiment.
  • Large induced-fit or domain motion is central to the hypothesis — rigid Vina on one static frame is insufficient; say so.
  • Binding site sits in a low-confidence predicted or unresolved region.
  • Regulatory or industrial QSAR where receptor provenance is audited.

Methods paragraphs you can adapt

The holo receptor was prepared from PDB entry 1HSG (HIV-1 protease, X-ray, 2.0 Å) with protonation at pH 7.4. The AutoDock Vina search space was centered on the co-crystallized inhibitor used to define the active site. The native ligand was redocked to verify pose recovery (top pose RMSD ≤ 2.0 Å vs crystal coordinates). Analogs were docked with a rigid receptor; exhaustiveness 8; top poses ranked by Vina affinity (kcal/mol).

No holo structure was available for [target]. The apo receptor PDB [ID] was used; the binding site was defined by residues [list] according to [Author, Year]. Pocket placement was verified by visual inspection. Results are presented as computational hypotheses because apo conformations may not represent ligand-bound side-chain conformations under a rigid receptor model.

The receptor model was obtained from the AlphaFold Protein Structure Database ([accession], model v2). Per-residue pLDDT in the binding site ranged from [min] to [max]. Rigid AutoDock Vina docking was performed without induced-fit sampling.

Troubleshooting: structure choice vs software bugs

| Symptom | Likely structure issue | What to try |
| --- | --- | --- |
| All poses outside pocket | Box off-target or apo pocket closed | Recenter on co-crystal or literature site; switch to holo PDB |
| Redock RMSD > 5 Å on holo | Wrong ligand protonation/tautomers or box too small/large | Review setup; raise exhaustiveness above the default of 8; verify ligand matches crystal chemistry |
| Good redock, nonsense analog poses | Analogs too large/branched for pocket | Check strain, 2D interactions, not just affinity rank |
| Only apo structures exist | Induced fit | Cite apo limitation; consider homolog holo for box definition |
| AF model “looks fine” but poor poses | Pocket loop or side-chain error | Compare binding-site pLDDT; overlay holo homolog in PyMOL/ChimeraX |

References & further reading

  • Gunaydin H, Atilgan AR. Holo protein conformation generation from apo structures by ligand binding site refinement. J. Chem. Inf. Model. 2022. doi:10.1021/acs.jcim.2c00895
  • Kontoyianni M, et al. Recipes for the selection of experimental protein conformations for virtual screening. J. Chem. Inf. Model. 2008. PMC2811216
  • Staszic P, et al. How good are AlphaFold models for docking-based virtual screening? PMC9852548
  • Koes DR, et al. Evaluation of AlphaFold2 structures as docking targets. PMC9794023
  • AutoDock Vina FAQ on accuracy and decoys: official FAQ

Run the workflow online (after you pick a structure)

  1. Download the PDB (or upload your prepared file) to Dock.
  2. Holo: define the box from the co-crystal ligand; apo: from literature or predicted pocket.
  3. Review setup (0 credits) — validate chains, protonation, and box in 3D.
  4. Redock when a reference ligand exists; then dock your SMILES/SDF batch.
  5. Download ZIP + PDF for tables and figures in your report.

Next reads: receptor and ligand preparation · step-by-step docking · interpreting affinity and poses.
