Proteins can be remarkably
dynamic, and, as we noted recently, different conformational states can reveal
different pockets for small-molecule ligands. But how can one survey and
categorize all the possibilities? In a recent J. Chem. Inf. Model. paper,
Doeke Hekstra and colleagues at Harvard University present a new tool for doing
so.
High-throughput crystallographic
fragment screens are becoming faster and more widely accessible, and the
researchers wondered whether the information from these screens could be used
to map protein conformational landscapes. To do so, they built a Python program
called COLAV, short for COnformational LAndscape Visualization. This
open-source tool can compile data from hundreds of protein coordinate files and
then, for each structure, calculate the backbone dihedral angles,
the pairwise distances between the alpha-carbon atoms, and the strain.
To a first approximation,
dihedral angles capture local movements, while distances between alpha-carbons capture
global movements, such as the distance between the N-terminus and C-terminus.
Strain measurements are also local but can reveal particularly important features
such as hinge movements. Also, while dihedral angles and pairwise distances can be
calculated for single proteins, strain measurements are calculated after first
aligning multiple structures.
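As a rough illustration of the first two representations, the sketch below computes a signed dihedral angle from four atom positions and the pairwise distances between a list of alpha-carbon coordinates, using only the Python standard library. It is not COLAV's code: the function names `dihedral` and `pairwise_distances` and the toy coordinates are assumptions made for this example, and strain is left out because it requires aligned structures.

```python
import math
from itertools import combinations


def _sub(a, b):
    """Component-wise a - b for 3D points."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) defined by four consecutive atoms."""
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n2 = _cross(b2, b3)
    # atan2 form: stable for all angles and returns a value in (-180, 180].
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(_cross(b1, b2), n2)
    return math.degrees(math.atan2(y, x))


def pairwise_distances(ca_coords):
    """Distances between every pair of alpha-carbons (i < j), in input order."""
    return [math.dist(a, b) for a, b in combinations(ca_coords, 2)]


# Hypothetical toy coordinates, not taken from any real structure.
atoms = [(1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 1.0)]
print(dihedral(*atoms))

ca = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 0.0, 12.0)]
print(pairwise_distances(ca))
```

A protein with N residues gives N(N-1)/2 alpha-carbon distances, so the distance representation grows much faster with protein size than the dihedral one.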
Having calculated these three parameters
for individual protein structures, COLAV can compare them across the selected
set of structures using principal component analysis (PCA). These comparisons
can reveal clusters with similar dihedral angles, pairwise distances, or
strain.
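To make the PCA step concrete, here is a minimal sketch that treats each structure as one row of features (for example, a list of dihedral angles or distances), centers the columns, and finds the first principal component by power iteration. This is a standard-library toy, not how COLAV itself does it; the function names, the starting vector, and the iteration count are assumptions, and a real analysis would more likely use NumPy or scikit-learn and keep several components.

```python
import math


def center_columns(rows):
    """Subtract each feature's mean so PCA describes variation, not offsets."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def first_principal_component(rows, n_iter=200):
    """Return (unit direction, per-structure scores) for the leading component.

    Power iteration on X^T X, applied as X^T (X v) so the covariance
    matrix is never formed explicitly.
    """
    X = center_columns(rows)
    n_features = len(X[0])
    # Deterministic start vector (an arbitrary choice for this sketch); it
    # only needs to have some overlap with the leading component.
    v = [1.0 / (k + 1) for k in range(n_features)]
    norm = math.sqrt(sum(w * w for w in v))
    v = [w / norm for w in v]
    for _ in range(n_iter):
        scores = [sum(x * w for x, w in zip(row, v)) for row in X]
        new_v = [
            sum(s * row[k] for s, row in zip(scores, X))
            for k in range(n_features)
        ]
        norm = math.sqrt(sum(w * w for w in new_v))
        if norm == 0.0:  # no variance along v; nothing to iterate on
            break
        v = [w / norm for w in new_v]
    scores = [sum(x * w for x, w in zip(row, v)) for row in X]
    return v, scores


# Hypothetical feature rows: two "structures" near one state, two near another.
features = [
    [0.0, 0.1, 0.0],
    [0.1, 0.0, 0.1],
    [5.0, 5.1, 5.0],
    [5.1, 5.0, 4.9],
]
direction, scores = first_principal_component(features)
print(scores)
```

In this toy input the two pairs of rows land on opposite sides of zero along the first component, which is the kind of separation that shows up as distinct clusters in a PCA plot.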
The researchers provide two case
studies. The first is the metabolic disease target PTP1B, which we recently wrote
about here. This enzyme has been pursued intensively for decades, so the
researchers were able to draw on 163 individual protein structures deposited in
the Protein Data Bank (PDB) as well as 187 structures from a high-throughput
crystallographic fragment screen. PTP1B contains two flexible loops, each of which
adopts one of two conformations, and COLAV successfully segregated all 350
structures into four clusters. Importantly, these four clusters were found whether
the structures were pulled from the PDB (representing experiments conducted across
multiple labs and years) or from the fragment screen, suggesting that a single crystallographic
fragment screen can identify most or all of the conformational states available
to a protein. This is particularly impressive given that most of the fragments
bound in allosteric sites while most of the ligands found in the PDB bound in the active site.
Next, the researchers turned to
the main protease (MPro) of SARS-CoV-2, the subject of intense and successful drug
discovery efforts. They used 656 structures from the PDB and 631 structures
from high-throughput crystallographic screens to perform COLAV analyses. In contrast
to PTP1B, MPro showed no discrete conformational clusters; rather, a continuous
band was seen, suggesting that the protein can assume myriad conformations.
Here too, though, the fragment screens were able to sample most of the
conformations observed in the PDB.
The fact that a single high-throughput
crystallographic screen can capture the conformations seen in hundreds of hard-won
discrete protein-ligand crystal structures is encouraging, though of course the
paper only describes two case studies. Also, as the researchers note, any
conformation that cannot be crystallized is not sampled. Since COLAV is free to
use, it will be fun to see it applied to other proteins.