Last week we highlighted KinaFrag, a database of kinase-fragment
complexes. Continuing the theme, this week brings us CovPDB, a database of
high-resolution covalent protein-ligand structures. The database was described by
Stefan Günther and colleagues at Albert-Ludwigs-Universität Freiburg in an
open-access Nucleic Acids Res. paper earlier this year.
The researchers downloaded all structures from the Protein Data
Bank (PDB) as of 31 August 2020 and extracted those with covalently bound
ligands and a resolution of 2.5 Å or better. These were
then manually curated to remove cofactors (such as retinal) and crosslinkers. Next,
the chemical structures of the pre-reacted ligands were extracted from the
primary citations. Everything was then combined into an easy-to-use database,
and all the contents can also be downloaded.
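For the computationally inclined, the automated part of that filtering can be sketched in a few lines of Python. This is only an illustration of the criteria described above, not the authors' actual pipeline: the `Entry` record, its field names, and the one-item exclusion list are all assumptions made for the example (RET is the PDB ligand code for retinal), and the real curation was done by hand.

```python
from dataclasses import dataclass

# Resolution cutoff quoted in the post, in Å (lower is better).
MAX_RESOLUTION = 2.5

# Hypothetical exclusion list. The post names retinal as an example of a
# cofactor that was removed; the real list was curated manually.
EXCLUDED_LIGANDS = {"RET"}


@dataclass
class Entry:
    """Hypothetical record for one protein-ligand complex."""
    pdb_id: str
    ligand_id: str
    resolution: float  # in Å
    covalent: bool     # True if the ligand is covalently bound


def keep(entry: Entry) -> bool:
    """Return True if the entry passes all three filters."""
    return (
        entry.covalent
        and entry.resolution <= MAX_RESOLUTION
        and entry.ligand_id not in EXCLUDED_LIGANDS
    )


def filter_entries(entries: list[Entry]) -> list[Entry]:
    """Keep only covalent, high-resolution, non-cofactor complexes."""
    return [e for e in entries if keep(e)]
```

Keeping the cutoff and the exclusion list as named constants makes it easy to see how the size of the data set would change under stricter or looser criteria.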
CovPDB contains 2,294 unique protein-ligand complexes, with
733 different proteins and 1,501 different ligands. A total of 93 different
types of warheads are represented, from exotic (arsine oxide) to conventional
(vinyl carbonyl, including acrylamides). These are further grouped into 21 covalent
mechanisms.
As expected, covalent bonds to cysteine and serine are most
common, with 959 and 830 examples, respectively. Lysine, with 205 representatives,
is a distant third, but I was surprised that various unreactive amino acid
residues such as glycine, valine, and proline also showed up. Closer inspection
revealed that these are N-terminal residues; the ligand reacts with the free
amine. Though these sorts of bonds occur with several drugs, including carfilzomib and
voxelotor, it might be nice to have separate annotations to keep these from
being confused with residues that react exclusively at the side chain.
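As a rough idea of what such an annotation could look like, here is a minimal Python sketch that labels a covalent attachment by the protein atom involved. It assumes standard PDB atom naming, in which the backbone nitrogen is called N, and it assumes you already know which residue is first in the chain. The function name and the labels are made up for illustration and are not part of CovPDB.

```python
# Standard PDB names for backbone atoms (OXT is the C-terminal oxygen).
BACKBONE_ATOMS = {"N", "CA", "C", "O", "OXT"}


def classify_attachment(atom_name: str, residue_index: int,
                        first_residue_index: int) -> str:
    """Label where on the protein a covalent ligand is attached.

    atom_name           -- PDB name of the protein atom bonded to the ligand
    residue_index       -- position of that residue in the chain
    first_residue_index -- position of the chain's first (N-terminal) residue
    """
    if atom_name == "N" and residue_index == first_residue_index:
        return "N-terminal amine"
    if atom_name in BACKBONE_ATOMS:
        return "other backbone"
    return "side chain"
```

With a label like this, an N-terminal valine bonded through its free amine would be tagged separately from a cysteine bonded through its side-chain sulfur, even though both appear simply as "targeted residues".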
Browsing by ligand, protein, complex, warhead, covalent mechanism,
or targeted residue is straightforward, as is searching by multiple methods,
including ligand similarity and substructure. Each entry has its own page with
a wealth of information, including an interactive 3D viewer. Here’s the entry
for one of the Tethering hits that ultimately led to sotorasib.
CovPDB should be especially useful to computational folks looking to build models based on high-quality data, but it's also fun to browse for new ideas and inspiration.
A similar database is CovalentInDB (In=Inhibitor):
http://cadd.zju.edu.cn/cidb/