Integrated machine learning-based virtual screening and biological evaluation for identification of potential inhibitors against cathepsin K.
Molecular diversity
Cathepsin K is a cysteine proteinase that is expressed primarily in osteoclasts and plays a key role in the breakdown of bone matrix proteins during bone resorption. Many studies suggest that cathepsin K deficiency is accompanied by suppressed osteoclast function, which makes the resorptive activity of cathepsin K the most prominent target for osteoporosis. This work identified novel anti-osteoporotic candidates against cathepsin K by comparing machine learning-based and deep learning-based virtual screening, followed by biological evaluation of the shortlisted compounds. Of ten shortlisted compounds, five (JFD02945, JFD02944, RJC01981, KM08968 and SB01934) inhibited cathepsin K activity by more than 50% at a concentration of 0.1 μM and are considered to have a promising inhibitory effect. Docking, MD simulation, and MM/PBSA analyses support a stable and effective interaction of these compounds with cathepsin K that inhibits its function. Furthermore, RJC01981, KM08968 and SB01934 are proposed as promising anti-osteoporotic candidates for the management of osteoporosis owing to their favorable predicted ADMET properties.
10.1007/s11030-024-10845-5
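The ">50% inhibition at 0.1 μM" criterion in the abstract above can be illustrated with a minimal sketch. It assumes the common single-concentration definition of percent inhibition (blank-subtracted signal relative to an uninhibited control); the abstract does not state the exact formula used. The function name, the readings and the compound labels are hypothetical and are not data from the study.

```python
def percent_inhibition(signal_with_compound, signal_control, signal_blank=0.0):
    """Percent inhibition relative to an uninhibited control, after blank subtraction.

    Assumed formula (not taken from the paper):
        100 * (1 - (compound - blank) / (control - blank))
    """
    control = signal_control - signal_blank
    if control <= 0:
        raise ValueError("control signal must exceed the blank")
    treated = signal_with_compound - signal_blank
    return 100.0 * (1.0 - treated / control)


# Hypothetical readings in arbitrary fluorescence units (illustration only).
CONTROL, BLANK = 1000.0, 20.0
readings = {"compound_A": 420.0, "compound_B": 780.0}

THRESHOLD = 50.0  # the cut-off quoted in the abstract
inhibition = {
    name: percent_inhibition(value, CONTROL, BLANK)
    for name, value in readings.items()
}
hits = sorted(name for name, pct in inhibition.items() if pct > THRESHOLD)

for name, pct in inhibition.items():
    print(f"{name}: {pct:.1f}% inhibition")
print("hits:", hits)
```

With these made-up numbers only compound_A passes the 50% cut-off.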
Deciphering Cathepsin K inhibitors: a combined QSAR, docking and MD simulation based machine learning approaches for drug design.
SAR and QSAR in environmental research
Cathepsin K (CatK), a lysosomal cysteine protease, contributes to skeletal abnormalities, heart diseases, lung inflammation, and central nervous system and immune disorders. Current CatK inhibitors are associated with severe adverse effects, which limits their clinical utility. This study explores quantitative structure-activity relationships (QSAR) on a dataset of 1804 CatK inhibitors compiled from the ChEMBL database in order to predict inhibitory activity. After data cleaning and pre-processing, 1568 structures were selected for exploratory data analysis, which revealed the physicochemical properties, their distributions and the statistical significance of differences between the two groups of inhibitors. PubChem fingerprints were computed and used to build 11 different machine-learning classification models. In the comparative analysis the ET model performed well, with accuracy values for the training set (0.999), cross-validation (0.970) and test set (0.977), in line with OECD guidelines. Moreover, to gain structural insight into the origin of CatK inhibition, 15 diverse molecules were selected for molecular docking. CatK inhibitors 1 and 2 exhibited strong binding energies of -8.3 and -7.2 kcal/mol, respectively. MD simulation (300 ns) showed strong structural stability, flexibility and interactions in the selected complexes. This synergy between QSAR, docking, MD simulation and machine learning models strengthens the evidence for developing novel and resilient CatK inhibitors.
10.1080/1062936X.2024.2405626
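The workflow in the abstract above (binary fingerprints, a classifier, and accuracy reported under cross-validation) can be sketched in a few lines. This is not the authors' pipeline: to stay self-contained it swaps the ET model for a 1-nearest-neighbour classifier based on Tanimoto similarity, and the fingerprints and labels are toy values invented for illustration. All function names are hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto similarity between two fingerprints given as sets of on-bit indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def predict_1nn(train, query):
    """Return the label of the training fingerprint most similar to the query."""
    best = max(train, key=lambda item: tanimoto(item[0], query))
    return best[1]


def accuracy(train, test):
    """Fraction of test fingerprints whose predicted label matches the true label."""
    correct = sum(predict_1nn(train, fp) == label for fp, label in test)
    return correct / len(test)


def k_fold_accuracy(data, k):
    """Mean accuracy over k folds; fold i holds out items i, i+k, i+2k, ..."""
    scores = []
    for i in range(k):
        test = data[i::k]
        train = [item for j, item in enumerate(data) if j % k != i]
        scores.append(accuracy(train, test))
    return sum(scores) / k


# Toy fingerprints (sets of on-bit positions) with made-up activity labels.
data = [
    (frozenset({1, 2, 3, 4}), "active"),
    (frozenset({1, 2, 3, 5}), "active"),
    (frozenset({2, 3, 4, 6}), "active"),
    (frozenset({1, 3, 4, 7}), "active"),
    (frozenset({10, 11, 12}), "inactive"),
    (frozenset({10, 11, 13}), "inactive"),
    (frozenset({11, 12, 14}), "inactive"),
    (frozenset({10, 12, 15}), "inactive"),
]

cv_accuracy = k_fold_accuracy(data, k=4)
print(f"4-fold cross-validated accuracy: {cv_accuracy:.3f}")
```

The toy classes share no bits, so the held-out accuracy is trivially perfect here; on real data the gap between training, cross-validation and test accuracy is what indicates over-fitting.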
S3 to S3' subsite specificity of recombinant human cathepsin K and development of selective internally quenched fluorescent substrates.
Alves Marcio F M,Puzer Luciano,Cotrin Simone S,Juliano Maria Aparecida,Juliano Luiz,Brömme Dieter,Carmona Adriana K
The Biochemical journal
We have systematically examined the S3 to S3' subsite substrate specificity requirements of cathepsin K using internally quenched fluorescent peptides derived from the lead sequence Abz-KLRFSKQ-EDDnp [where Abz is o-aminobenzoic acid and EDDnp is N-(2,4-dinitrophenyl)ethylenediamine]. We assayed six series of peptides, in which each position except Gln was substituted with various natural amino acids. The results indicated that the S3-S1 subsite requirements are more restricted than those of S1'-S3'. Cathepsin K preferentially accommodates hydrophobic amino acids with aliphatic side chains (Leu, Ile and Val) in the S2 site. Modifications at P1 residues also have a large influence on cathepsin K activity. Positively charged residues (Arg and Lys) represent the best accepted amino acids in this position, although a particular preference for Gly was found as well. Subsite S3 accepted preferentially basic amino acids such as Lys and Arg. A broad range of amino acids was accommodated in the remaining subsites. We further explored the acceptance of a Pro residue in the P2 position by cathepsin K in order to develop specific substrates for the enzyme. Two series of peptides with the general sequences Abz-KXPGSKQ-EDDnp and Abz-KPXGSKQ-EDDnp (where X denotes the position of the amino acid that is altered) were synthesized. The substrates Abz-KPRGSKQ-EDDnp and Abz-KKPGSKQ-EDDnp were cleaved by cathepsin K at the Arg-Gly and Gly-Ser bonds respectively, and have been shown to be specific for cathepsin K when compared with other lysosomal cysteine proteases such as cathepsins L and B and with the aspartyl protease cathepsin D.
10.1042/BJ20030438
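The P/P' positions discussed in the abstract above follow the standard Schechter-Berger convention: residues are numbered outward from the scissile bond, P1, P2, P3 toward the N-terminus and P1', P2', P3' toward the C-terminus, and each occupies the matching S or S' subsite. A minimal sketch applying that convention to the two selective substrates and their reported cleavage sites is shown below. The function name is hypothetical, and the Abz and EDDnp end groups are left out because they are not amino acid residues.

```python
def label_subsites(sequence, bond):
    """Map Schechter-Berger position labels to residues.

    sequence: one-letter peptide sequence, N- to C-terminus
    bond: two-letter string naming the residues flanking the cleaved bond
          (must occur in the sequence; the first occurrence is used)
    """
    # Number of residues on the N-terminal side of the scissile bond.
    scissile_index = sequence.index(bond) + 1
    labels = {}
    # Non-primed side: walk from the bond toward the N-terminus.
    for offset, residue in enumerate(reversed(sequence[:scissile_index]), start=1):
        labels[f"P{offset}"] = residue
    # Primed side: walk from the bond toward the C-terminus.
    for offset, residue in enumerate(sequence[scissile_index:], start=1):
        labels[f"P{offset}'"] = residue
    return labels


# Sequences and cleavage sites as stated in the abstract.
substrates = {
    "KPRGSKQ": "RG",  # cleaved at Arg-Gly
    "KKPGSKQ": "GS",  # cleaved at Gly-Ser
}

for seq, bond in substrates.items():
    labels = label_subsites(seq, bond)
    print(seq, {pos: labels[pos] for pos in ("P3", "P2", "P1", "P1'", "P2'", "P3'")})
```

In both substrates the reported cleavage site places Pro at P2, which is consistent with the abstract's aim of exploiting Pro acceptance in the S2 subsite.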
The S2 subsites of cathepsins K and L and their contribution to collagen degradation.
Lecaille Fabien,Chowdhury Shafinaz,Purisima Enrico,Brömme Dieter,Lalmanach Gilles
Protein science : a publication of the Protein Society
The exchange of residues 67 and 205 of the S2 pocket of human cysteine cathepsins K and L induces a permutation of their substrate specificity toward fluorogenic peptide substrates. While the cathepsin L-like cathepsin K (Tyr67Leu/Leu205Ala) mutant has a marked preference for Phe, the Leu67Tyr/Ala205Leu cathepsin L variant shows an effective cathepsin K-like preference for Leu and Pro. A similar turnaround of inhibition was observed by using specific inhibitors of cathepsin K [1-(N-Benzyloxycarbonyl-leucyl)-5-(N-Boc-phenylalanyl-leucyl)carbohydrazide] and cathepsin L [N-(4-biphenylacetyl)-S-methylcysteine-(D)-Arg-Phe-beta-phenethylamide]. Molecular modeling studies indicated that mutations alter the character of both S2 and S3 subsites, while docking calculations were consistent with kinetics data. The cathepsin K-like cathepsin L was unable to mimic the collagen-degrading activity of cathepsin K against collagens I and II, DQ-collagens I and IV, and elastin-Congo Red. In summary, double mutations of the S2 pocket of cathepsins K (Y67L/L205A) and L (L67Y/A205L) induce a switch of their enzymatic specificity toward small selective inhibitors and peptidyl substrates, confirming the key role of residues 67 and 205. However, mutations in the S2 subsite pocket of cathepsin L alone without engineering of binding sites to chondroitin sulfate are not sufficient to generate a cathepsin K-like collagenase, emphasizing the pivotal role of the complex formation between glycosaminoglycans and cathepsin K for its unique collagenolytic activity.
10.1110/ps.062666607
Selective inhibition of the collagenolytic activity of human cathepsin K by altering its S2 subsite specificity.
Lecaille Fabien,Choe Youngchool,Brandt Wolfgang,Li Zhenqiang,Craik Charles S,Brömme Dieter
Biochemistry
The primary specificity of papain-like cysteine proteases (family C1, clan CA) is determined by S2-P2 interactions. Despite the high amino acid sequence identities and structural similarities between cathepsins K and L, only cathepsin K is capable of cleaving interstitial collagens in their triple helical domains. To investigate this specificity, we have engineered the S2 pocket of human cathepsin K into a cathepsin L-like subsite. Using combinatorial fluorogenic substrate libraries, the P1-P4 substrate specificity of the cathepsin K variant, Tyr67Leu/Leu205Ala, was determined and compared with those of cathepsins K and L. The introduction of the double mutation into the S2 subsite of cathepsin K converted the unique S2 binding preference of the protease for proline and leucine residues into a cathepsin L-like preference for bulky aromatic residues. Homology modeling and docking calculations supported the experimental findings. The cathepsin L-like S2 specificity of the mutant protein and the integrity of its catalytic site were confirmed by kinetic analysis of synthetic di- and tripeptide substrates as well as pH stability and pH activity profile studies. The loss of the ability to accept proline in the S2 binding pocket by the mutant protease completely abolished the collagenolytic activity of cathepsin K, whereas its overall gelatinolytic activity remained unaffected. These results indicate that Tyr67 and Leu205 play a key role in the binding of proline residues in the S2 pocket of cathepsin K and are required for its unique collagenase activity.
10.1021/bi025638x
Active-site geometry of proteinase K. Crystallographic study of its complex with a dipeptide chloromethyl ketone inhibitor.
Betzel C,Pal G P,Struck M,Jany K D,Saenger W
FEBS letters
Proteinase K (EC 3.4.21.14) from the fungus Tritirachium album Limber is the most active known serine endopeptidase. The sequence of its 275-residue long polypeptide chain and its three-dimensional folding show a high degree of homology with the bacterial subtilisin proteases. Using difference Fourier methods, the binding mode of the synthetic carbobenzoxy-Ala-Ala-chloromethyl ketone inhibitor to the active site of proteinase K was determined. In several cycles of restrained least-squares, the enzyme-inhibitor complex was refined to a current R = 22% for 9400 X-ray diffraction data between 2.2 and 5.0 Å resolution. The inhibitor is attached to proteinase K by two covalent bonds: one between the methylene carbon of the inhibitor and Nε2 of the catalytic His 68, the other between the ketone carbon atom of the inhibitor and Oγ of the catalytic Ser 221. In addition, two hydrogen bonds donated by the peptide NH of Ser 221 and by the side chain NH2 of Asn 160 hold the hemiketal O⁻ in the oxyanion hole. The peptide inhibitor is further hydrogen bonded to the proteinase polypeptide chain in a three-stranded antiparallel pleated sheet.
10.1016/0014-5793(86)80307-6
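The "R = 22%" quoted in the abstract above is a crystallographic residual. A minimal sketch of the conventional definition, R = Σ| |F_obs| − |F_calc| | / Σ|F_obs| summed over reflections, is given below. It assumes the calculated amplitudes are already on the same scale as the observed ones, and the amplitudes are invented for illustration; they are not proteinase K data. The function name is hypothetical.

```python
def r_factor(f_obs, f_calc):
    """Conventional crystallographic R factor.

    f_obs, f_calc: equal-length sequences of observed and calculated
    structure-factor amplitudes, assumed to be on a common scale.
    """
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("need two non-empty sequences of equal length")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


# Made-up amplitudes for four reflections (illustration only).
f_obs = [100.0, 80.0, 60.0, 40.0]
f_calc = [90.0, 88.0, 50.0, 44.0]

r = r_factor(f_obs, f_calc)
print(f"R = {100 * r:.1f}%")
```

A lower R means the model reproduces the measured amplitudes more closely; a perfect match gives R = 0.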