Connecting chemical and protein sequence space to predict biocatalytic reactions – Nature



  • Li, J., Amatuni, A. & Renata, H. Recent advances in the chemoenzymatic synthesis of bioactive natural products. Curr. Opin. Chem. Biol. 55, 111–118 (2020).


  • Romero, E. et al. Enzymatic late-stage modifications: better late than never. Angew. Chem. Int. Ed. 60, 16824–16855 (2021).


  • Bayer, T., Wu, S., Snajdrova, R., Baldenius, K. & Bornscheuer, U. T. An update: enzymatic synthesis for industrial applications. Angew. Chem. Int. Ed. 64, e202505976 (2025).


  • Marshall, J. R., Mangas-Sanchez, J. & Turner, N. J. Expanding the synthetic scope of biocatalysis by enzyme discovery and protein engineering. Tetrahedron 82, 131926 (2021).


  • Yang, J., Li, F.-Z. & Arnold, F. H. Opportunities and challenges for machine learning-assisted enzyme engineering. ACS Cent. Sci. 10, 226–241 (2024).


  • Bell, E. L. et al. Biocatalysis. Nat. Rev. Methods Primers 1, 46 (2021).


  • Buller, R. et al. From nature to industry: harnessing enzymes for biocatalysis. Science 382, eadh8615 (2023).


  • Garzón-Posse, F., Becerra-Figueroa, L., Hernández-Arias, J. & Gamba-Sánchez, D. Whole cells as biocatalysts in organic transformations. Molecules 23, 1265 (2018).


  • Tibrewal, N. & Tang, Y. Biocatalysts for natural product biosynthesis. Annu. Rev. Chem. Biomol. Eng. 5, 347–366 (2014).


  • Roiban, G.-D. et al. Development of an enzymatic process for the production of (R)-2-butyl-2-ethyloxirane. Org. Process Res. Dev. 21, 1302–1310 (2017).


  • Arnold, F. H. Directed evolution: bringing new chemistry to life. Angew. Chem. Int. Ed. 57, 4143–4148 (2018).


  • Tobin, P. H., Richards, D. H., Callender, R. A. & Wilson, C. J. Protein engineering: a new frontier for biological therapeutics. Curr. Drug. Metab. 15, 743–756 (2014).


  • Novick, S. J. et al. Engineering an amine transaminase for the efficient production of a chiral sacubitril precursor. ACS Catal. 11, 3762–3770 (2021).


  • Lovelock, S. L. et al. The road to fully programmable protein catalysis. Nature 606, 49–58 (2022).


  • Yu, T. et al. Enzyme function prediction using contrastive learning. Science 379, 1358–1363 (2023).


  • Hon, J. et al. EnzymeMiner: automated mining of soluble enzymes with diverse structures, catalytic properties and stabilities. Nucleic Acids Res. 48, W104–W109 (2020).


  • Schnoes, A. M., Brown, S. D., Dodevski, I. & Babbitt, P. C. Annotation error in public databases: misannotation of molecular function in enzyme superfamilies. PLoS Comput. Biol. 5, e1000605 (2009).


  • Robertson, D. E. et al. Exploring nitrilase sequence space for enantioselective catalysis. Appl. Environ. Microbiol. 70, 2429–2436 (2004).


  • Wahler, D., Badalassi, F., Crotti, P. & Reymond, J.-L. Enzyme fingerprints by fluorogenic and chromogenic substrate arrays. Angew. Chem. Int. Ed. 40, 4457–4460 (2001).


  • Finnigan, W., Hepworth, L. J., Flitsch, S. L. & Turner, N. J. RetroBioCat as a computer-aided synthesis planning tool for biocatalytic reactions and cascades. Nat. Catal. 4, 98–104 (2021).


  • Fansher, D. J., Besna, J. N., Fendri, A. & Pelletier, J. N. Choose your own adventure: a comprehensive database of reactions catalyzed by cytochrome P450 BM3 variants. ACS Catal. 14, 5560–5592 (2024).


  • Ma, E. J. et al. Machine-directed evolution of an imine reductase for activity and stereoselectivity. ACS Catal. 11, 12433–12445 (2021).


  • Ao, Y.-F. et al. Structure- and data-driven protein engineering of transaminases for improving activity and stereoselectivity. Angew. Chem. Int. Ed. 62, e202301660 (2023).


  • Supekar, S. et al. A machine learning-guided approach to navigate the substrate activity scope of galactose oxidase: application in the conversion of pharmaceutically relevant bulky secondary alcohols. ACS Catal. 14, 17233–17243 (2024).


  • King, B. R., Sumida, K. H., Caruso, J. L., Baker, D. & Zalatan, J. G. Computational stabilization of a non-heme iron enzyme enables efficient evolution of new function. Angew. Chem. Int. Ed. 64, e202414705 (2025).


  • Mou, Z. et al. Machine learning-based prediction of enzyme substrate scope: application to bacterial nitrilases. Proteins 89, 336–347 (2021).


  • Yang, M. et al. Functional and informatics analysis enables glycosyltransferase activity prediction. Nat. Chem. Biol. 14, 1109–1117 (2018).


  • Kroll, A., Ranjan, S., Engqvist, M. K. M. & Lercher, M. J. A general model to predict small molecule substrates of enzymes based on machine and deep learning. Nat. Commun. 14, 2787 (2023).


  • Goldman, S., Das, R., Yang, K. K. & Coley, C. W. Machine learning modeling of family wide enzyme-substrate specificity screens. PLoS Comput. Biol. 18, e1009853 (2022).


  • Wang, X., Quinn, D., Moody, T. S. & Huang, M. ALDELE: all-purpose deep learning toolkits for predicting the biocatalytic activities of enzymes. J. Chem. Inf. Model. 64, 3123–3139 (2024).


  • Busch, F., Brummund, J., Calderini, E., Schürmann, M. & Kourist, R. Cofactor generation cascade for α-ketoglutarate and Fe(II)-dependent dioxygenases. ACS Sustain. Chem. Eng. 8, 8604–8612 (2020).


  • Zwick, C. R. & Renata, H. Harnessing the biocatalytic potential of iron- and α-ketoglutarate-dependent dioxygenases in natural product total synthesis. Nat. Prod. Rep. 37, 1065–1079 (2020).


  • Gao, S. S., Naowarojna, N., Cheng, R., Liu, X. & Liu, P. Recent examples of α-ketoglutarate-dependent mononuclear non-haem iron enzymes in natural product biosyntheses. Nat. Prod. Rep. 35, 792–837 (2018).


  • Hausinger, R. P. Fe(II)/α-ketoglutarate-dependent hydroxylases and related enzymes. Crit. Rev. Biochem. Mol. Biol. 39, 21–68 (2004).


  • McLean, K. J., Luciakova, D., Belcher, J., Tee, K. L. & Munro, A. W. Biological diversity of cytochrome P450 redox partner systems. Adv. Exp. Med. Biol. 851, 299–317 (2015).


  • Schofield, C. J. & Zhang, Z. Structural and mechanistic studies on 2-oxoglutarate-dependent oxygenases and related enzymes. Curr. Opin. Struct. Biol. 9, 722–731 (1999).


  • Seide, S. et al. From enzyme to preparative cascade reactions with immobilized enzymes: tuning Fe(II)/α-ketoglutarate-dependent lysine hydroxylases for application in biotransformations. Catalysts 12, 354 (2022).


  • Hegg, E. L. & Que, L. Jr The 2-His-1-carboxylate facial triad — an emerging structural motif in mononuclear non-heme iron(II) enzymes. Eur. J. Biochem. 250, 625–629 (1997).


  • Zallot, R., Oberg, N. & Gerlt, J. A. The EFI web resource for genomic enzymology tools: leveraging protein, genome, and metagenome databases to discover novel enzymes and metabolic pathways. Biochemistry 58, 4169–4182 (2019).


  • Fisher, B. F., Snodgrass, H. M., Jones, K. A., Andorfer, M. C. & Lewis, J. C. Site-selective C–H halogenation using flavin-dependent halogenases identified via family-wide activity profiling. ACS Cent. Sci. 5, 1844–1856 (2019).


  • Atkinson, H. J., Morris, J. H., Ferrin, T. E. & Babbitt, P. C. Using sequence similarity networks for visualization of relationships across diverse protein superfamilies. PLoS One 4, e4345 (2009).


  • Copp, J. N., Akiva, E., Babbitt, P. C. & Tokuriki, N. Revealing unexplored sequence-function space using sequence similarity networks. Biochemistry 57, 4651–4662 (2018).


  • Pyser, J. B. et al. Stereodivergent, chemoenzymatic synthesis of azaphilone natural products. J. Am. Chem. Soc. 141, 18551–18559 (2019).


  • Lima, S. T. et al. A widely distributed biosynthetic cassette is responsible for diverse plant side chain cross-linked cyclopeptides. Angew. Chem. Int. Ed. 62, e202218082 (2023).


  • Ju, S. et al. A biocatalytic platform for asymmetric alkylation of α-keto acids by mining and engineering of methyltransferases. Nat. Commun. 14, 5704 (2023).


  • Jacot-Descombes, L., Turcani, L. & Jorner, K. morfeus (computer software). https://github.com/digital-chemistry-laboratory/morfeus (accessed 29 August 2025).

  • Ropp, P. J., Kaminsky, J. C., Yablonski, S. & Durrant, J. D. Dimorphite-DL: an open-source program for enumerating the ionization states of drug-like small molecules. J. Cheminform. 11, 14 (2019).


  • Hastie, T., Tibshirani, R. & Friedman, J. in The Elements of Statistical Learning: Data Mining, Inference, and Prediction 605–624 (Springer, 2009).

  • Lyzhin, I., Ustimenko, A., Gulin, A. & Prokhorenkova, L. Which tricks are important for learning to rank? Proc. 40th Intl Conf. Machine Learning (ICML 2023), PMLR 202, 23264–23278 (2023).

  • Bentéjac, C., Csörgő, A. & Martínez-Muñoz, G. A comparative analysis of gradient boosting algorithms. Artif. Intell. Rev. 54, 1937–1967 (2021).


  • Kerkovius, J. K. et al. A pyridine dearomatization approach to the matrine-type lupin alkaloids. J. Am. Chem. Soc. 144, 15938–15943 (2022).


  • Xu, H., Zhao, J. & Renata, H. Discovery, characterization and synthetic application of a promiscuous nonheme iron biocatalyst with dual hydroxylase/desaturase activity. Angew. Chem. Int. Ed. 63, e202409143 (2024).


  • Bunno, R., Awakawa, T., Mori, T. & Abe, I. Aziridine formation by a FeII/α-ketoglutarate dependent oxygenase and 2-aminoisobutyrate biosynthesis in fungi. Angew. Chem. Int. Ed. 60, 15827–15831 (2021).


  • Paton, A. E. et al. Connecting chemical and protein sequence space to predict biocatalytic reactions (v0.1). Zenodo https://doi.org/10.5281/zenodo.16779318 (2024).


