Drews, J. Drug discovery: a historical perspective. Science 287, 1960–1964 (2000).
Jacob, F. & Monod, J. Genetic regulatory mechanisms in the synthesis of proteins. J. Mol. Biol. 3, 318–356 (1961).
Glickman, M. H. & Ciechanover, A. The ubiquitin–proteasome proteolytic pathway: destruction for the sake of construction. Physiol. Rev. 82, 373–428 (2002).
Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021).
Rives, A. et al. Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences. Proc. Natl Acad. Sci. USA 118, e2016239118 (2021).
Lin, Z. et al. Evolutionary-scale prediction of atomic-level protein structure with a language model. Science 379, 1123–1130 (2023).
Meier, J. et al. Language models enable zero-shot prediction of the effects of mutations on protein function. In Proc. 35th International Conference on Neural Information Processing Systems (eds Ranzato, M. et al.) (NeurIPS, 2021).
Rao, R. M. et al. MSA transformer. In Proc. 38th International Conference on Machine Learning (eds Meila, M. & Zhang, T.) (PMLR, 2021).
Elnaggar, A. et al. ProtTrans: toward understanding the language of life through self-supervised learning. IEEE Trans. Pattern Anal. Mach. Intell. 44, 7112–7127 (2021).
Heinzinger, M. et al. Bilingual language model for protein sequence and structure. NAR Genom. Bioinform. 6, lqae150 (2024).
Notin, P. et al. Tranception: protein fitness prediction with autoregressive transformers and inference-time retrieval. In Proc. 39th International Conference on Machine Learning (eds Chaudhuri, K. et al.) (PMLR, 2022).
Nijkamp, E., Ruffolo, J. A., Weinstein, E. N., Naik, N. & Madani, A. ProGen2: exploring the boundaries of protein language models. Cell Syst. 14, 968–978 (2023).
Ferruz, N., Schmidt, S. & Höcker, B. ProtGPT2 is a deep unsupervised language model for protein design. Nat. Commun. 13, 4348 (2022).
Su, J. et al. SaProt: protein language modeling with structure-aware vocabulary. In Proc. 12th International Conference on Learning Representations (ed Kim, B.) (ICLR, 2024).
Mirdita, M. et al. ColabFold: making protein folding accessible to all. Nat. Methods 19, 679–682 (2022).
Hu, E. J. et al. LoRA: low-rank adaptation of large language models. In Proc. 10th International Conference on Learning Representations https://openreview.net/pdf?id=nZeVKeeFYf9 (ICLR, 2022).
Pfeiffer, J. et al. AdapterHub: a framework for adapting transformers. In Proc. 2020 EMNLP (Systems Demonstrations) https://aclanthology.org/2020.emnlp-demos.7.pdf (Association for Computational Linguistics, 2020).
Kirkpatrick, J. et al. Overcoming catastrophic forgetting in neural networks. Proc. Natl Acad. Sci. USA 114, 3521–3526 (2017).
van Kempen, M. et al. Fast and accurate protein structure search with Foldseek. Nat. Biotechnol. 42, 243–246 (2024).
Hayes, T. et al. Simulating 500 million years of evolution with a language model. Science 387, 850–858 (2025).
Li, M. et al. ProSST: protein language modeling with quantized structure and disentangled attention. In Proc. 38th International Conference on Neural Information Processing Systems https://openreview.net/forum?id=4Z7RZixpJQ (NeurIPS, 2024).
Wang, X. et al. DPLM-2: a multimodal diffusion protein language model. In Proc. 13th International Conference on Learning Representations https://openreview.net/pdf?id=5z9GjHgerY (ICLR, 2025).
Tan, Y., Wang, R., Wu, B., Hong, L. & Zhou, B. Retrieval-enhanced mutation mastery: augmenting zero-shot prediction of protein language model. Preprint at https://arxiv.org/abs/2410.21127 (2024).
Pourmirzaei, M., Esmaili, F., Pourmirzaei, M., Wang, D. & Xu, D. Prot2Token: a multi-task framework for protein language processing using autoregressive language modeling. In ICML 2024 Workshop on Efficient and Accessible Foundation Models for Biological Discovery https://openreview.net/pdf?id=5z9GjHgerY (2024).
Gao, K. et al. Tokenizing 3D molecule structure with quantized spherical coordinates. Preprint at https://arxiv.org/abs/2412.01564 (2024).
Lin, X. et al. Tokenizing foldable protein structures with machine-learned artificial amino-acid vocabulary. Preprint at bioRxiv https://doi.org/10.1101/2023.11.27.568722 (2023).
Ivanisenko, N. V. et al. SEMA 2.0: web-platform for B-cell conformational epitopes prediction using artificial intelligence. Nucleic Acids Res. 52, W533–W539 (2024).
Devlin, J., Chang, M.-W., Lee, K. & Toutanova, K. BERT: pre-training of deep bidirectional transformers for language understanding. In Proc. 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers) 4171–4186 (Association for Computational Linguistics, 2019).
Varadi, M. et al. AlphaFold Protein Structure Database: massively expanding the structural coverage of protein-sequence space with high-accuracy models. Nucleic Acids Res. 50, D439–D444 (2022).
Rao, R. et al. Evaluating protein transfer learning with TAPE. In Proc. 33rd International Conference on Neural Information Processing Systems (eds Wallach, H. M. et al.) (NeurIPS, 2019).
Kucera, T., Oliver, C., Chen, D. & Borgwardt, K. ProteinShake: building datasets and benchmarks for deep learning on protein structures. In Proc. 36th International Conference on Neural Information Processing Systems (eds Oh, A. et al.) (NeurIPS, 2024).
Xu, M. et al. PEER: a comprehensive and multi-task benchmark for protein sequence understanding. In Proc. 35th International Conference on Neural Information Processing Systems (eds Ranzato, M. et al.) (NeurIPS, 2021).
Hie, B., Zhong, E. D., Berger, B. & Bryson, B. Learning the language of viral evolution and escape. Science 371, 284–288 (2021).
Frazer, J. et al. Disease variant prediction with deep generative models of evolutionary data. Nature 599, 91–95 (2021).
Dauparas, J. et al. Robust deep learning-based protein sequence design using ProteinMPNN. Science 378, 49–56 (2022).
Tsuboyama, K. et al. Mega-scale experimental analysis of protein folding stability in biology and design. Nature 620, 434–444 (2023).
Notin, P. et al. ProteinGym: large-scale benchmarks for protein fitness prediction and design. In Proc. 36th International Conference on Neural Information Processing Systems (eds Oh, A. et al.) (NeurIPS, 2024).
Landrum, M. J. et al. ClinVar: improving access to variant interpretations and supporting evidence. Nucleic Acids Res. 46, D1062–D1067 (2018).
Tan, Y. et al. VenusX: unlocking fine-grained functional understanding of proteins. Preprint at https://arxiv.org/abs/2505.11812 (2025).
Yan, S. et al. Protap: a benchmark for protein modeling on realistic downstream applications. Preprint at https://arxiv.org/abs/2506.02052 (2025).
Zhou, Z. et al. Enhancing efficiency of protein language models with minimal wet-lab data through few-shot learning. Nat. Commun. 15, 5566 (2024).
Dai, F. et al. Toward de novo protein design from natural language. Preprint at bioRxiv https://doi.org/10.1101/2024.08.01.606258 (2024).
Meshchaninov, V. et al. Diffusion on language model encodings for protein sequence generation. Preprint at https://arxiv.org/abs/2403.03726 (2024).
Sagawa, T., Kanao, E., Ogata, K., Imami, K. & Ishihama, Y. Prediction of protein half-lives from amino acid sequences by protein language models. Preprint at bioRxiv https://doi.org/10.1101/2024.09.10.612367 (2024).
Bushuiev, A. et al. Training on test proteins improves fitness, structure, and function prediction. Preprint at https://arxiv.org/abs/2411.02109 (2024).
Zhuang, X. et al. Advancing biomolecular understanding and design following human instructions. Nat. Mach. Intell. 7, 1154–1167 (2025).
Zhou, X. et al. Decoding the molecular language of proteins with Evola. Preprint at bioRxiv https://doi.org/10.1101/2025.01.05.630192 (2025).
Wang, L., Zhang, X., Wang, Y. & Xue, Z. SSAlign: ultrafast and sensitive protein structure search at scale. Preprint at bioRxiv https://doi.org/10.1101/2025.07.03.662911 (2025).
Meng, Z., Meng, Z. & Ounis, I. FusionDTI: fine-grained binding discovery with token-level fusion for drug-target interaction. Preprint at https://arxiv.org/abs/2406.01651 (2024).
McNutt, A. T. et al. Scaling structure aware virtual screening to billions of molecules with SPRINT. Preprint at https://arxiv.org/abs/2411.15418 (2025).
He, Y. et al. Protein language models-assisted optimization of a uracil-N-glycosylase variant enables programmable T-to-G and T-to-C base editing. Mol. Cell 84, 1257–1270 (2024).
Riesselman, A. J., Ingraham, J. B. & Marks, D. S. Deep generative models of genetic variation capture the effects of mutations. Nat. Methods 15, 816–822 (2018).
Hsu, C. et al. Learning inverse folding from millions of predicted structures. In Proc. 39th International Conference on Machine Learning (eds Chaudhuri, K. et al.) (PMLR, 2022).
Sledzieski, S. et al. Democratizing protein language models with parameter-efficient fine-tuning. Proc. Natl Acad. Sci. USA 121, e2405840121 (2024).
Zeng, S., Wang, D. & Xu, D. PEFT-SP: parameter-efficient fine-tuning on large protein language models improves signal peptide prediction. Genome Res. 34, 1445–1454 (2024).
Sledzieski, S., Kshirsagar, M., Berger, B., Dodhia, R. & Ferres, J. L. Parameter-efficient fine-tuning of protein language models improves prediction of protein-protein interactions. In Machine Learning for Structural Biology Workshop, NeurIPS 2023 https://www.mlsb.io/papers_2023/Parameter-Efficient_Fine-Tuning_of_Protein_Language_Models_Improves_Prediction_of_Protein-Protein_Interactions.pdf (2023).
Wang, D. et al. S-PLM: structure-aware protein language model via contrastive learning between sequence and structure. Adv. Sci. 12, 2404212 (2025).
Su, J., Zhou, X., Zhang, X. & Yuan, F. A trimodal protein language model enables advanced protein searches. Nat. Biotechnol. https://doi.org/10.1038/s41587-025-02836-0 (2025).
van den Oord, A. et al. Neural discrete representation learning. In Proc. 30th International Conference on Neural Information Processing Systems (eds Lee, D. D., von Luxburg, U., Garnett, R., Sugiyama, M. & Guyon, I.) (NIPS, 2017).
Gong, L. et al. Efficient training of BERT by progressively stacking. In Proc. 36th International Conference on Machine Learning (eds Chaudhuri, K. & Salakhutdinov, R.) (PMLR, 2019).
Loshchilov, I. & Hutter, F. Fixing weight decay regularization in Adam. Preprint at OpenReview https://openreview.net/forum?id=rk6qdGgCZ (2018).
Yang, K. K., Zanichelli, N. & Yeh, H. Masked inverse folding with sequence transfer for protein representation learning. Protein Eng. Des. Sel. 36, gzad015 (2023).
Zhang, Z. et al. Protein representation learning by geometric structure pretraining. In First Workshop on Pre-training: Perspectives, Pitfalls, and Paths Forward at ICML 2022 https://openreview.net/pdf?id=V5MEFikiBQy (2023).
Dallago, C. et al. FLIP: benchmark tasks in fitness landscape inference for proteins. In Proc. Neural Information Processing Systems Track on Datasets and Benchmarks https://openreview.net/pdf?id=p2dMLEwL8tF (2021).
Almagro Armenteros, J. J., Sønderby, C. K., Sønderby, S. K., Nielsen, H. & Winther, O. DeepLoc: prediction of protein subcellular localization using deep learning. Bioinformatics 33, 3387–3395 (2017).
Hu, M. et al. Exploring evolution-aware & -free protein language models as protein function predictors. In Proc. 35th International Conference on Neural Information Processing Systems (eds Oh, A. et al.) (NeurIPS, 2023).
Gligorijević, V. et al. Structure-based protein function prediction using graph convolutional networks. Nat. Commun. 12, 3168 (2021).
Orengo, C. A. et al. CATH—a hierarchic classification of protein domain structures. Structure 5, 1093–1109 (1997).
Ingraham, J., Garg, V., Barzilay, R. & Jaakkola, T. Generative models for graph-based protein design. In Proc. 32nd International Conference on Neural Information Processing Systems (eds Bengio, S., Wallach, H. M., Larochelle, H., Grauman, K. & Cesa-Bianchi, N.) (NeurIPS, 2018).
Houlsby, N. et al. Parameter-efficient transfer learning for NLP. In Proc. 36th International Conference on Machine Learning (eds Chaudhuri, K. & Salakhutdinov, R.) (PMLR, 2019).
Fu, J. et al. Exploring adapter-based transfer learning for recommender systems: empirical studies and practical insights. In Proc. 17th ACM International Conference on Web Search and Data Mining (eds Angélica, L., Lattanzi, S. & Muñoz Medina, A.) (ACM, 2024).
Yuan, F., He, X., Karatzoglou, A. & Zhang, L. Parameter-efficient transfer from sequential behaviors for user modeling and recommendation. In Proc. 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval (eds Huang, J., Chang, Y. & Cheng, X.) (ACM, 2020).
Schreiber, A. ESMBind and QBind: LoRA, QLoRA, and ESM-2 for predicting binding sites and post translational modification. Preprint at bioRxiv https://doi.org/10.1101/2023.11.13.566930 (2023).
Schmirler, R., Heinzinger, M. & Rost, B. Fine-tuning protein language models boosts predictions across diverse tasks. Nat. Commun. 15, 7407 (2024).
Karimi Mahabadi, R., Henderson, J. & Ruder, S. COMPACTER: efficient low-rank hypercomplex adapter layers. In Proc. 35th International Conference on Neural Information Processing Systems (eds Oh, A. et al.) (NeurIPS, 2023).
Fu, L. et al. Critical Assessment of Protein Engineering (CAPE): a student challenge on the cloud. ACS Synth. Biol. 13, 3782–3787 (2024).
He, Y., Zhou, X., Yuan, F. & Chang, X. Protocol to use protein language models predicting and following experimental validation of function-enhancing variants of thymine-N-glycosylase. STAR Protoc. 5, 103188 (2024).






