Armet, A. M. et al. Rethinking healthy eating in light of the gut microbiome. Cell Host Microbe 30, 764–785 (2022).
Sharon, G. et al. Specialized metabolites from the microbiome in health and disease. Cell Metab. 20, 719–730 (2014).
Baldrian, P. et al. Active and total microbial communities in forest soil are largely different and highly stratified during decomposition. ISME J. 6, 248–258 (2012).
Singleton, C. M. et al. Methanotrophy across a natural permafrost thaw environment. ISME J. 12, 2544–2558 (2018).
Salazar, G. et al. Gene expression changes and community turnover differentially shape the global ocean metatranscriptome. Cell 179, 1068–1083 (2019).
Joice, R., Yasuda, K., Shafquat, A., Morgan, X. C. & Huttenhower, C. Determining microbial products and identifying molecular targets in the human microbiome. Cell Metab. 20, 731–741 (2014).
Zhang, Y. et al. Discovery of bioactive microbial gene products in inflammatory bowel disease. Nature 606, 754–760 (2022).
Pasolli, E. et al. Extensive unexplored human microbiome diversity revealed by over 150,000 genomes from metagenomes spanning age, geography, and lifestyle. Cell 176, 649–662 (2019).
Almeida, A. et al. A unified catalog of 204,938 reference genomes from the human gut microbiome. Nat. Biotechnol. 39, 105–114 (2021).
Peisl, B. Y. L., Schymanski, E. L. & Wilmes, P. Dark matter in host–microbiome metabolomics: tackling the unknowns—a review. Anal. Chim. Acta 1037, 13–27 (2018).
Vanni, C. et al. Unifying the known and unknown microbial coding sequence space. eLife 11, e67667 (2022).
Pavlopoulos, G. A. et al. Unraveling the functional dark matter through global metagenomics. Nature 622, 594–602 (2023).
Browne, H. P. et al. Culturing of ‘unculturable’ human microbiota reveals novel taxa and extensive sporulation. Nature 533, 543–546 (2016).
Lagier, J. C. et al. Culture of previously uncultured members of the human gut microbiota by culturomics. Nat. Microbiol. 1, 16203 (2016).
Almeida, A. et al. A new genomic blueprint of the human gut microbiota. Nature 568, 499–504 (2019).
Schnoes, A. M., Ream, D. C., Thorman, A. W., Babbitt, P. C. & Friedberg, I. Biases in the experimental annotations of protein function and their effect on our understanding of protein function space. PLoS Comput. Biol. 9, e1003063 (2013).
Rost, B., Liu, J., Nair, R., Wrzeszczynski, K. O. & Ofran, Y. Automatic prediction of protein function. Cell. Mol. Life Sci. 60, 2637–2650 (2003).
Lee, D., Redfern, O. & Orengo, C. Predicting protein function from sequence and structure. Nat. Rev. Mol. Cell Biol. 8, 995–1005 (2007).
Xin, F. & Radivojac, P. Computational methods for identification of functional residues in protein structures. Curr. Protein Pept. Sci. 12, 456–469 (2011).
Radivojac, P. et al. A large-scale evaluation of computational protein function prediction. Nat. Methods 10, 221–227 (2013).
Jiang, Y. et al. An expanded evaluation of protein function prediction methods shows an improvement in accuracy. Genome Biol. 17, 184 (2016).
Zhou, N. et al. The CAFA challenge reports improved protein function prediction and new functional annotations for hundreds of genes through experimental screens. Genome Biol. 20, 244 (2019).
Jensen, L. J. et al. Prediction of human protein function from post-translational modifications and localization features. J. Mol. Biol. 319, 1257–1265 (2002).
Wass, M. N. & Sternberg, M. J. ConFunc—functional annotation in the twilight zone. Bioinformatics 24, 798–806 (2008).
Clark, W. T. & Radivojac, P. Analysis of protein function and its prediction from amino acid sequence. Proteins 79, 2086–2096 (2011).
You, R. et al. GOLabeler: improving sequence-based large-scale protein function prediction by learning to rank. Bioinformatics 34, 2465–2473 (2018).
Korbel, J. O., Jensen, L. J., von Mering, C. & Bork, P. Analysis of genomic context: prediction of functional associations from conserved bidirectionally transcribed gene pairs. Nat. Biotechnol. 22, 911–917 (2004).
Enault, F., Suhre, K. & Claverie, J. M. Phydbac ‘Gene Function Predictor’: a gene annotation tool based on genomic context analysis. BMC Bioinformatics 6, 247 (2005).
Pellegrini, M., Marcotte, E. M., Thompson, M. J., Eisenberg, D. & Yeates, T. O. Assigning protein functions by comparative genome analysis: protein phylogenetic profiles. Proc. Natl Acad. Sci. USA 96, 4285–4288 (1999).
Engelhardt, B. E., Jordan, M. I., Muratore, K. E. & Brenner, S. E. Protein molecular function prediction by Bayesian phylogenomics. PLoS Comput. Biol. 1, e45 (2005).
Pazos, F. & Sternberg, M. J. Automated prediction of protein function and detection of functional sites from structure. Proc. Natl Acad. Sci. USA 101, 14754–14759 (2004).
Deng, M., Zhang, K., Mehta, S., Chen, T. & Sun, F. Prediction of protein function using protein–protein interaction data. J. Comput. Biol. 10, 947–960 (2003).
Nabieva, E., Jim, K., Agarwal, A., Chazelle, B. & Singh, M. Whole-proteome prediction of protein function via graph-theoretic analysis of interaction maps. Bioinformatics 21, i302–i310 (2005).
Wells, J. A. & McClendon, C. L. Reaching for high-hanging fruit in drug discovery at protein–protein interfaces. Nature 450, 1001–1009 (2007).
Brown, M. P. et al. Knowledge-based analysis of microarray gene expression data by using support vector machines. Proc. Natl Acad. Sci. USA 97, 262–267 (2000).
van Noort, V., Snel, B. & Huynen, M. A. Predicting gene function by conserved co-expression. Trends Genet. 19, 238–242 (2003).
Guan, Y. et al. Predicting gene function in a hierarchical context with an ensemble of classifiers. Genome Biol. 9, S3 (2008).
Mostafavi, S., Ray, D., Warde-Farley, D., Grouios, C. & Morris, Q. GeneMANIA: a real-time multiple association network integration algorithm for predicting gene function. Genome Biol. 9, S4 (2008).
Piovesan, D. & Tosatto, S. C. E. INGA 2.0: improving protein function prediction for the dark proteome. Nucleic Acids Res. 47, W373–W378 (2019).
Bashiardes, S., Zilberman-Schapira, G. & Elinav, E. Use of metatranscriptomics in microbiome research. Bioinform. Biol. Insights 10, 19–25 (2016).
Franzosa, E. A. et al. Sequencing and beyond: integrating molecular ‘omics’ for microbial community profiling. Nat. Rev. Microbiol. 13, 360–372 (2015).
Franzosa, E. A. et al. Relating the metatranscriptome and metagenome of the human gut. Proc. Natl Acad. Sci. USA 111, E2329–E2338 (2014).
Heintz-Buschart, A. et al. Integrated multi-omics of the human gut microbiome in a case study of familial type 1 diabetes. Nat. Microbiol. 2, 16180 (2016).
Lloyd-Price, J. et al. Multi-omics of the gut microbial ecosystem in inflammatory bowel diseases. Nature 569, 655–662 (2019).
Coolen, M. J. & Orsi, W. D. The transcriptional response of microbial communities in thawing Alaskan permafrost soils. Front. Microbiol. 6, 197 (2015).
Vorobev, A. et al. Transcriptome reconstruction and functional analysis of eukaryotic marine plankton communities via high-throughput metagenomics and metatranscriptomics. Genome Res. 30, 647–659 (2020).
Lee, H. K., Hsu, A. K., Sajdak, J., Qin, J. & Pavlidis, P. Coexpression analysis of human genes across many microarray data sets. Genome Res. 14, 1085–1094 (2004).
Gaiteri, C., Ding, Y., French, B., Tseng, G. C. & Sibille, E. Beyond modules and hubs: the potential of gene coexpression networks for investigating molecular mechanisms of complex brain disorders. Genes Brain Behav. 13, 13–24 (2014).
Stuart, J. M., Segal, E., Koller, D. & Kim, S. K. A gene-coexpression network for global discovery of conserved genetic modules. Science 302, 249–255 (2003).
van Dam, S., Vosa, U., van der Graaf, A., Franke, L. & de Magalhaes, J. P. Gene co-expression analysis for functional classification and gene–disease predictions. Brief. Bioinform. 19, 575–592 (2018).
Ashburner, M. et al. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat. Genet. 25, 25–29 (2000).
Zhang, Y. & Maharjan, S. biobakery/fugassem: FUGAsseM v0.3.8. Zenodo https://doi.org/10.5281/zenodo.16477039 (2025).
Hvidsten, T. R., Komorowski, J., Sandvik, A. K. & Laegreid, A. Predicting gene function from gene expressions and ontologies. In Proceedings of the Pacific Symposium on Biocomputing (eds Altman, R. B., Dunker, A. K., Hunter, L., Lauderdale, K. & Klein, T. E.) (World Scientific, 2001).
Zhou, X., Kao, M. C. & Wong, W. H. Transitive functional annotation by shortest-path analysis of gene expression data. Proc. Natl Acad. Sci. USA 99, 12783–12788 (2002).
Myers, C. L., Barrett, D. R., Hibbs, M. A., Huttenhower, C. & Troyanskaya, O. G. Finding function: evaluation methods for functional genomic data. BMC Genomics 7, 187 (2006).
Mitchell, A. et al. The InterPro protein families database: the classification resource after 15 years. Nucleic Acids Res. 43, D213–D221 (2015).
von Mering, C. et al. STRING: known and predicted protein–protein associations, integrated and transferred across organisms. Nucleic Acids Res. 33, D433–D437 (2005).
Suzek, B. E., Huang, H., McGarvey, P., Mazumder, R. & Wu, C. H. UniRef: comprehensive and non-redundant UniProt reference clusters. Bioinformatics 23, 1282–1288 (2007).
Yellaboina, S., Tasneem, A., Zaykin, D. V., Raghavachari, B. & Jothi, R. DOMINE: a comprehensive collection of known and predicted domain-domain interactions. Nucleic Acids Res. 39, D730–D735 (2011).
Kanehisa, M. & Goto, S. KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Res. 28, 27–30 (2000).
Caspi, R. et al. The MetaCyc database of metabolic pathways and enzymes and the BioCyc collection of pathway/genome databases. Nucleic Acids Res. 42, D459–D471 (2014).
Yao, S. et al. NetGO 2.0: improving large-scale protein function prediction with massive sequence, text, domain, family and network information. Nucleic Acids Res. 49, W469–W475 (2021).
Kulmanov, M. & Hoehndorf, R. DeepGOPlus: improved protein function prediction from sequence. Bioinformatics 37, 1187 (2021).
Rodríguez Del Río, Á. et al. Functional and evolutionary significance of unknown genes from uncultivated taxa. Nature 626, 377–384 (2024).
Mirdita, M. et al. ColabFold: making protein folding accessible to all. Nat. Methods 19, 679–682 (2022).
Varadi, M. et al. AlphaFold Protein Structure Database: massively expanding the structural coverage of protein-sequence space with high-accuracy models. Nucleic Acids Res. 50, D439–D444 (2022).
van Kempen, M. et al. Fast and accurate protein structure search with Foldseek. Nat. Biotechnol. 42, 243–246 (2024).
Koppel, N., Maini Rekdal, V. & Balskus, E. P. Chemical transformation of xenobiotics by the human gut microbiota. Science 356, eaag2770 (2017).
Das, N. K. et al. Microbial metabolite signaling is required for systemic iron homeostasis. Cell Metab. 31, 115–130.e116 (2020).
Seyoum, Y., Baye, K. & Humblot, C. Iron homeostasis in host and gut bacteria—a complex interrelationship. Gut Microbes 13, 1–19 (2021).
Chen, Z. et al. The role of intestinal bacteria and gut–brain axis in hepatic encephalopathy. Front. Cell. Infect. Microbiol. 10, 595759 (2020).
Galdiero, S. et al. Microbe–host interactions: structure and role of gram-negative bacterial porins. Curr. Protein Pept. Sci. 13, 843–854 (2012).
Hogbom, M. & Ihalin, R. Functional and structural characteristics of bacterial proteins that bind host cytokines. Virulence 8, 1592–1601 (2017).
Jaehme, M. & Slotboom, D. J. Diversity of membrane transport proteins for vitamins in bacteria and archaea. Biochim. Biophys. Acta 1850, 565–576 (2015).
Fujita, M. et al. A TonB-dependent receptor constitutes the outer membrane transport system for a lignin-derived aromatic compound. Commun. Biol. 2, 432 (2019).
Connors, J., Dawe, N. & van Limbergen, J. The role of succinate in the regulation of intestinal inflammation. Nutrients 11, 25 (2018).
Boudreau, M. A., Fisher, J. F. & Mobashery, S. Messenger functions of the bacterial cell wall-derived muropeptides. Biochemistry 51, 2974–2990 (2012).
Hosaka, H., Kawamura, M., Hirano, T., Hakamata, W. & Nishio, T. Utilization of sucrose and analog disaccharides by human intestinal bifidobacteria and lactobacilli: search of the bifidobacteria enzymes involved in the degradation of these disaccharides. Microbiol. Res. 240, 126558 (2020).
Rawat, P. S., Li, Y., Zhang, W., Meng, X. & Liu, W. Hungatella hathewayi, an efficient glycosaminoglycan-degrading Firmicutes from human gut and its chondroitin ABC exolyase with high activity and broad substrate specificity. Appl. Environ. Microbiol. 88, e0154622 (2022).
Cullender, T. C. et al. Innate and adaptive immunity interact to quench microbiome flagellar motility in the gut. Cell Host Microbe 14, 571–581 (2013).
Lopez-Siles, M., Duncan, S. H., Garcia-Gil, L. J. & Martinez-Medina, M. Faecalibacterium prausnitzii: from microbiology to diagnostics and prognostics. ISME J. 11, 841–852 (2017).
Cornuault, J. K. et al. Phages infecting Faecalibacterium prausnitzii belong to novel viral genera that help to decipher intestinal viromes. Microbiome 6, 65 (2018).
Bai, Z. et al. Comprehensive analysis of 84 Faecalibacterium prausnitzii strains uncovers their genetic diversity, functional characteristics, and potential risks. Front. Cell. Infect. Microbiol. 12, 919701 (2022).
Koropatkin, N. M. & Smith, T. J. SusG: a unique cell-membrane-associated α-amylase from a prominent human gut symbiont targets complex starch molecules. Structure 18, 200–215 (2010).
Martens, E. C. et al. Recognition and degradation of plant cell wall polysaccharides by two human gut symbionts. PLoS Biol. 9, e1001221 (2011).
Wu, M. et al. Genetic determinants of in vivo fitness and diet responsiveness in multiple human gut Bacteroides. Science 350, aac5992 (2015).
Terrapon, N. et al. PULDB: the expanded database of polysaccharide utilization loci. Nucleic Acids Res. 46, D677–D683 (2018).
Pavarina, G. C., Lemos, E. G. M., Lima, N. S. M. & Pizauro, J. M. Jr. Characterization of a new bifunctional endo-1,4-β-xylanase/esterase found in the rumen metagenome. Sci. Rep. 11, 10440 (2021).
Carneiro, L. et al. Selective xyloglucan oligosaccharide hydrolysis by a GH31 α-xylosidase from Escherichia coli. Carbohydr. Polym. 284, 119150 (2022).
Lin, H. et al. Multiomics study reveals Enterococcus and Subdoligranulum are beneficial to necrotizing enterocolitis. Front. Microbiol. 12, 752102 (2021).
Shi, T. T. et al. Comparative assessment of gut microbial composition and function in patients with Graves’ disease and Graves’ orbitopathy. J. Endocrinol. Invest. 44, 297–310 (2021).
Girardin, S. E. et al. Nod1 detects a unique muropeptide from gram-negative bacterial peptidoglycan. Science 300, 1584–1587 (2003).
Hasegawa, M. et al. Differential release and distribution of Nod1 and Nod2 immunostimulatory molecules among bacterial species and environments. J. Biol. Chem. 281, 29054–29063 (2006).
Elshorbagy, A. et al. Amino acid changes during transition to a vegan diet supplemented with fish in healthy humans. Eur. J. Nutr. 56, 1953–1962 (2017).
Dong, Z., Sinha, R. & Richie, J. P. Jr. Disease prevention and delayed aging by dietary sulfur amino acid restriction: translational implications. Ann. N. Y. Acad. Sci. 1418, 44–55 (2018).
Whisstock, J. C. & Lesk, A. M. Prediction of protein function from protein sequence and structure. Q. Rev. Biophys. 36, 307–340 (2003).
Sleator, R. D. & Walsh, P. An overview of in silico protein function prediction. Arch. Microbiol. 192, 151–155 (2010).
Overbeek, R., Fonstein, M., D’Souza, M., Pusch, G. D. & Maltsev, N. The use of gene clusters to infer functional coupling. Proc. Natl Acad. Sci. USA 96, 2896–2901 (1999).
Teichmann, S. A. & Babu, M. M. Conservation of gene co-regulation in prokaryotes and eukaryotes. Trends Biotechnol. 20, 407–410 (2002).
Eisenberg, D., Marcotte, E. M., Xenarios, I. & Yeates, T. O. Protein function in the post-genomic era. Nature 405, 823–826 (2000).
Sharan, R., Ulitsky, I. & Shamir, R. Network-based prediction of protein function. Mol. Syst. Biol. 3, 88 (2007).
Wang, P. I. & Marcotte, E. M. It’s the machine that matters: predicting gene function and phenotype from protein networks. J. Proteomics 73, 2277–2289 (2010).
Ryan, C. J. et al. High-resolution network biology: connecting sequence with function. Nat. Rev. Genet. 14, 865–879 (2013).
Costanzo, M. et al. The genetic landscape of a cell. Science 327, 425–431 (2010).
Costanzo, M. et al. Environmental robustness of the global yeast genetic interaction network. Science 372, eabf8424 (2021).
Serin, E. A., Nijveen, H., Hilhorst, H. W. & Ligterink, W. Learning from co-expression networks: possibilities and challenges. Front. Plant Sci. 7, 444 (2016).
Southard, J. N. Protein analysis using real-time PCR instrumentation: incorporation in an integrated, inquiry-based project. Biochem. Mol. Biol. Educ. 42, 142–151 (2014).
Cantalapiedra, C. P., Hernández-Plaza, A., Letunic, I., Bork, P. & Huerta-Cepas, J. eggNOG-mapper v2: functional annotation, orthology assignments, and domain prediction at the metagenomic scale. Mol. Biol. Evol. 38, 5825–5829 (2021).
Schirmer, M. et al. Dynamics of metatranscription in the inflammatory bowel disease gut microbiome. Nat. Microbiol. 3, 337–346 (2018).
Zhang, Y., Thompson, K. N., Huttenhower, C. & Franzosa, E. A. Statistical approaches for differential expression analysis in metatranscriptomics. Bioinformatics 37, i34–i41 (2021).
Parrow, N. L., Fleming, R. E. & Minnick, M. F. Sequestration and scavenging of iron in infection. Infect. Immun. 81, 3503–3514 (2013).
Sanchez-Jimenez, A., Marcos-Torres, F. J. & Llamas, M. A. Mechanisms of iron homeostasis in Pseudomonas aeruginosa and emerging therapeutics directed to disrupt this vital process. Microb. Biotechnol. 16, 1475–1491 (2023).
Kim, C. S. et al. Seasonal and spatial environmental influence on Opisthorchis viverrini intermediate hosts, abundance, and distribution: insights on transmission dynamics and sustainable control. PLoS Negl. Trop. Dis. 10, e0005121 (2016).
Isobe, K. & Ohte, N. Ecological perspectives on microbes involved in N-cycling. Microbes Environ. 29, 4–16 (2014).
Yi, M. et al. Temporal changes of microbial community structure and nitrogen cycling processes during the aerobic degradation of phenanthrene. Chemosphere 286, 131709 (2022).
Davila, A. M. et al. Intestinal luminal nitrogen metabolism: role of the gut microbiota and consequences for the host. Pharmacol. Res. 68, 95–107 (2013).
Hou, K. et al. Microbiota in health and diseases. Signal. Transduct. Target. Ther. 7, 135 (2022).
Fitzgerald, C. B. et al. Comparative analysis of Faecalibacterium prausnitzii genomes shows a high level of genome plasticity and warrants separation into new species-level taxa. BMC Genomics 19, 931 (2018).
Silas, S. et al. Type III CRISPR–Cas systems can provide redundancy to counteract viral escape from type I systems. eLife 6, e27601 (2017).
Cuiv, P. O. et al. Isolation of genetically tractable most-wanted bacteria by metaparental mating. Sci. Rep. 5, 13282 (2015).
Deutscher, M. P. Degradation of RNA in bacteria: comparison of mRNA and stable RNA. Nucleic Acids Res. 34, 659–666 (2006).
Reck, M. et al. Stool metatranscriptomics: a technical guideline for mRNA stabilisation and isolation. BMC Genomics 16, 494 (2015).
Sczyrba, A. et al. Critical assessment of metagenome interpretation—a benchmark of metagenomics software. Nat. Methods 14, 1063–1071 (2017).
Yue, Q. et al. Functional operons in secondary metabolic gene clusters in Glarea lozoyensis (Fungi, Ascomycota, Leotiomycetes). mBio 6, e00703 (2015).
Friedberg, I. Automated protein function prediction—the genomic challenge. Brief. Bioinform. 7, 225–242 (2006).
Jeffery, C. J. Current successes and remaining challenges in protein function prediction. Front. Bioinform. 3, 1222182 (2023).
Abu-Ali, G. S. et al. Metatranscriptome of human faecal microbial communities in a cohort of adult men. Nat. Microbiol. 3, 356–366 (2018).
Qin, J. et al. A human gut microbial gene catalogue established by metagenomic sequencing. Nature 464, 59–65 (2010).
Li, J. et al. An integrated catalog of reference genes in the human gut microbiome. Nat. Biotechnol. 32, 834–841 (2014).
Franzosa, E. A. et al. Species-level functional profiling of metagenomes and metatranscriptomes. Nat. Methods 15, 962–968 (2018).
Beghini, F. et al. Integrating taxonomic, functional, and strain-level profiling of diverse microbial communities with bioBakery 3. eLife 10, e65088 (2021).
Huerta-Cepas, J. et al. eggNOG 5.0: a hierarchical, functionally and phylogenetically annotated orthology resource based on 5090 organisms and 2502 viruses. Nucleic Acids Res. 47, D309–D314 (2019).
Thomas, A. M. et al. Metagenomic analysis of colorectal cancer datasets identifies cross-cohort microbial diagnostic signatures and a link with choline degradation. Nat. Med. 25, 667–678 (2019).
Li, D., Liu, C. M., Luo, R., Sadakane, K. & Lam, T. W. MEGAHIT: an ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph. Bioinformatics 31, 1674–1676 (2015).
Hyatt, D. et al. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics 11, 119 (2010).
Li, W., Jaroszewski, L. & Godzik, A. Clustering of highly homologous sequences to reduce the size of large protein databases. Bioinformatics 17, 282–283 (2001).
Buchfink, B., Xie, C. & Huson, D. H. Fast and sensitive protein alignment using DIAMOND. Nat. Methods 12, 59–60 (2015).
Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359 (2012).
Liao, Y., Smyth, G. K. & Shi, W. featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30, 923–930 (2014).
Zhang, Y. et al. Metatranscriptomics for the human microbiome and microbial community functional profiling. Annu. Rev. Biomed. Data Sci. 4, 279–311 (2021).
Klingenberg, H. & Meinicke, P. How to normalize metatranscriptomic count data for differential expression analysis. PeerJ 5, e3859 (2017).
Yu, G., Wang, L. G., Han, Y. & He, Q. Y. clusterProfiler: an R package for comparing biological themes among gene clusters. OMICS 16, 284–287 (2012).