Researchers from Huazhong University of Science and Technology have developed a new AI-based method called “Meta-Sorter” to improve biome labeling for microbiome samples. The study, published in the journal Environmental Science and Ecotechnology, showcases how Meta-Sorter leverages neural networks and transfer learning techniques to tackle the challenge of incomplete information in the MGnify database.

The Meta-Sorter approach consists of two key steps. First, a neural network model is constructed using over 118,000 microbial samples from 134 biomes and their corresponding biome ontology. This model achieves an average AUROC (area under the receiver operating characteristic curve) of 0.896 in classifying samples with detailed biome information.
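To make the "average AUROC" figure concrete, here is a minimal sketch of how such a score can be computed: one AUROC per biome, treating that biome as the positive class, followed by an unweighted mean. The function names and the toy data are illustrative assumptions, and the article does not say which averaging scheme the authors used, so this should not be read as their evaluation code.

```python
def auroc(labels, scores):
    """AUROC for one binary task: the fraction of (positive, negative)
    pairs in which the positive sample gets the higher score (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_auroc(label_matrix, score_matrix):
    """Unweighted (macro) mean of per-biome AUROCs.
    Rows are samples, columns are biomes. Macro averaging is an assumption."""
    n_biomes = len(label_matrix[0])
    per_biome = []
    for j in range(n_biomes):
        labels = [row[j] for row in label_matrix]
        scores = [row[j] for row in score_matrix]
        per_biome.append(auroc(labels, scores))
    return sum(per_biome) / n_biomes


# Hypothetical toy data: 4 samples, 2 biomes (one-hot labels, model scores).
toy_labels = [[1, 0], [1, 0], [0, 1], [0, 1]]
toy_scores = [[0.9, 0.2], [0.4, 0.6], [0.5, 0.7], [0.1, 0.8]]

print(average_auroc(toy_labels, toy_scores))  # 0.875
```

A score of 1.0 means every sample from a biome is ranked above every sample outside it, while 0.5 corresponds to random ranking.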

The second step involves using transfer learning with newly introduced samples that have different characteristics. Researchers incorporated 34,209 new samples from 35 biomes, including eight novel ones, into the transfer neural network model. This resulted in an average AUROC of 0.989 in predicting biome information for newly introduced samples labeled as “Mixed biome.”
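The general idea of transfer learning can be sketched as follows: the hidden layer of an already trained network is kept frozen, and only a fresh output layer is fitted on the new samples. This is one common form of transfer learning and is assumed here purely for illustration. The article does not describe which layers Meta-Sorter retrains, and the weights, feature vectors, and function names below are all hypothetical.

```python
import math


def hidden_features(x, frozen_weights):
    """Frozen hidden layer from the base model: one ReLU unit per weight row."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in frozen_weights]


def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def head_logits(h, head):
    # Each head row holds one weight per hidden feature plus a trailing bias.
    return [sum(w * v for w, v in zip(row[:-1], h)) + row[-1] for row in head]


def train_new_head(features, labels, n_classes, epochs=200, lr=0.5):
    """Fit a new softmax output layer by gradient descent on cross-entropy.
    Only the head is updated; the hidden layer stays untouched."""
    dim = len(features[0])
    head = [[0.0] * (dim + 1) for _ in range(n_classes)]
    for _ in range(epochs):
        for h, y in zip(features, labels):
            probs = softmax(head_logits(h, head))
            for c in range(n_classes):
                grad = probs[c] - (1.0 if c == y else 0.0)
                for k in range(dim):
                    head[c][k] -= lr * grad * h[k]
                head[c][-1] -= lr * grad
    return head


def predict(x, frozen_weights, head):
    logits = head_logits(hidden_features(x, frozen_weights), head)
    return max(range(len(logits)), key=logits.__getitem__)


# Hypothetical "pretrained" hidden weights and toy samples for two new biomes.
FROZEN_WEIGHTS = [[1.0, -1.0], [-1.0, 1.0]]
new_samples = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
new_labels = [0, 0, 1, 1]

new_features = [hidden_features(x, FROZEN_WEIGHTS) for x in new_samples]
new_head = train_new_head(new_features, new_labels, n_classes=2)

print(predict([0.7, 0.3], FROZEN_WEIGHTS, new_head))
print(predict([0.3, 0.7], FROZEN_WEIGHTS, new_head))
```

Because the hidden layer is reused, only the small output layer has to be learned from the new data, which is why this kind of approach can cope with biomes that were absent from the original training set.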

Meta-Sorter achieves an overall accuracy of 96.7% in classifying samples with incomplete biome annotations. This resolves issues of cascading errors and opens up new possibilities for knowledge discovery in environmental research and other scientific disciplines.

Additionally, Meta-Sorter refines the biome annotations for under-annotated and mis-annotated samples. Its automatic assignment of precise classifications to ambiguous samples provides valuable insights beyond the original literature. The differentiation of samples into specific environmental categories enhances the reliability and validity of research conclusions.

As standardized protocols for data submission and metadata incorporation continue to develop, Meta-Sorter is expected to improve the analysis and interpretation of microbial community samples, enabling more accurate and insightful discoveries in microbiome research and beyond.

Source:

Nan Wang et al., Refining biome labeling for large-scale microbial community samples: Leveraging neural networks and transfer learning, Environmental Science and Ecotechnology (2023). DOI: 10.1016/j.ese.2023.100304