Nasopharyngeal Microbiota May Predict Coronavirus Disease 2019 (COVID-19) Patient Outcome Across Different Geographical Cohorts
Abstract
Coronavirus disease 2019 (COVID-19) is a complex disease that causes a variety of symptoms, ranging from mild to life-threatening. Because treatment depends on disease severity, it is critical to identify prognostic markers early to establish the proper course of treatment. Prior work has highlighted potential predictors of COVID-19 severity, such as age or the gut microbiome, but recent interest has centered on the nasopharyngeal microbiome, given the accessibility of diagnostic swabs. However, its association with disease severity remains underexplored across diverse geographical populations. Given that the link between the nasopharyngeal microbiome and COVID-19 severity has been consistently highlighted yet only observed in single cohorts, we hypothesized that patient outcomes in different populations can be predicted from nasopharyngeal microbial composition. In this study, we first analyzed 16S rRNA sequencing data from two independent cohorts in Russia and Jordan. Microbial diversity analyses revealed significant differences between COVID-19 disease outcomes within both the Russian and Jordanian populations. Additionally, indicator species analysis, core microbiome profiling, and differential abundance testing consistently identified Prevotella, Veillonella, and Granulicatella as enriched in more life-threatening cases, while Staphylococcus and Corynebacterium were more common in less severe cases. We then trained a Random Forest classification model using these taxa, which demonstrated higher predictive performance (AUC = 0.84) than standard demographic predictors. Overall, these findings suggest that nasopharyngeal microbiome profiles may serve as a biomarker for COVID-19 patient outcomes. However, the decrease in sensitivity for patients with less severe disease in the Jordanian cohort suggests that further work is needed to obtain more generalizable findings.
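For readers less familiar with the AUC metric reported above, the following is a minimal sketch of how an area under the ROC curve can be computed from a classifier's predicted scores. It is not the study's analysis code: the function name `roc_auc`, the label encoding (1 = more severe outcome, 0 = less severe outcome), and the toy labels and scores are all hypothetical and chosen only for illustration.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case,
    with ties counted as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (not from the study):
# 1 = more severe outcome, 0 = less severe outcome.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]  # assumed classifier probabilities
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.2f}")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two outcome groups, so a value such as 0.84 indicates that the model ranks a more severe case above a less severe one in most pairwise comparisons.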