AI Framework Developed by Mathematicians to Identify Emerging COVID-19 Variants


Researchers at the Universities of Manchester and Oxford have devised an AI framework to detect and monitor new and concerning COVID-19 variants, a development that could also help combat other infections in the future. The framework combines dimension reduction techniques with a novel clustering algorithm named CLASSIX, developed by mathematicians at The University of Manchester. By analyzing vast amounts of data, it can quickly identify clusters of viral genomes that may pose a future risk.

The findings of this study, published in the journal PNAS, could enhance traditional methods of tracking viral evolution, such as phylogenetic analysis, which typically involve labor-intensive manual processes.

Roberto Cahuantzi, a researcher at The University of Manchester and the lead author of the study, emphasized the significance of this AI framework in the context of the ongoing COVID-19 pandemic. He highlighted the continuous evolution of the virus, leading to the emergence of multiple variants with increased transmissibility, immune evasion capabilities, and severity of illness.

Efforts are now being accelerated to identify and address these worrisome variants, such as alpha, delta, and omicron, at their early stages of emergence. Swift and efficient detection of these variants could enable a proactive response, including targeted vaccine development, potentially preventing their establishment.

Given the high mutation rate and rapid evolution of SARS-CoV-2, the virus that causes COVID-19, identifying future problematic strains demands significant effort. The current practice of tracing the evolution of all sequenced genomes requires substantial computational and human resources.

The newly developed method automates these tasks, processing 5.7 million high-coverage sequences within one to two days on a standard laptop. By reducing resource requirements, it makes the identification of concerning pathogen strains accessible to more researchers.

Thomas House, a Professor of Mathematical Sciences at The University of Manchester, emphasized the necessity of enhancing data analysis techniques to effectively manage the massive volume of genetic data generated during the pandemic. The proposed method aims to complement human expertise, enabling faster analysis and opening up opportunities for further scientific advancements.

The AI approach involves breaking down COVID-19 genetic sequences into smaller ‘words’ represented as numbers, then using machine learning to group similar sequences based on their patterns. Stefan Güttel, a Professor of Applied Mathematics at the University of Manchester, highlighted the efficiency and interpretability of the CLASSIX clustering algorithm, noting that it is less computationally intensive than conventional methods.
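
To make the idea of genetic ‘words’ concrete, here is a minimal Python sketch. It rests on several assumptions that are not taken from the study: the ‘words’ are treated as overlapping three-letter fragments (k-mers), each sequence is represented by its k-mer frequencies, and the grouping step is a simple greedy distance rule standing in for CLASSIX, which it does not reproduce. The dimension reduction step is omitted, and the function names and toy sequences are hypothetical.

```python
from collections import Counter
from itertools import product
import math


def kmer_vector(seq, k=3):
    """Turn a sequence into numbers: the relative frequency of each k-letter 'word'."""
    # Fixed vocabulary of all 4**k words over A, C, G, T (an illustrative choice).
    vocab = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[w] for w in vocab) or 1  # avoid division by zero
    return [counts[w] / total for w in vocab]


def greedy_cluster(vectors, radius):
    """Simple stand-in for a clustering step (NOT the CLASSIX algorithm).

    Each vector joins the first group whose starting point lies within
    `radius` (Euclidean distance); otherwise it starts a new group.
    """
    starts, labels = [], []
    for v in vectors:
        for idx, s in enumerate(starts):
            if math.dist(v, s) <= radius:
                labels.append(idx)
                break
        else:
            starts.append(v)
            labels.append(len(starts) - 1)
    return labels


# Hypothetical toy sequences: two pairs that differ only in their last letter.
sequences = ["ACGTACGTACGT", "ACGTACGTACGA", "TTTTGGGGTTTT", "TTTTGGGGTTTA"]
vectors = [kmer_vector(s) for s in sequences]
labels = greedy_cluster(vectors, radius=0.3)
print(labels)  # near-identical sequences share a group label
```

In this toy setting the two near-identical pairs fall into separate groups. A real pipeline would need a far more scalable clustering method and a principled choice of word length and radius.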

Roberto Cahuantzi underscored the potential of machine learning methods as early alert tools for identifying emerging major variants without the need for complex phylogenetic analyses. While phylogenetics remains essential for understanding viral ancestry, machine learning methods offer scalability and cost-effectiveness, accommodating significantly more
