What if an algorithm could predict disease outcomes? A team of researchers at Yale, including MD/Ph.D. student Manik Kuchroo and Associate Professor of Genetics and Computer Science Smita Krishnaswamy, devised a new visualization tool called Multiscale PHATE. The tool accurately predicted mortality outcomes from data on fifty-four million blood cells collected from 163 patients hospitalized with severe COVID-19.
Multiscale PHATE creates abstracted cellular features from datasets to produce visualizations at different scales. A researcher can analyze so many features and cells that dimensionality reduction, the process of reducing the number of input variables in a dataset, becomes essential. “Dimensionality reduction condenses the variability in the overall data set down to two to three dimensions, essentially creating a scatter plot of the data,” said Kuchroo. The position of each cell in the scatter plot shows how similar it is to the others: similar cells sit close together. This allows researchers to identify which immune cell types and subsets are present in a sample, and it can help predict outcomes.
Other tools, such as t-distributed stochastic neighbor embedding (t-SNE), uniform manifold approximation and projection (UMAP), and principal component analysis (PCA), also perform dimensionality reduction. According to the researchers, however, Multiscale PHATE is more effective at learning cell relationships in massive datasets. “This computational tool is a better way to approach your analysis if you have massive data sets because it is scalable and can yield interesting insights across scales,” said Kuchroo.
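To make the idea of dimensionality reduction concrete, here is a minimal sketch of PCA, the simplest of the established methods named above. It is not Multiscale PHATE, and it is not the researchers' code. The six "cells" and their four marker intensities are invented for illustration, and the function names are hypothetical. The sketch uses only the Python standard library and finds the two main axes of variation by power iteration, an approximate method that is adequate for a toy example.

```python
import random

# Hypothetical toy data: 6 cells x 4 marker intensities (invented values).
# The first three cells are high on markers 1-2, the last three on markers 3-4.
cells = [
    [5.0, 4.8, 1.0, 0.9],
    [5.2, 5.1, 1.1, 1.0],
    [4.9, 5.0, 0.8, 1.2],
    [1.0, 1.1, 5.0, 4.9],
    [0.9, 1.2, 5.1, 5.2],
    [1.1, 0.8, 4.8, 5.0],
]

def center(rows):
    """Subtract each column's mean so every marker is centered at zero."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]

def covariance(centered):
    """Sample covariance matrix (d x d) of already-centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def dominant_axis(cov, iters=500, seed=0):
    """Power iteration: approximate the top eigenvector and its eigenvalue."""
    d = len(cov)
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    v = [rng.gauss(0, 1) for _ in range(d)]
    norm = sum(x * x for x in v) ** 0.5
    v = [x / norm for x in v]
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0:  # no variance left in any direction
            break
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d))
                     for i in range(d))
    return v, eigenvalue

def pca_2d(rows):
    """Project each row onto its two leading principal axes."""
    centered = center(rows)
    cov = covariance(centered)
    d = len(cov)
    axes = []
    for _ in range(2):
        v, lam = dominant_axis(cov)
        axes.append(v)
        # Deflation: remove the axis just found so the next pass finds another.
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(d)]
               for i in range(d)]
    return [[sum(r[j] * a[j] for j in range(d)) for a in axes]
            for r in centered]

# Each cell becomes one (x, y) point that could be drawn on a scatter plot.
coords = pca_2d(cells)
for point in coords:
    print([round(c, 2) for c in point])
```

In this toy example the two groups of cells land on opposite sides of the first axis, which is the sense in which position on the plot reflects similarity. PCA only captures straight-line structure in the data; t-SNE, UMAP, and PHATE are nonlinear methods and work differently.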
Multiscale PHATE was showcased on blood samples from patients infected with the novel coronavirus, SARS-CoV-2, measured across different flow cytometry panels. Flow cytometry measures the proteins found in individual cells, and each panel is a set of protein markers measured together.
When the researchers compared the cellular responses to SARS-CoV-2 infection in patients who died with those in patients who survived hospitalization, they discovered that while immune cells such as T cells are broadly protective against infection, some subsets are not. One T cell type enriched in patients who died was a subset of Th17 cells called IFNγ+ GranzymeB+ Th17 cells, which suggests that this subset could be pathogenic, or disease-inducing.
Multiscale PHATE creates cellular groupings, summarizing which cell types and subsets are elevated in patients with adverse outcomes. This enabled the researchers to predict patients' COVID-19 outcomes and to discover that older males were more vulnerable to poor outcomes. While these results help explain the pathogenesis of COVID-19, they also demonstrate the effectiveness of Multiscale PHATE. “This was to show that the features and cellular subgroups we picked out were meaningful,” said Krishnaswamy.
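The step from cellular groupings to patient-level comparisons can be sketched in a few lines. The sketch below assumes each cell has already been assigned to a group, computes what fraction of each patient's cells falls into each group, and then averages one group's fraction by outcome. The patient IDs, group labels, and outcomes are all invented, the function names are hypothetical, and this simple comparison of averages stands in for the study's actual predictive analysis, which is not described here.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: one (patient_id, group_label) pair per measured cell.
cells = [
    ("p1", "A"), ("p1", "A"), ("p1", "B"), ("p1", "C"),
    ("p2", "A"), ("p2", "B"), ("p2", "B"), ("p2", "C"),
    ("p3", "B"), ("p3", "B"), ("p3", "C"), ("p3", "C"),
]

# Hypothetical outcomes, invented for illustration only.
outcome = {"p1": "died", "p2": "survived", "p3": "survived"}

def group_fractions(cells):
    """For each patient, the fraction of their cells in each group."""
    counts = defaultdict(Counter)
    for patient, group in cells:
        counts[patient][group] += 1
    fractions = {}
    for patient, per_group in counts.items():
        total = sum(per_group.values())
        fractions[patient] = {g: n / total for g, n in per_group.items()}
    return fractions

def mean_fraction_by_outcome(fractions, outcome, group):
    """Average one group's fraction across patients sharing an outcome."""
    by_outcome = defaultdict(list)
    for patient, per_group in fractions.items():
        # A patient with no cells in this group contributes 0.0.
        by_outcome[outcome[patient]].append(per_group.get(group, 0.0))
    return {label: sum(v) / len(v) for label, v in by_outcome.items()}

fractions = group_fractions(cells)
group_a = mean_fraction_by_outcome(fractions, outcome, "A")
print(group_a)
```

A group whose average fraction differs sharply between outcomes is a candidate feature for prediction. With real data, any such difference would need proper statistical testing before drawing conclusions.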
With an increasing number of datasets integrated from patient samples, Multiscale PHATE could become a crucial method for uncovering the biological meaning of datasets from a myriad of diseases. Future studies may benefit from using Multiscale PHATE to analyze datasets and predict disease outcomes, providing pivotal information to aid patients with these diseases.