PhD student receives prestigious NIH grant for gene expression algorithms

UIC Richard and Loan Hill Department of Biomedical Engineering PhD and College of Medicine student Ali Farhat recently received the prestigious National Institutes of Health’s Ruth L. Kirschstein Individual Predoctoral National Research Service Award F30 for his research on single-cell RNA sequencing data.

Farhat’s project is titled “Developing a Novel Algorithm to Infer Cellular Trajectories from Single-Cell RNA Sequencing Data in Hematopoiesis.” The F30 grant is designed specifically for MD/PhD students.

Collaborating with Jie Liang, Richard and Loan Hill Professor and UIC Distinguished Professor, and Constantinos Chronis, assistant professor in the UIC College of Medicine Department of Biochemistry and Molecular Genetics, Farhat is working to develop two bioinformatics algorithms. The first algorithm takes gene expression data from single-cell RNA sequencing and defines cell states, or cell clusters, based on that data in an unsupervised manner. The second algorithm defines the stochastic, or probabilistic, pathways that these cell states follow along their trajectory.

“For example, existing algorithms use a lot of parameters, which need to be tuned,” Farhat said. “Other researchers do a lot of trial and error to optimize cell states to match their prior biological knowledge, and this isn’t always the most optimal way, so we proposed using advanced mathematics called applied algebraic topology where we look at the inherent structure of the data in high dimensional space and redefine the cell states both the discrete, continuous, and transition cell states inherent in the data.”

While Liang brings the bioinformatics expertise, Chronis helps with the experimental biology. More specifically, Chronis is focused on understanding the differentiation of the hematopoietic stem cell lineage all the way to erythroid progenitors, that is, how red blood cells arise from stem cells.

Liang previously developed the accurate chemical master equation (ACME) method to model the stochastic dynamics of biochemical reaction networks, including genetic networks, and to reveal their probabilistic pathways. Solving the chemical master equation is a known challenge in the field because of its infinite dimensionality. While existing methods simulate or approximate the solution, they’re not as efficient or accurate as Liang’s ACME method.
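To give a sense of why the chemical master equation is hard to solve, the sketch below integrates it for the simplest possible case: one gene product that is made at a constant rate and degraded in proportion to its copy number. This is not the ACME method. It is a naive illustration that simply cuts the infinite list of possible copy numbers off at a fixed maximum. The function name, the rates, and the cutoff of 30 copies are all assumptions chosen for the example.

```python
import math


def birth_death_cme(k_prod, k_deg, max_copies, t_end, dt=0.01):
    """Integrate a truncated chemical master equation with forward Euler.

    State n is the copy number of one gene product. The true state space
    is infinite (n = 0, 1, 2, ...); here it is cut off at max_copies,
    which is an illustrative assumption, not part of ACME.
    """
    n_states = max_copies + 1
    p = [0.0] * n_states
    p[0] = 1.0  # start with zero molecules, with certainty
    steps = int(round(t_end / dt))
    for _ in range(steps):
        dp = [0.0] * n_states
        for n in range(n_states):
            birth = k_prod if n < max_copies else 0.0  # no births past the cutoff
            death = k_deg * n
            dp[n] -= (birth + death) * p[n]   # probability leaving state n
            if n < max_copies:
                dp[n + 1] += birth * p[n]     # n -> n + 1 (production)
            if n > 0:
                dp[n - 1] += death * p[n]     # n -> n - 1 (degradation)
        p = [p_n + dt * dp_n for p_n, dp_n in zip(p, dp)]
    return p


def poisson_pmf(n, mean):
    """Known steady-state answer for this simple model."""
    return math.exp(-mean) * mean ** n / math.factorial(n)


probs = birth_death_cme(k_prod=5.0, k_deg=1.0, max_copies=30, t_end=20.0)
mean_copies = sum(n * p_n for n, p_n in enumerate(probs))
```

For this one-species model the long-time answer is known to be a Poisson distribution with mean `k_prod / k_deg`, so the result can be checked against `poisson_pmf`. Real genetic networks involve many interacting species, and the number of states to track grows rapidly with each one, which is where a fixed cutoff like this stops being practical.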

Current experimental technologies cannot capture 100% of the genes expressed, a limitation known as technical noise. However, Farhat, Liang, and Chronis hope to computationally model the gene developmental process with ACME by merging the technical noise with the intrinsic biological noise inherent in genetic networks. They hope to gain a more accurate representation of development by specifically looking at the single-cell RNA sequencing (scRNA-seq) data obtained from the hematopoietic stem cell lineage differentiation process.
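One common way to picture how technical noise sits on top of biological noise is to assume that each molecule in a cell is captured independently with some fixed probability. The sketch below applies that assumption to a hypothetical true distribution of copy numbers. The binomial capture model, the capture rate of 0.2, and the Poisson distribution with mean 5 are illustrative assumptions and are not taken from the team's model.

```python
import math


def observe_with_dropout(true_probs, capture_rate):
    """Distribution of observed counts under an assumed binomial capture model.

    true_probs[n] is the probability that a cell truly holds n molecules.
    Each molecule is detected independently with probability capture_rate.
    """
    observed = [0.0] * len(true_probs)
    for n, p_n in enumerate(true_probs):
        for m in range(n + 1):
            # Probability of detecting m of the n molecules present
            p_capture = (math.comb(n, m)
                         * capture_rate ** m
                         * (1.0 - capture_rate) ** (n - m))
            observed[m] += p_n * p_capture
    return observed


def poisson_pmf(n, mean):
    return math.exp(-mean) * mean ** n / math.factorial(n)


# Hypothetical biological variability: Poisson copy numbers with mean 5
true_probs = [poisson_pmf(n, 5.0) for n in range(41)]
observed_probs = observe_with_dropout(true_probs, capture_rate=0.2)

true_zero = true_probs[0]          # cells that truly express nothing
observed_zero = observed_probs[0]  # cells that appear to express nothing
observed_mean = sum(m * p_m for m, p_m in enumerate(observed_probs))
```

Under these assumptions the observed mean shrinks to the capture rate times the true mean, and far more cells appear to have zero expression than truly do. Separating that measurement effect from genuine cell-to-cell variability is the kind of problem a combined model has to address.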

“These methods are very hard to apply, and not many people can do this work,” Farhat said. “You need to have a very good grasp of the technological limits or computer limits and the mathematical limits, so I think that puts our team at an advantage when it comes to understanding the true biological processes going on.”

While these methods can be applied to any biological process, Farhat, a future oncologist, added that he’s interested in applying this research to cancer models.

“Cancer resistance and development is also stochastic, so it’s very probabilistic and very random,” Farhat said. “Everything is stochastic, so the second aim that I had proposed, the second algorithm, can really help distinguish biological variability from technical noise. The power comes from our ability to combine these different problems into one model and obtain a detailed understanding of the genetic networks activated or repressed in development.”