May 17 2023

5/17 – Jinfeng Zhang, Florida State University

CBQB Seminar (in-person event)

May 17, 2023

12:00 PM - 1:30 PM


COMRB 8175


909 S Wolcott Ave, Chicago, IL 60612

This is an in-person event in room COMRB 8175 on the west campus. The seminar can also be watched live.


Jinfeng Zhang, PhD
Department of Statistics
Florida State University


Constructing a Large-Scale Biomedical Knowledge Graph and Its Applications in Drug Discovery


Over the past few decades, the biomedical research community has acquired a wealth of knowledge, much of which is stored in the scientific literature as unstructured text. Converting this text into structured form is crucial for developing new methodologies and applications that can fully utilize this knowledge. Two basic problems must be addressed to achieve this goal: named entity recognition (NER) and relation extraction (RE). NER identifies the concepts or entities mentioned in a text, such as diseases, genes/proteins, and chemical compounds. RE extracts the relationships between these entities. The information extracted by NER and RE can be used to build knowledge graphs, in which nodes represent entities in the text and edges represent their relationships. This presentation will discuss our team's work on the LitCoin NLP Challenge organized by NIH, for which we were awarded first place. Using pipelines developed for the challenge, we processed all PubMed articles and created a large-scale biomedical knowledge graph. Manual verification of a sample of the extracted relations puts the accuracy of this large-scale relation extraction at an estimated 84%, on par with the level achieved through manual annotation. We also incorporated relation information from 40 public databases and relations inferred from publicly available genomics datasets. Our knowledge graph consists of over 11 million distinct entities and more than 40 million unique relations. We have developed versatile query functions and knowledge discovery tools for accessing and mining the structured data in the knowledge graph. Finally, we will discuss some drug discovery-related applications enabled by this large-scale knowledge graph.
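As a rough illustration of the data structure described in the abstract, the sketch below builds a toy knowledge graph from (head entity, relation, tail entity) triples of the kind an NER and RE pipeline might emit. It is not the speaker's pipeline or query interface: the function names `build_graph` and `neighbors`, the relation labels, and the three example triples are all assumptions chosen for illustration.

```python
from collections import defaultdict

# Hypothetical NER + RE output: (head entity, relation, tail entity).
# Entities and relation labels are illustrative, not taken from the talk.
triples = [
    ("imatinib", "inhibits", "BCR-ABL1"),
    ("BCR-ABL1", "associated_with", "chronic myeloid leukemia"),
    ("imatinib", "treats", "chronic myeloid leukemia"),
    ("imatinib", "inhibits", "BCR-ABL1"),  # same relation found in a second article
]


def build_graph(triples):
    """Map each head entity to its set of outgoing (relation, tail) edges."""
    graph = defaultdict(set)
    for head, relation, tail in triples:
        graph[head].add((relation, tail))  # the set drops duplicate relations
    return graph


def neighbors(graph, entity, relation=None):
    """Return the sorted tail entities of `entity`, optionally filtered by relation."""
    edges = graph.get(entity, set())
    return sorted(tail for rel, tail in edges if relation is None or rel == relation)


graph = build_graph(triples)

# Distinct entities are the nodes; unique relations are the edges.
entities = {name for head, _, tail in triples for name in (head, tail)}
unique_relations = set(triples)

print(len(entities), "entities,", len(unique_relations), "unique relations")
print(neighbors(graph, "imatinib", relation="inhibits"))
```

Storing edges in a set keeps each relation unique even when the same fact is extracted from several articles, which is the sense in which a graph would report "distinct entities" and "unique relations".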


UIC Biomedical Engineering

Date posted

May 10, 2023

Date updated

May 10, 2023