Cornell Biomedical Knowledge Hub (CBKH)

CBKH integrates data from 18 publicly available biomedical databases. The current version of CBKH contains a total of 2,932,164 entities of 10 types. Specifically, CBKH includes 22,963 anatomy entities, 18,774 disease entities, 36,522 drug entities, 87,942 gene entities, 2,065,015 molecule entities, 1,361 symptom entities, 4,101 dietary supplement ingredient (DSI) entities, 137,568 dietary supplement product (DSP) entities, 605 therapeutic class (TC) entities and 2,970 pathway entities. The relationships in CBKH (Table 3) comprise 100 relation types across 17 kinds of entity pairs: Anatomy-Gene, Drug-Disease, Drug-Drug, Drug-Gene, Disease-Disease, Disease-Gene, Disease-Symptom, Gene-Gene, DSI-Disease, DSI-Symptom, DSI-Drug, DSI-Anatomy, DSI-DSP, DSI-TC, Disease-Pathway, Drug-Pathway and Gene-Pathway. In total, CBKH contains 49,541,938 relations.
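As an illustration of the structure described above, the following minimal sketch holds typed entities and relations in memory and checks that a relation connects one of the 17 listed entity pairs. The class names, the `is_valid` helper, the relation label and the placeholder identifiers are all hypothetical, and treating each pair as ordered exactly as written is an assumption; none of this is taken from the CBKH code base.

```python
from dataclasses import dataclass

# The 17 entity pairs named in the paragraph above, as (head type, tail type).
# Assumption: the order in which each pair is written gives the direction.
ALLOWED_PAIRS = {
    ("Anatomy", "Gene"), ("Drug", "Disease"), ("Drug", "Drug"),
    ("Drug", "Gene"), ("Disease", "Disease"), ("Disease", "Gene"),
    ("Disease", "Symptom"), ("Gene", "Gene"), ("DSI", "Disease"),
    ("DSI", "Symptom"), ("DSI", "Drug"), ("DSI", "Anatomy"),
    ("DSI", "DSP"), ("DSI", "TC"), ("Disease", "Pathway"),
    ("Drug", "Pathway"), ("Gene", "Pathway"),
}


@dataclass(frozen=True)
class Entity:
    entity_type: str  # e.g. "Drug", "Gene", "DSI"
    identifier: str   # placeholder string, not a real database ID


@dataclass(frozen=True)
class Relation:
    head: Entity
    relation: str     # hypothetical relation label
    tail: Entity


def is_valid(rel: Relation) -> bool:
    """Return True if the relation links one of the allowed entity pairs."""
    return (rel.head.entity_type, rel.tail.entity_type) in ALLOWED_PAIRS


if __name__ == "__main__":
    drug = Entity("Drug", "drug:placeholder-1")
    gene = Entity("Gene", "gene:placeholder-1")
    symptom = Entity("Symptom", "symptom:placeholder-1")
    print(is_valid(Relation(drug, "targets", gene)))      # Drug-Gene is listed
    print(is_valid(Relation(symptom, "targets", gene)))   # Symptom-Gene is not
```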

Materials and Methods

Our goal was to build a biomedical knowledge graph that incorporates as much biomedical knowledge as possible. To this end, we collected and integrated 18 publicly available data sources into a single comprehensive graph. Details of the data sources used are listed in the table.

Statistics of CBKH

Entity Type	Number	Included Identifiers
Anatomy	22,963	Uberon ID, BTO ID, MeSH ID, Cell Ontology ID
Disease	18,774	Disease Ontology ID, KEGG ID, PharmGKB ID, MeSH ID, OMIM ID
Drug	36,759	DrugBank ID, KEGG ID, PharmGKB ID, MeSH ID
Gene	87,942	HGNC ID, NCBI ID, PharmGKB ID
Molecule	2,065,015	CHEMBL ID, CHEBI ID
Symptom	1,361	MeSH ID
Dietary Supplement Ingredient	4,101	iDISK ID
Dietary Supplement Product	137,568	iDISK ID
Therapeutic Class	605	iDISK ID, UMLS CUI
Pathway	2,970	Reactome ID, KEGG ID
Total Entities	2,382,309	-
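Most entity types in the table above carry identifiers from several vocabularies. One way records from different sources could be consolidated is to merge any records that share an identifier. The sketch below does this with a small union-find; it is an assumption about how such integration might work, not the procedure used to build CBKH, and the function name `merge_records` and all identifier values are made-up placeholders.

```python
from collections import defaultdict


def merge_records(records):
    """Merge records that share at least one (identifier type, value) pair.

    records: list of dicts mapping identifier type -> identifier value.
    Returns a list of dicts mapping identifier type -> sorted list of values,
    ordered by the first record that contributed to each merged entity.
    """
    parent = list(range(len(records)))

    def find(i):
        # Follow parent links to the root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(i, j):
        ri, rj = find(i), find(j)
        if ri != rj:
            # Keep the smaller index as root so output order is stable.
            parent[max(ri, rj)] = min(ri, rj)

    first_seen = {}
    for idx, rec in enumerate(records):
        for key in rec.items():  # key is (identifier type, value)
            if key in first_seen:
                union(idx, first_seen[key])
            else:
                first_seen[key] = idx

    groups = defaultdict(lambda: defaultdict(set))
    for idx, rec in enumerate(records):
        root = find(idx)
        for id_type, value in rec.items():
            groups[root][id_type].add(value)

    return [
        {id_type: sorted(values) for id_type, values in group.items()}
        for _, group in sorted(groups.items())
    ]


# Placeholder disease records from three imagined sources.
EXAMPLE_RECORDS = [
    {"Disease Ontology ID": "DOID:X1", "MeSH ID": "MESH:X9"},
    {"MeSH ID": "MESH:X9", "OMIM ID": "OMIM:X5"},
    {"KEGG ID": "KEGG:X3"},
]

if __name__ == "__main__":
    for entity in merge_records(EXAMPLE_RECORDS):
        print(entity)
```

In this example the first two records share a MeSH ID and collapse into one entity with three identifiers, while the third record stays separate.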

Relation Type	Number
Anatomy-Gene	12,825,270
Drug-Disease	2,711,848
Drug-Drug	2,684,682
Drug-Gene	1,295,088
Disease-Disease	11,072
Disease-Gene	27,541,618
Disease-Symptom	3,357
Gene-Gene	1,605,716
DSI-Symptom	2,093
DSI-Disease	5,134
DSI-Anatomy	4,334
DSP-DSI	689,297
DSI-TC	5,430
Disease-Pathway	1,942
Drug-Pathway	3,231
Gene-Pathway	153,236
Drug-Side Effect	163,206
Total Relations	49,706,554
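The relation counts above can be cross-checked with a short script. The sketch below copies the numbers from the table, confirms that they sum to the reported total, and ranks the entity pairs by their share of all relations. The variable and function names are illustrative only.

```python
# Relation counts copied from the relation table above.
RELATION_COUNTS = {
    "Anatomy-Gene": 12_825_270,
    "Drug-Disease": 2_711_848,
    "Drug-Drug": 2_684_682,
    "Drug-Gene": 1_295_088,
    "Disease-Disease": 11_072,
    "Disease-Gene": 27_541_618,
    "Disease-Symptom": 3_357,
    "Gene-Gene": 1_605_716,
    "DSI-Symptom": 2_093,
    "DSI-Disease": 5_134,
    "DSI-Anatomy": 4_334,
    "DSP-DSI": 689_297,
    "DSI-TC": 5_430,
    "Disease-Pathway": 1_942,
    "Drug-Pathway": 3_231,
    "Gene-Pathway": 153_236,
    "Drug-Side Effect": 163_206,
}
REPORTED_TOTAL = 49_706_554  # "Total Relations" row of the table


def total_relations(counts):
    """Sum the per-pair relation counts."""
    return sum(counts.values())


def share_by_pair(counts):
    """Return (pair, fraction of all relations) tuples, largest first."""
    total = total_relations(counts)
    shares = [(pair, n / total) for pair, n in counts.items()]
    return sorted(shares, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    assert total_relations(RELATION_COUNTS) == REPORTED_TOTAL
    for pair, share in share_by_pair(RELATION_COUNTS)[:5]:
        print(f"{pair:<18}{share:7.2%}")
```

Disease-Gene and Anatomy-Gene together account for roughly 81% of all relations in the table.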

Licence

CBKH data is licensed under the MIT License. CBKH integrates data from many resources, and users should also consider the license of each source (see the details in the table).

Cite

@article{su2021cbkh,
  title={CBKH: The Cornell Biomedical Knowledge Hub},
  author={Su, Chang and Hou, Yu and Guo, Winston and Chaudhry, Fayzan and Ghahramani, Gregory and Zhang, Haotan and Wang, Fei},
  journal={medRxiv},
  year={2021},
  publisher={Cold Spring Harbor Laboratory Press},
  url = {https://www.medrxiv.org/content/10.1101/2021.03.12.21253461v1}
}
