A Parameter-free Deep Embedded Clustering Method for Single-cell RNA-seq Data

Last update: May 08, 2022

Overview

Clustering analysis is widely used on single-cell RNA-sequencing (scRNA-seq) data to discover cell heterogeneity and cell states. While several clustering methods have been developed for scRNA-seq analysis, their results rely heavily on the number of clusters being supplied as prior information. However, the exact number of cell types is rarely known in advance, and estimates based on experience are not always accurate. Here, we have developed ADClust, an automatic deep embedding clustering method for scRNA-seq data that simultaneously and accurately estimates the number of clusters and clusters the cells. Specifically, ADClust first obtains low-dimensional representations through a pre-trained autoencoder and uses them to group cells into micro-clusters. The micro-clusters are then compared pairwise with the Dip-test, a statistical test for unimodality, and similar micro-clusters are merged through a purpose-designed clustering loss function. This process continues until convergence. Tested on eleven real scRNA-seq datasets, ADClust outperformed existing methods in both clustering performance and the ability to estimate the number of clusters. More importantly, our model provides high speed and scalability on large datasets.
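The merge step described above can be illustrated with a small, dependency-free sketch. It is not the ADClust implementation: Sarle's bimodality coefficient stands in for the Dip-test, greedy relabeling stands in for the clustering loss, and the raw 2-D points stand in for autoencoder embeddings. The function names and the 5/9 threshold are assumptions chosen for illustration.

```python
import math
import random


def bimodality_coefficient(values):
    """Sarle's bimodality coefficient: (skewness^2 + 1) / kurtosis.

    Used here as a stand-in for the Dip-test (assumption, not ADClust's test).
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    if m2 == 0.0:
        return 0.0  # all values identical: treat as unimodal
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    return (skewness ** 2 + 1.0) / kurtosis


def centroid(points):
    dim = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dim)]


def looks_unimodal(a, b, threshold=5.0 / 9.0):
    """Project both clusters onto the line through their centroids and
    check whether the pooled 1-D sample looks unimodal."""
    ca, cb = centroid(a), centroid(b)
    direction = [y - x for x, y in zip(ca, cb)]
    norm = math.sqrt(sum(d * d for d in direction))
    if norm == 0.0:
        return True  # identical centroids: nothing separates the clusters
    projected = [
        sum(p[d] * direction[d] for d in range(len(direction))) / norm
        for p in a + b
    ]
    return bimodality_coefficient(projected) <= threshold


def merge_micro_clusters(points, labels):
    """Greedily merge pairs of micro-clusters until no pair looks unimodal."""
    labels = list(labels)
    merged = True
    while merged:
        merged = False
        ids = sorted(set(labels))
        for i, ci in enumerate(ids):
            for cj in ids[i + 1:]:
                a = [p for p, l in zip(points, labels) if l == ci]
                b = [p for p, l in zip(points, labels) if l == cj]
                if looks_unimodal(a, b):
                    labels = [ci if l == cj else l for l in labels]
                    merged = True
                    break
            if merged:
                break
    return labels


# Toy data: two Gaussian blobs, with the first one over-segmented at x = 0.
rng = random.Random(0)
blob_a = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(300)]
blob_b = [(rng.gauss(10, 1), rng.gauss(0, 1)) for _ in range(300)]
points = blob_a + blob_b
micro = [0 if p[0] < 0 else 1 for p in blob_a] + [2] * len(blob_b)
merged = merge_micro_clusters(points, micro)
print(len(set(micro)), "micro-clusters ->", len(set(merged)), "clusters")
```

In this toy setup the two halves of the first blob should be merged while the distant second blob stays separate, so the number of clusters is found without being supplied in advance.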

Requirements

Please ensure that all the libraries below are successfully installed:

torch 1.7.1
numpy 1.19.2
scipy 1.7.3
scanpy 1.8.1
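A quick way to compare the installed versions against the list above is sketched below. It uses only the standard library (importlib.metadata, available from Python 3.8); the helper name check_versions and the PINNED dictionary are illustrative and not part of ADClust.

```python
from importlib import metadata

# Versions listed in the Requirements section.
PINNED = {
    "torch": "1.7.1",
    "numpy": "1.19.2",
    "scipy": "1.7.3",
    "scanpy": "1.8.1",
}


def check_versions(pinned):
    """Return {package: (wanted, found)}; found is None if not installed."""
    report = {}
    for name, wanted in pinned.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            found = None
        report[name] = (wanted, found)
    return report


if __name__ == "__main__":
    for name, (wanted, found) in check_versions(PINNED).items():
        status = "ok" if found == wanted else "differs"
        print(f"{name}: wanted {wanted}, found {found} ({status})")
```

A version that differs from the pinned one is not necessarily a problem; the report only shows where the environment departs from the tested configuration.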

Installation

Compile the dip.c file with a C compiler, then add the directory that contains the generated library dip.so to LD_LIBRARY_PATH (the variable takes directories, not file paths). Run the following commands from the directory that contains dip.c:


gcc -fPIC -shared -o dip.so dip.c

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$(pwd)
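
Before running ADClust, it can help to confirm that the dynamic loader finds the compiled library. The sketch below tries to open it with ctypes; whether ADClust itself loads dip.so this way is an assumption, and the helper name can_load is illustrative.

```python
import ctypes


def can_load(library_name):
    """Return True if the dynamic loader can open the shared library."""
    try:
        ctypes.CDLL(library_name)
    except OSError:
        return False
    return True


if __name__ == "__main__":
    # A bare name (no slash) is resolved through the loader's search path,
    # which on Linux includes the directories in LD_LIBRARY_PATH.
    print("dip.so loadable:", can_load("dip.so"))
```

If this prints False, check that dip.so was built and that its directory is on LD_LIBRARY_PATH in the same shell session.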

Run ADClust

Run on the normalized example data.


python ADClust.py --name Baron_human_normalized

Output

The predicted cluster labels for the cells are stored in output/dataname_pred.csv.

scRNA-seq Datasets

All datasets can be downloaded here.

All datasets will be downloaded to ADClust/data/.

Citation

Please cite our paper:


@article{zengys,
  title={A Parameter-free Deep Embedded Clustering Method for Single-cell RNA-seq Data},
  author={Yuansong Zeng and Zhuoyi Wei and Fengqi Zhong and Zixiang Pan and Yutong Lu and Yuedong Yang},
  journal={bioRxiv},
  year={2021},
  publisher={Cold Spring Harbor Laboratory}
}

Owner

AI-Biomed @NSCC-gz
