SurvTRACE: Transformers for Survival Analysis with Competing Events

Last update: Oct 06, 2022

This repo provides the implementation of SurvTRACE for survival analysis. Training and evaluating a model takes only a few lines of code:

```python
from survtrace.dataset import load_data
from survtrace.model import SurvTraceSingle
from survtrace import Evaluator
from survtrace import Trainer
from survtrace import STConfig

# use METABRIC dataset
STConfig['data'] = 'metabric'
df, df_train, df_y_train, df_test, df_y_test, df_val, df_y_val = load_data(STConfig)

# initialize model
model = SurvTraceSingle(STConfig)

# execute training
trainer = Trainer(model)
trainer.fit((df_train, df_y_train), (df_val, df_y_val))

# evaluating
evaluator = Evaluator(df, df_train.index)
evaluator.eval(model, (df_test, df_y_test))

print("done!")
```
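
For readers new to survival-model evaluation, the sketch below shows Harrell's concordance index, one widely used ranking metric for survival predictions. It is an illustration only: this README does not state which metrics `Evaluator` reports, and the function name `concordance_index` and the toy data are assumptions made for this example, not part of the survtrace package.

```python
def concordance_index(durations, events, risk_scores):
    """Harrell's C-index: share of comparable pairs that the risk scores rank correctly.

    A pair (i, j) is comparable when subject i had an observed event
    and did so strictly before subject j's recorded time.
    """
    concordant = 0.0
    comparable = 0
    n = len(durations)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            if durations[i] < durations[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0  # earlier event got the higher risk
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied risk counts as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): four subjects, the third one is censored.
durations = [1.0, 2.0, 3.0, 4.0]
events = [1, 1, 0, 1]
risk_scores = [4.0, 3.0, 2.0, 1.0]  # higher risk for earlier events
print(concordance_index(durations, events, risk_scores))
```

A value of 1.0 means every comparable pair is ordered correctly, 0.5 corresponds to random ranking, and 0.0 means every pair is reversed.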

🔥 See the demo

Please refer to experiment_metabric.ipynb and experiment_support.ipynb.

🔥 How to configure the environment

Use our pre-saved conda environment:

```
conda env create --name survtrace --file=survtrace.yml
conda activate survtrace
```

Or install the dependencies from requirements.txt:

```
pip3 install -r requirements.txt
```

🔥 How to get SEER data

Go to https://seer.cancer.gov/data/ and submit a data request to SEER, following the guide there.
After completing that first step, you should have the SEER*Stat software for data access. Open it and sign in with the username and password sent by SEER.

Use SEER*Stat to open the ./data/seer.sl file.

Click the 'execute' icon to query the SEER database. This produces a CSV file.

Move the CSV file to ./data/seer_raw.csv, then run the Python script process_seer.py:
```
python process_seer.py
```
This produces the processed SEER data file, seer_processed.csv.
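
The last step can be wrapped in a small pre-flight check so that a missing export fails with a clear message. This is a minimal sketch, assuming process_seer.py sits in the repository root and is run from there, as the command above suggests. The helper name `run_seer_processing` is hypothetical and not part of the repo.

```python
import subprocess
import sys
from pathlib import Path


def run_seer_processing(repo_root="."):
    """Check that the raw SEER export is in place, then run process_seer.py.

    Assumes the layout described above: <repo_root>/data/seer_raw.csv
    and <repo_root>/process_seer.py.
    """
    root = Path(repo_root)
    raw_csv = root / "data" / "seer_raw.csv"
    script = root / "process_seer.py"

    # Fail early, naming every required file that is missing.
    missing = [str(p) for p in (raw_csv, script) if not p.is_file()]
    if missing:
        raise FileNotFoundError("missing required file(s): " + ", ".join(missing))

    # Same as `python process_seer.py`, using the current interpreter and
    # the repository root as the working directory.
    subprocess.run([sys.executable, script.name], cwd=root, check=True)
```

Calling `run_seer_processing()` from the repository root is then equivalent to the command above, with the file check added.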

📝 Functions

single event survival analysis
competing events survival analysis
multi-task learning
automatic hyperparameter grid-search
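
As a rough illustration of the last item, a hyperparameter grid search tries every combination of candidate values and keeps the best-scoring one. The sketch below is generic and does not use the repo's own search code: the function `grid_search`, the hyperparameter names, and the toy scoring function are all assumptions made for this example.

```python
from itertools import product


def grid_search(param_grid, score_fn):
    """Evaluate every combination in param_grid and return the best one.

    param_grid maps a hyperparameter name to a list of candidate values.
    score_fn takes a dict of chosen values and returns a score (higher is better).
    """
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical hyperparameter names, chosen only for illustration.
grid = {"learning_rate": [1e-4, 1e-3], "hidden_size": [16, 32]}


def toy_score(params):
    # Stand-in for "train a model with params and return its validation score".
    return -abs(params["learning_rate"] - 1e-3) - abs(params["hidden_size"] - 32) / 100


best_params, best_score = grid_search(grid, toy_score)
print(best_params)
```

In practice `score_fn` would train a model with the given settings and return a validation metric. The cost grows with the product of the list lengths, so grids are usually kept small.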

😄 If you find this result interesting, please consider citing this paper:

@article{wang2021survtrace,
      title={Surv{TRACE}: Transformers for Survival Analysis with Competing Events}, 
      author={Zifeng Wang and Jimeng Sun},
      year={2021},
      eprint={2110.00855},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}

Owner

Zifeng
