A fast hierarchical dimensionality reduction algorithm.

Last update: Dec 12, 2022

h-NNE: Hierarchical Nearest Neighbor Embedding

h-NNE is a general-purpose dimensionality reduction algorithm, like t-SNE and UMAP. It stands out for its speed, its simplicity, and the hierarchy of clusterings it produces as part of the projection process. The algorithm is inspired by the FINCH clustering algorithm. For more information on the structure of the algorithm, see our paper on arXiv:

M. Saquib Sarfraz*, Marios Koulakis*, Constantin Seibold, Rainer Stiefelhagen. Hierarchical Nearest Neighbor Graph Embedding for Efficient Dimensionality Reduction. CVPR 2022.

More details are available in the project documentation.
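To give a feel for the FINCH-style idea behind the hierarchy of clusterings, here is a minimal pure-Python sketch. It is not the h-NNE or FINCH implementation: it assumes Euclidean distance, uses a brute-force neighbor search, and keeps merging until a single cluster remains. The function names (`first_neighbors`, `first_neighbor_partition`, `partition_hierarchy`) are made up for this illustration. Each point is linked to its first nearest neighbor, the connected components of that graph form one level of clusters, and the same step is repeated on the cluster centroids.

```python
import math


def first_neighbors(points):
    """Return, for every point, the index of its nearest other point (brute force)."""
    nearest = []
    for i, p in enumerate(points):
        candidates = (j for j in range(len(points)) if j != i)
        nearest.append(min(candidates, key=lambda j: math.dist(p, points[j])))
    return nearest


def first_neighbor_partition(points):
    """Label points by the connected components of the first-neighbor graph."""
    parent = list(range(len(points)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Linking every point to its first neighbor also connects points
    # that share the same first neighbor.
    for i, j in enumerate(first_neighbors(points)):
        parent[find(i)] = find(j)

    relabel = {}
    return [relabel.setdefault(find(i), len(relabel)) for i in range(len(points))]


def centroid(group):
    """Coordinate-wise mean of a list of points."""
    return tuple(sum(coords) / len(group) for coords in zip(*group))


def partition_hierarchy(points):
    """Repeat the partition step on cluster centroids; return labels per level."""
    levels = []
    labels = list(range(len(points)))  # every point starts as its own cluster
    current = list(points)
    while len(current) > 1:
        step = first_neighbor_partition(current)
        labels = [step[label] for label in labels]
        levels.append(labels)
        n_clusters = max(labels) + 1
        # Centroids are computed from the original points of each cluster.
        current = [
            centroid([p for p, label in zip(points, labels) if label == k])
            for k in range(n_clusters)
        ]
    return levels


points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
for level, labels in enumerate(partition_hierarchy(points), start=1):
    print(f"level {level}: {labels}")
```

Every component of the first-neighbor graph contains at least two points, so the number of clusters at least halves at each level and the loop ends after a small number of steps. The paper describes how h-NNE actually builds and uses its hierarchy.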

Installation

The project is available on PyPI. To install it, run:

pip install hnne

How to use h-NNE

The HNNE class implements the common methods of the sklearn interface.

Simple projection example

import numpy as np
from hnne import HNNE

data = np.random.random(size=(1000, 256))

hnne = HNNE(dim=2)
projection = hnne.fit_transform(data)

Projecting on new points

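# Fit on the original data first.
hnne = HNNE()
projection = hnne.fit_transform(data)

# new_data must have the same number of features as data (256 in the example above).
new_data_projection = hnne.transform(new_data)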
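The fit-then-transform pattern above can be wrapped in a small hold-out helper. This is only a sketch: `split_rows`, `project_with_holdout` and the `KeepFirstTwoColumns` stand-in are hypothetical names introduced here, not part of hnne. The helper assumes only that the reducer exposes `fit_transform` and `transform`, as the HNNE class is described to do. The stand-in reducer lets the example run without hnne installed.

```python
import random


def split_rows(rows, holdout_fraction=0.2, seed=0):
    """Shuffle row indices and split rows into a fit set and a held-out set."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    n_holdout = max(1, int(len(rows) * holdout_fraction))
    holdout = [rows[i] for i in indices[:n_holdout]]
    fit = [rows[i] for i in indices[n_holdout:]]
    return fit, holdout


def project_with_holdout(reducer, rows, holdout_fraction=0.2, seed=0):
    """Fit the reducer on one part of the data, then project the unseen part."""
    fit_rows, new_rows = split_rows(rows, holdout_fraction, seed)
    fit_projection = reducer.fit_transform(fit_rows)
    new_projection = reducer.transform(new_rows)
    return fit_projection, new_projection


class KeepFirstTwoColumns:
    """Hypothetical stand-in reducer: it simply keeps the first two features."""

    def fit_transform(self, rows):
        return self.transform(rows)

    def transform(self, rows):
        return [row[:2] for row in rows]


rows = [[float(i), float(i + 1), float(i + 2)] for i in range(10)]
fit_projection, new_projection = project_with_holdout(KeepFirstTwoColumns(), rows)
print(len(fit_projection), len(new_projection))
```

To try this with h-NNE, pass an `HNNE()` instance in place of the stand-in. Turning the split lists into NumPy arrays before calling the reducer is probably the safer choice, since the examples above use arrays.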

Demos

The following demo notebooks are available:

Citation

If you make use of this project in your work, we would appreciate it if you cited the h-NNE paper:

@inproceedings{hnne,
  title={Hierarchical Nearest Neighbor Graph Embedding for Efficient Dimensionality Reduction},
  author={M. Saquib Sarfraz and Marios Koulakis and Constantin Seibold and Rainer Stiefelhagen},
  booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year = {2022}
}

If you make use of the clustering properties of the algorithm, please also cite:

@inproceedings{finch,
  author    = {M. Saquib Sarfraz and Vivek Sharma and Rainer Stiefelhagen},
  title     = {Efficient Parameter-free Clustering Using First Neighbor Relations},
  booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  pages     = {8934--8943},
  year      = {2019}
}

Owner

Marios Koulakis
