VectorAscent: Generate vector graphics from a textual description

Example

"a painting of an evergreen tree"

python text_to_painting.py --prompt "a painting of an evergreen tree" --num_iter 2500 --use_blob --subdir vit_rn50_useblob

We rely on CLIP for its aligned text and image encoders, and on diffvg, a differentiable rasterizer for vector graphics. Differentiable rendering lets us produce raster images from vector paths, but the renderer alone knows nothing about textual descriptions. CLIP fills that gap by scoring the similarity between a raster image and a text caption. Using gradient ascent, we optimize a vector graphic so that its rasterization has high similarity with a user-provided caption, backpropagating through CLIP and diffvg to the vector graphics parameters. This project is partially inspired by Deep Daze, a caption-guided raster graphics generator.
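The optimization loop described above can be illustrated with a toy sketch. This is not the code in text_to_painting.py: the soft disc renderer stands in for diffvg, cosine similarity against a fixed target image stands in for the CLIP score, and finite differences stand in for backpropagation. All function names here (render, similarity, score, gradient, ascend) are hypothetical and chosen only for illustration.

```python
import math

SIZE = 8  # toy canvas is SIZE x SIZE pixels


def render(params):
    """Stand-in for diffvg: rasterize one soft-edged disc from (cx, cy, r)."""
    cx, cy, r = params
    image = []
    for y in range(SIZE):
        for x in range(SIZE):
            d = math.hypot(x - cx, y - cy)
            # A sigmoid edge keeps every pixel smooth in the parameters.
            image.append(1.0 / (1.0 + math.exp(d - r)))
    return image


def similarity(a, b):
    """Cosine similarity, standing in for a CLIP image-caption score."""
    dot = sum(p * q for p, q in zip(a, b))
    norm_a = math.sqrt(sum(p * p for p in a))
    norm_b = math.sqrt(sum(q * q for q in b))
    return dot / (norm_a * norm_b)


def score(params, target):
    """Objective to maximize: similarity of the rendering to the target."""
    return similarity(render(params), target)


def gradient(params, target, eps=1e-4):
    """Central finite differences (assumption: replaces backpropagation)."""
    grad = []
    for i in range(len(params)):
        hi = list(params)
        lo = list(params)
        hi[i] += eps
        lo[i] -= eps
        grad.append((score(hi, target) - score(lo, target)) / (2 * eps))
    return grad


def ascend(params, target, steps=200, lr=0.5):
    """Plain gradient ascent: step the shape parameters uphill on the score."""
    params = list(params)
    for _ in range(steps):
        grad = gradient(params, target)
        params = [p + lr * g for p, g in zip(params, grad)]
    return params


# Pretend this fixed image plays the role of the encoded caption.
target = render([5.0, 3.0, 2.0])
start = [3.0, 4.0, 1.5]
final = ascend(start, target)

print("score before:", round(score(start, target), 4))
print("score after: ", round(score(final, target), 4))
```

In the real pipeline the parameters would be the control points, stroke widths, and colors of many paths, and the gradient would come from automatic differentiation through both models rather than from finite differences.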

Quick start

Requirements:

torch
torchvision
matplotlib
numpy
scikit-image
clip
diffvg

Install our dependencies and CLIP.

conda install --yes -c pytorch pytorch=1.7.1 torchvision cudatoolkit=11.0
pip install ftfy regex tqdm numpy matplotlib scikit-image
pip install git+https://github.com/openai/CLIP.git

Then install diffvg by following the installation instructions in its repository.
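After installation, a quick check along these lines can confirm that each requirement is importable. It is a minimal sketch using only the standard library; the import names are assumptions based on how these packages are commonly imported (scikit-image as skimage, diffvg as pydiffvg), so adjust them if your installation differs.

```python
from importlib.util import find_spec

# Assumed import names for the requirements listed above.
MODULES = [
    "torch",
    "torchvision",
    "matplotlib",
    "numpy",
    "skimage",   # scikit-image
    "clip",      # OpenAI CLIP
    "pydiffvg",  # diffvg Python bindings
]

# find_spec returns None for a top-level module that cannot be located.
missing = [name for name in MODULES if find_spec(name) is None]

if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All requirements are importable.")
```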

Owner

Ajay Jain
