Code for CVPR'2022 paper ✨ "Predict, Prevent, and Evaluate: Disentangled Text-Driven Image Manipulation Empowered by Pre-Trained Vision-Language Model"

Last update: Nov 28, 2022

PPE ✨

Repository for our CVPR'2022 paper:

Predict, Prevent, and Evaluate: Disentangled Text-Driven Image Manipulation Empowered by Pre-Trained Vision-Language Model. Zipeng Xu, Tianwei Lin, Hao Tang, Fu Li, Dongliang He, Nicu Sebe, Radu Timofte, Luc Van Gool, Errui Ding. To appear in CVPR 2022.

The PyTorch implementation is available at zipengxuc/PPE-Pytorch.

Updates

24 Mar 2022: We updated the arXiv version of our paper.

30 Mar 2022: We made some changes to how the code is released. The PyTorch implementation is now available at zipengxuc/PPE-Pytorch.

14 Apr 2022: We updated the PaddlePaddle inference code in this repository.

To reproduce our results:

Setup:

Install CLIP:

conda install --yes -c pytorch pytorch=1.7.1 torchvision cudatoolkit=<CUDA_VERSION>
pip install ftfy regex tqdm gdown
pip install git+https://github.com/openai/CLIP.git
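After running the commands above, a quick check that the packages can be found may save a failed run later. The snippet below is a minimal sketch that uses only the Python standard library. The module list is an assumption derived from the install commands (the CLIP package is assumed to import as `clip`), and `check_modules` is a hypothetical helper, not part of this repository. It does not cover PaddlePaddle or PaddleGAN.

```python
import importlib.util

# Import names assumed from the install commands above (not taken from the repo).
REQUIRED_MODULES = ["torch", "torchvision", "clip", "ftfy", "regex", "tqdm", "gdown"]


def check_modules(modules=REQUIRED_MODULES):
    """Return a dict mapping each module name to True if Python can locate it."""
    return {name: importlib.util.find_spec(name) is not None for name in modules}


if __name__ == "__main__":
    # Print one status line per module without importing any of them.
    for name, found in check_modules().items():
        print(f"{name:12s} {'ok' if found else 'MISSING'}")
```

The check only locates the modules and does not import them, so it will not catch a CUDA or version mismatch.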

Download pre-trained models:

The code relies on PaddleGAN, which provides a PaddlePaddle implementation of StyleGAN2. Download the pre-trained StyleGAN2 generator from here.

We provide several pre-trained PPE models here.

Invert real images:

The mapper is trained on latent vectors, so real images must first be inverted into the latent space. For editing human faces, StyleCLIP provides CelebA-HQ images that were inverted by e4e: test set.

Usage:

Please first put the downloaded pre-trained models and data in the ckpt folder.
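Before running inference, it can help to confirm that everything is actually in the ckpt folder. The following is a minimal sketch of such a check. The helper `missing_checkpoints` and all file names in it are hypothetical placeholders, since this README does not list the actual file names; replace them with the names of the files you downloaded.

```python
from pathlib import Path


def missing_checkpoints(ckpt_dir, expected_files):
    """Return the names in expected_files that are not regular files in ckpt_dir."""
    ckpt_dir = Path(ckpt_dir)
    return [name for name in expected_files if not (ckpt_dir / name).is_file()]


if __name__ == "__main__":
    # Hypothetical file names: substitute the files you actually downloaded.
    expected = ["stylegan2_generator.pdparams", "ppe_mapper.pdparams", "test_faces.pt"]
    missing = missing_checkpoints("ckpt", expected)
    if missing:
        print("Missing from ckpt/:", ", ".join(missing))
    else:
        print("All expected files found in ckpt/.")
```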

Inference

In the PaddlePaddle version, we only provide inference code for generating editing results:

python mapper/evaluate.py

Reference

@article{xu2022ppe,
author = {Zipeng Xu and Tianwei Lin and Hao Tang and Fu Li and Dongliang He and Nicu Sebe and Radu Timofte and Luc Van Gool and Errui Ding},
title = {Predict, Prevent, and Evaluate: Disentangled Text-Driven Image Manipulation Empowered by Pre-Trained Vision-Language Model},
journal = {arXiv preprint arXiv:2111.13333},
year = {2021}
}

If you have any questions, please contact [email protected]. :)

Owner

Zipeng Xu
